pandas head() – Returns Top N Rows

The pandas head() function returns the top N rows of a DataFrame or the top N elements of a Series. When given a negative number, it returns all rows except the last N. It is mainly used for quick inspection, to check that the object contains the right kind of data.

When you want to extract only the top N rows after all your filtering and transformations, you can use the head() method, which is defined in the pandas library.

pandas head() Key Points –

  • Returns the top N elements.
  • head() is available on both Series and DataFrame.
  • When no parameter is passed, head() returns the top 5 rows by default.
  • When a negative number is passed, it returns all rows except the last N.
  • It returns the same type of object as the caller.
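
The key points above can be checked with a short sketch. The sample data and the variable names (top_df, top_series) are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java', 'C++', 'Hadoop', 'R', 'C#'],
    'Fee': [22000, 25000, 23000, 22000, 30000, 22000, 32000]
})

top_df = df.head()              # no argument: first 5 rows
top_series = df['Fee'].head(2)  # first 2 elements of a Series

# head() gives back the same kind of object it was called on
print(type(top_df).__name__)      # DataFrame
print(type(top_series).__name__)  # Series
print(len(top_df), len(top_series))  # 5 2
```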

1. head() Syntax

Following is the syntax of the head() method of the DataFrame and Series.


# Syntax of head() method (n defaults to 5)
DataFrame.head(n=5)
Series.head(n=5)

Let’s create a DataFrame from a dict.


# Create a pandas DataFrame from a dict.
import pandas as pd

df = pd.DataFrame({
    'Courses' :['Spark','Python','Java','C++','Hadoop','R','C#','AWS'],
    'Fee' :[22000,25000,23000,22000,30000,22000,32000,40000],
    'Duration':['30days','50days','30days','35days','40days','45days','50days','60days']
          })
print(df)

# Outputs
#  Courses    Fee Duration
#0   Spark  22000   30days
#1  Python  25000   50days
#2    Java  23000   30days
#3     C++  22000   35days
#4  Hadoop  30000   40days
#5       R  22000   45days
#6      C#  32000   50days
#7     AWS  40000   60days

2. DataFrame.head() Example

The pandas DataFrame.head() method returns the top N rows of the DataFrame when N is positive.

For a negative N, it returns all rows except the last N.

This function is mostly used for testing purposes, to check that the DataFrame contains the right data.

By default, when no value is passed for N, it returns the top 5 rows.


# Default return 5 rows
print(df.head())

# Outputs
#  Courses    Fee Duration
#0   Spark  22000   30days
#1  Python  25000   50days
#2    Java  23000   30days
#3     C++  22000   35days
#4  Hadoop  30000   40days

To get the top 3 rows, pass 3 as the value of N.


# Top N rows
print(df.head(3))

# Outputs
#  Courses    Fee Duration
#0   Spark  22000   30days
#1  Python  25000   50days
#2    Java  23000   30days

Use a negative N to get all rows except the last N. You can achieve the same result with slicing, for example df[:-3]. The example below drops the last 3 records and returns the rest.


# Except last N rows
print(df.head(-3))

# Outputs
#  Courses    Fee Duration
#0   Spark  22000   30days
#1  Python  25000   50days
#2    Java  23000   30days
#3     C++  22000   35days
#4  Hadoop  30000   40days
#5       R  22000   45days

3. pandas Series.head() Example

The pandas Series.head() function returns the top N elements. When given a negative integer, it returns all elements except the last N. Since each column of a DataFrame is a Series, I will use one column from the DataFrame above to explain.

By default, when no value is passed for N, it returns the top 5 elements.


# head() example
print(df['Fee'].head())

# Outputs
#0    22000
#1    25000
#2    23000
#3    22000
#4    30000
#Name: Fee, dtype: int64

To get the top N elements from a Series, pass N to head().


# Top N elements from Series
print(df['Fee'].head(3))

# Outputs
#0    22000
#1    25000
#2    23000
#Name: Fee, dtype: int64

To get all elements except the last N, pass a negative N.


# Except last N elements
print(df['Fee'].head(-3))

# Outputs
#0    22000
#1    25000
#2    23000
#3    22000
#4    30000
#Name: Fee, dtype: int64
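
A few edge cases follow from the same rules and are worth a quick look. This sketch uses a standalone Series with the same fee values. The names fees, everything, nothing, and too_negative are illustrative.

```python
import pandas as pd

# Standalone Series with the same values as the 'Fee' column
fees = pd.Series([22000, 25000, 23000, 22000, 30000, 22000, 32000, 40000],
                 name='Fee')

everything = fees.head(100)     # N larger than the length: all elements
nothing = fees.head(0)          # N of zero: empty Series
too_negative = fees.head(-100)  # dropping more than exist: empty Series

print(len(everything))    # 8
print(len(nothing))       # 0
print(len(too_negative))  # 0
```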

Conclusion

In this article, you have learned the syntax and usage of the head() function with examples. You have also learned that head() returns the top N elements and is available on both DataFrame and Series. It returns the same type of object as the caller.

NNK