Pandas head() – Returns Top N Rows

Post last modified: November 18, 2024

The pandas head() function returns the top N rows of a DataFrame or the top N elements of a Series. When n is negative, it returns all rows except the last |n|. It is mainly used for quick checks, for example to confirm that an object contains the right kind of data.

When you want to extract only the top N rows after all your filtering and transformations, you can use the head() method, which pandas defines on both DataFrame and Series.

Key Points –

  • The head() method is used to quickly return the first N rows of a DataFrame or Series.
  • The head() method exists on both Series and DataFrame.
  • It returns an object of the same type as the caller.
  • When called without an argument, head() returns the top 5 rows by default.
  • Passing a negative value as n returns all rows except the last |n| rows.
  • If called on an empty DataFrame, it will return an empty DataFrame with the same columns.
  • The head() method does not alter the original DataFrame or Series; it returns a new object containing the selected rows.
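
The key points above can be checked with a short sketch. The DataFrame `sample` and its column names are hypothetical and exist only for this illustration; they are not the data used in the rest of the article.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sample = pd.DataFrame({"id": list(range(8)), "score": [3, 1, 4, 1, 5, 9, 2, 6]})

top_default = sample.head()            # no argument: first 5 rows
all_but_last_two = sample.head(-2)     # negative n: drops the last 2 rows
score_top = sample["score"].head(3)    # a Series in gives a Series back

# An empty DataFrame keeps its column labels after head()
empty_top = pd.DataFrame(columns=["id", "score"]).head()

print(type(top_default).__name__, len(top_default))
print(type(score_top).__name__, len(score_top))
print(len(all_but_last_two))
print(list(empty_top.columns), len(empty_top))
```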

head() Syntax

Following is the syntax of the head() method of DataFrame and Series. The parameter n is an integer that defaults to 5.


# Syntax of head() method
DataFrame.head(n=5)
Series.head(n=5)

Let’s create a DataFrame from a dictionary.


# Create a pandas DataFrame
import pandas as pd

df = pd.DataFrame({
    'Courses' :['Spark','Python','Java','C++','Hadoop','R','C#','AWS'],
    'Fee' :[22000,25000,23000,22000,30000,22000,32000,40000],
    'Duration':['30days','50days','30days','35days','40days','45days','50days','60days']
          })
print(df)

# Output:
#  Courses    Fee Duration
# 0   Spark  22000   30days
# 1  Python  25000   50days
# 2    Java  23000   30days
# 3     C++  22000   35days
# 4  Hadoop  30000   40days
# 5       R  22000   45days
# 6      C#  32000   50days
# 7     AWS  40000   60days

DataFrame.head() Example

The pandas DataFrame.head() method returns the first N rows of the DataFrame. With a positive n, it returns the top n rows; with a negative n, it returns all rows except the last |n|.

By default, when n is not passed, it returns the top 5 rows.


# Default return 5 rows
print(df.head())

# Output:
#  Courses    Fee Duration
# 0   Spark  22000   30days
# 1  Python  25000   50days
# 2    Java  23000   30days
# 3     C++  22000   35days
# 4  Hadoop  30000   40days

To get the top 3 rows, pass 3 as the value of n.


# Top N rows
print(df.head(3))

# Output:
#  Courses    Fee Duration
# 0   Spark  22000   30days
# 1  Python  25000   50days
# 2    Java  23000   30days

Use a negative n to get all rows except the last |n|. You can achieve the same result with slicing, for example df[:-3]. The example below ignores the last 3 records and returns the rest.


# Except last N rows
print(df.head(-3))

# Output:
#  Courses    Fee Duration
# 0   Spark  22000   30days
# 1  Python  25000   50days
# 2    Java  23000   30days
# 3     C++  22000   35days
# 4  Hadoop  30000   40days
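
As a quick check of the slicing equivalence mentioned above, the sketch below compares head(-3) with df[:-3] and with the explicitly positional df.iloc[:-3]. It rebuilds a two-column version of the article's DataFrame so it can run on its own; the variable names `by_head`, `by_slice`, and `by_iloc` are made up for this example.

```python
import pandas as pd

# Two columns of the article's DataFrame, rebuilt so the example is self-contained
df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java', 'C++', 'Hadoop', 'R', 'C#', 'AWS'],
    'Fee': [22000, 25000, 23000, 22000, 30000, 22000, 32000, 40000],
})

by_head = df.head(-3)    # all rows except the last 3
by_slice = df[:-3]       # plain slice on the DataFrame
by_iloc = df.iloc[:-3]   # explicit positional slice

print(by_head.equals(by_slice))
print(by_head.equals(by_iloc))
print(len(by_head))
```

With 8 rows in the DataFrame, all three forms keep the first 5 rows (index 0 through 4).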

Pandas Series.head() Example

The pandas Series.head() function returns the top N elements of a Series. With a negative integer, it returns all elements except the last |N|. Since each column of a DataFrame is a Series, I will use one column from the DataFrame above to explain.

By default, when n is not passed, it returns the top 5 elements.


# head() example
print(df['Fee'].head())

# Output:
# 0    22000
# 1    25000
# 2    23000
# 3    22000
# 4    30000
# Name: Fee, dtype: int64

To get the top N elements from a Series, pass N to head().


# Top N elements from Series
print(df['Fee'].head(3))

# Output:
# 0    22000
# 1    25000
# 2    23000
# Name: Fee, dtype: int64

To get all elements except the last N, pass a negative value.


# Except last n rows
print(df['Fee'].head(-3))

# Output:
# 0    22000
# 1    25000
# 2    23000
# 3    22000
# 4    30000
# Name: Fee, dtype: int64

FAQ on Pandas head()

What does head() do in Pandas?

The head() function in Pandas is used to return the first N rows of a DataFrame or Series. By default, it returns the top 5 rows.

How do I use head() to view the first 10 rows of a DataFrame?

Pass the number of rows you want to view as an argument to the head() function, for example df.head(10).
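
A minimal sketch of this, using a hypothetical 25-row DataFrame named `numbers` that is not part of the article's data:

```python
import pandas as pd

# Hypothetical 25-row DataFrame, for illustration only
numbers = pd.DataFrame({"n": list(range(25))})

first_ten = numbers.head(10)   # rows at positions 0 through 9
print(first_ten.shape)
```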

What is the default behavior of head() if no argument is provided?

If no argument is provided to head(), it defaults to returning the first 5 rows of a DataFrame or Series.

Can I use head() on a Pandas Series?

Yes. head() works on both DataFrames and Series. For a Series, it returns the top N elements.

What happens if the number of rows specified is greater than the DataFrame’s length?

If the number of rows you request with head() exceeds the number of rows in the DataFrame, it will return the entire DataFrame without any errors.
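
For example, asking for 10 rows from a 3-row DataFrame simply returns all 3 rows. The DataFrame `small` below is a hypothetical cut-down version of the article's data.

```python
import pandas as pd

# Hypothetical 3-row DataFrame, for illustration only
small = pd.DataFrame({
    "Courses": ["Spark", "Python", "Java"],
    "Fee": [22000, 25000, 23000],
})

result = small.head(10)        # n is larger than the number of rows
print(len(result))
print(result.equals(small))
```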

Can I use head() to modify a DataFrame?

head() does not modify the DataFrame in place. It returns a new object containing the top rows, so assign the result to a variable if you want to keep it.
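
The sketch below illustrates this under one assumption: that you want to change the extracted rows afterwards. In that case, calling .copy() on the result makes the independence from the original explicit. The names `courses` and `top2` are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only
courses = pd.DataFrame({
    "Courses": ["Spark", "Python", "Java", "C++"],
    "Fee": [22000, 25000, 23000, 22000],
})

# Keep the top 2 rows in a new variable; .copy() gives an independent object
top2 = courses.head(2).copy()
top2["Fee"] = top2["Fee"] * 2  # change only the extracted rows

print(len(courses))            # the original still has all of its rows
print(courses["Fee"].tolist())
print(top2["Fee"].tolist())
```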

Conclusion

In this article, you have learned the syntax and usage of the head() function with examples. You have also learned that head() returns the top N elements, that it is available on both DataFrame and Series, and that it returns an object of the same type as the caller.
