• Post category: Pandas
• Post last modified: July 8, 2024

In pandas, the head() method is used to return the first n rows of a DataFrame. By default, it returns the first 5 rows, but you can specify a different number of rows by passing an integer as an argument.

In this article, I will explain the Pandas DataFrame head() method, covering its syntax, parameters, and usage, and show how it returns a DataFrame or Series containing the first n rows of the original object.

Key Points –

  • The head() method is used to quickly preview the first n rows of a DataFrame or Series.
  • By default, head() returns the first 5 rows if no argument is passed.
  • The method accepts a single optional parameter, n, an integer that sets the number of rows to return from the top.
  • If n is larger than the number of rows, head() returns all rows; a negative n returns all rows except the last |n|.
  • It returns an object of the same type as the caller, preserving the original column structure and data types.
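
As a quick illustration of the last point, the sketch below checks that a preview keeps the same columns and data types as the original. The small DataFrame and the Rating column are made up for this example and are not part of the article's data.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, used only for this sketch)
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [22000, 25000, 23000],
    'Rating': [4.5, 4.0, 3.5],
})

preview = df.head(2)

# The preview has fewer rows but the same columns and dtypes
same_dtypes = (preview.dtypes == df.dtypes).all()
same_columns = list(preview.columns) == list(df.columns)
print(same_dtypes, same_columns)
print(len(preview), "of", len(df), "rows")
```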

Pandas DataFrame head() Introduction

Let’s look at the syntax of the head() method.


# Syntax of Pandas DataFrame head()
DataFrame.head(n=5)

Parameters of the DataFrame head()

It accepts a single parameter.

  • n (int, optional) – The number of rows to return. The default value is 5. A negative n returns all rows except the last |n| rows.
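
The n parameter also accepts negative integers, in which case head() returns every row except the last |n|. A minimal sketch, using a one-column DataFrame invented for this illustration:

```python
import pandas as pd

# Seven rows, so head(-2) should keep the first five
df = pd.DataFrame({'Fee': [22000, 25000, 23000, 24000, 26000, 30000, 35000]})

# Negative n: return all rows except the last 2
all_but_last_two = df.head(-2)
print(all_but_last_two)
```

This can be handy when you know how many trailing rows you want to skip rather than how many leading rows you want to keep.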

Return Value

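It returns the first n rows of the caller as an object of the same type: a DataFrame when called on a DataFrame, and a Series when called on a Series.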
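
To see the return type in practice, the sketch below calls head() on both a DataFrame and one of its columns (a Series). The variable names frame_preview and series_preview are chosen for this example only.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python"],
    'Fee': [22000, 25000, 23000, 24000],
})

# head() on a DataFrame gives back a DataFrame
frame_preview = df.head(2)

# head() on a single column (a Series) gives back a Series
series_preview = df['Fee'].head(2)

print(type(frame_preview).__name__)
print(type(series_preview).__name__)
```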

Usage of Pandas DataFrame head() Method

The Pandas DataFrame head() method is used to quickly inspect the beginning of a DataFrame or Series.

To run some examples of the head() method, let’s create a Pandas DataFrame from a dictionary.


import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "PySpark", "Java"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 30000, 35000],
    'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200],
    'Duration': ['35days', '35days', '40days', '30days', '25days', '50days', '45days']
}

df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

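# Output:
# Create DataFrame:
#     Courses    Fee  Discount Duration
# 0    Spark  22000      1000   35days
# 1  PySpark  25000      2300   35days
# 2   Hadoop  23000      1000   40days
# 3   Python  24000      1200   30days
# 4   Pandas  26000      2500   25days
# 5  PySpark  30000      2000   50days
# 6     Java  35000      2200   45days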

The default usage of the head() method in Pandas is to return the first 5 rows of a DataFrame when no argument is specified.


# Using the default usage of the head() method
print("First 5 rows of the DataFrame (default usage):\n", df.head())

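# Output:
# First 5 rows of the DataFrame (default usage):
#     Courses    Fee  Discount Duration
# 0    Spark  22000      1000   35days
# 1  PySpark  25000      2300   35days
# 2   Hadoop  23000      1000   40days
# 3   Python  24000      1200   30days
# 4   Pandas  26000      2500   25days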

Specifying Number of Rows

Alternatively, you can specify the number of rows to return with the head() method by passing an integer argument. This allows you to preview a custom number of rows from the beginning of a DataFrame.


# Specifying the number of rows to return
print("First 3 rows of the DataFrame:\n", df.head(3))

# Output:
# First 3 rows of the DataFrame:
#     Courses    Fee  Discount Duration
# 0    Spark  22000      1000   35days
# 1  PySpark  25000      2300   35days
# 2   Hadoop  23000      1000   40days

In the above example, passing 3 to head() returns only the first three rows of the DataFrame.
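
Note that head() selects rows by position, not by index label. The sketch below, which uses a made-up DataFrame with string labels, suggests that head(3) gives the same result as the positional slice iloc[:3]:

```python
import pandas as pd

# Index labels are deliberately out of alphabetical order
df = pd.DataFrame(
    {'Fee': [22000, 25000, 23000, 24000]},
    index=['d', 'a', 'c', 'b'],
)

first_three = df.head(3)
by_position = df.iloc[:3]

# Both pick the first three rows as stored, regardless of label order
print(first_three)
print(first_three.equals(by_position))
```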

Displaying More Than 5 Rows

To display more than the default 5 rows, pass a larger integer to the head() method. This lets you preview a larger portion of your DataFrame.


# Displaying the first 6 rows of the DataFrame
print("First 6 rows of the DataFrame:\n", df.head(6))

# Output:
# First 6 rows of the DataFrame:
#     Courses    Fee  Discount Duration
# 0    Spark  22000      1000   35days
# 1  PySpark  25000      2300   35days
# 2   Hadoop  23000      1000   40days
# 3   Python  24000      1200   30days
# 4   Pandas  26000      2500   25days
# 5  PySpark  30000      2000   50days

Here, head(6) returns six of the seven rows; only the last row (Java) is left out.
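
Two edge cases are worth knowing when choosing n. The following sketch, built on a shortened copy of the example data, shows what happens when n exceeds the number of rows and when n is 0:

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "PySpark", "Java"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 30000, 35000],
})

# n larger than the number of rows: all rows come back, no error is raised
everything = df.head(100)

# n = 0: an empty DataFrame that still carries the column names
nothing = df.head(0)

print(len(everything))
print(len(nothing), list(nothing.columns))
```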

Using head() on a DataFrame with DateTime Index

Similarly, using the head() method on a DataFrame with a DateTime index works the same way as with any other DataFrame. This method will return the first n rows of the DataFrame, which is especially useful for time series data where you often need to inspect the initial entries.


import pandas as pd

# Creating a date range
dates = pd.date_range('2024-06-01', periods=7)

# Creating the DataFrame with DateTime index
data = {
    'Temperature': [22, 21, 19, 23, 20, 18, 21],
    'Humidity': [55, 60, 58, 57, 59, 61, 62]
}

df = pd.DataFrame(data, index=dates)
print("Create DataFrame with DateTime index:\n", df)

# Using head() method to display the first 4 rows
print("First 4 rows of the DataFrame:\n", df.head(4))

# Output:
# Create DataFrame with DateTime index:
#              Temperature  Humidity
# 2024-06-01           22        55
# 2024-06-02           21        60
# 2024-06-03           19        58
# 2024-06-04           23        57
# 2024-06-05           20        59
# 2024-06-06           18        61
# 2024-06-07           21        62

# First 4 rows of the DataFrame:
#              Temperature  Humidity
# 2024-06-01           22        55
# 2024-06-02           21        60
# 2024-06-03           19        58
# 2024-06-04           23        57

Using head() on a DataFrame with a DateTime index returns the first n rows while maintaining the DateTime index.
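
Because head() works on row position, it returns the first rows as stored, which are not necessarily the earliest dates. If the index might be unsorted, sorting it first is one way to get the earliest entries. A minimal sketch with made-up, deliberately shuffled dates:

```python
import pandas as pd

# Dates are intentionally out of chronological order
dates = pd.to_datetime(['2024-06-03', '2024-06-01', '2024-06-02'])
ts = pd.DataFrame({'Temperature': [19, 22, 21]}, index=dates)

# First two rows in stored order
first_stored = ts.head(2)

# Earliest two dates: sort the index, then take the head
earliest = ts.sort_index().head(2)

print(first_stored)
print(earliest)
```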

Using head() on a DataFrame with a MultiIndex

Finally, using the head() method on a DataFrame with a MultiIndex is similar to using it on a regular DataFrame. The head() method will return the first n rows of the DataFrame, including all levels of the MultiIndex.


import pandas as pd

# Creating a MultiIndex
technologies = [
    ["Spark", "Spark", "PySpark", "PySpark", "Python", "Pandas", "Pandas"],
    [22000, 25000, 23000, 24000, 26000, 30000, 35000],
]

index = pd.MultiIndex.from_arrays(technologies, names=('Courses', 'Fee'))

# Creating the DataFrame with MultiIndex
data = {
    'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200],
}

df = pd.DataFrame(data, index=index)
print("Create DataFrame with MultiIndex:\n", df)

# Using head() method to display the first 4 rows
print("First 4 rows of the DataFrame:\n", df.head(4))

# Output:
# Create DataFrame with MultiIndex:
#                 Discount
# Courses Fee            
# Spark   22000      1000
#         25000      2300
# PySpark 23000      1000
#         24000      1200
# Python  26000      2500
# Pandas  30000      2000
#         35000      2200

# First 4 rows of the DataFrame:
#                 Discount
# Courses Fee            
# Spark   22000      1000
#         25000      2300
# PySpark 23000      1000
#         24000      1200
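
As the output above shows, head() counts rows across the whole DataFrame and ignores the grouping implied by the outer index level. If you want the first row of each course instead, one option is groupby() followed by head(); the sketch below rebuilds the same data so it can run on its own.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [
        ["Spark", "Spark", "PySpark", "PySpark", "Python", "Pandas", "Pandas"],
        [22000, 25000, 23000, 24000, 26000, 30000, 35000],
    ],
    names=('Courses', 'Fee'),
)
df = pd.DataFrame({'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200]}, index=index)

# Plain head(): first 4 rows overall, both index levels kept
first_four = df.head(4)
print(first_four.index.nlevels, list(first_four.index.names))

# Grouped head(): first row within each 'Courses' group
first_per_course = df.groupby(level='Courses').head(1)
print(first_per_course)
```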

FAQ on Pandas DataFrame head() Function

What is the head() function in Pandas?

The head() function in Pandas is used to return the first n rows of a DataFrame or Series. By default, it returns the first 5 rows if no argument is specified.

What is the default number of rows returned by head()?

The default is 5. Calling head() on a DataFrame or Series without an argument returns its first 5 rows.

What does the head() function return?

The head() function returns a DataFrame or Series that contains the first n rows of the original object.

Can I use head() on a DataFrame with a DateTime index?

Yes. head() works on a DataFrame with a DateTime index and returns the first n rows, preserving the DateTime index.

Can head() be used on a DataFrame with a MultiIndex?

Yes. head() works on a DataFrame with a MultiIndex and returns the first n rows, including all levels of the MultiIndex.

Conclusion

In this article, you have learned about the Pandas DataFrame head() method, including its syntax, parameters, and usage, and how to return the first n rows of a DataFrame, with examples.

Happy Learning!!
