In pandas, the head() method is used to return the first n rows of a DataFrame. By default, it returns the first 5 rows, but you can specify a different number of rows by passing an integer as an argument.
In this article, I will explain the Pandas DataFrame head() method, covering its syntax, parameters, and usage, and how it returns a DataFrame or Series containing the first n rows of the original object.
Key Points –
- The head() method is used to quickly preview the first n rows of a DataFrame or Series.
- By default, head() returns the first 5 rows of the DataFrame if no argument is passed.
- The method accepts a single optional integer parameter, n, which specifies the number of rows to return from the top of the DataFrame.
- It returns a new DataFrame containing the first n rows, preserving the original DataFrame’s structure and data types.
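The last key point can be checked with a minimal sketch. The sample data below is hypothetical and used only for illustration; it compares the column dtypes of the preview with those of the original DataFrame.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Fee": [22000, 25000, 23000, 24000],
    "Discount": [0.1, 0.2, 0.1, 0.15],
})

preview = df.head(2)

# The preview keeps the same columns and dtypes as the original
print(preview.dtypes.equals(df.dtypes))
print(list(preview.columns) == list(df.columns))

# The original DataFrame still has all of its rows
print(len(df), len(preview))
```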
Pandas DataFrame head() Introduction
Let’s look at the syntax of the head() method.
# Syntax of Pandas DataFrame head()
DataFrame.head(n=5)
Parameters of the DataFrame head()
It accepts only one parameter.
n (int, optional) – An integer specifying the number of rows to return. The default value is 5.
Return Value
It returns the first n rows of the DataFrame.
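The n parameter also accepts zero and negative integers, which the description above does not cover. As a small sketch on a hypothetical single-column DataFrame: a negative n returns every row except the last |n| rows, and n=0 returns an empty DataFrame with the same columns.

```python
import pandas as pd

# Hypothetical single-column DataFrame with 7 rows
df = pd.DataFrame({"value": [10, 20, 30, 40, 50, 60, 70]})

first_two = df.head(2)          # rows at positions 0 and 1
all_but_last_two = df.head(-2)  # rows at positions 0 to 4
empty = df.head(0)              # no rows, columns preserved

print(len(first_two), len(all_but_last_two), len(empty))  # 2 5 0
print(list(empty.columns))                                # ['value']
print(type(first_two).__name__)                           # DataFrame
```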
Usage of Pandas DataFrame head() Method
The Pandas DataFrame head() method is used to quickly inspect the beginning of a DataFrame or Series.
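For the Series case, a minimal sketch might look like the following. The `fees` Series is a hypothetical example; calling head() on it returns a Series rather than a DataFrame.

```python
import pandas as pd

# Hypothetical Series of course fees
fees = pd.Series([22000, 25000, 23000, 24000, 26000, 30000, 35000], name="Fee")

top_default = fees.head()  # first 5 values
top_two = fees.head(2)     # first 2 values

print(type(top_default).__name__)  # Series
print(len(top_default))            # 5
print(top_two.tolist())            # [22000, 25000]
```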
To run some examples of the Pandas DataFrame head() function, let’s create a Pandas DataFrame using data from a dictionary.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "PySpark", "Java"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 30000, 35000],
    'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200],
    'Duration': ['35days', '35days', '40days', '30days', '25days', '50days', '45days']
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
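# Output:
# Create DataFrame:
#    Courses    Fee  Discount  Duration
# 0    Spark  22000      1000    35days
# 1  PySpark  25000      2300    35days
# 2   Hadoop  23000      1000    40days
# 3   Python  24000      1200    30days
# 4   Pandas  26000      2500    25days
# 5  PySpark  30000      2000    50days
# 6     Java  35000      2200    45days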
The default usage of the head() method in Pandas is to return the first 5 rows of a DataFrame when no argument is specified.
# Using the default usage of the head() method
print("First 5 rows of the DataFrame (default usage):\n", df.head())
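# Output:
# First 5 rows of the DataFrame (default usage):
#    Courses    Fee  Discount  Duration
# 0    Spark  22000      1000    35days
# 1  PySpark  25000      2300    35days
# 2   Hadoop  23000      1000    40days
# 3   Python  24000      1200    30days
# 4   Pandas  26000      2500    25days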
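One way to think about the default call is as positional slicing. The sketch below, on a shortened hypothetical copy of the data, checks that head() selects the same rows as iloc[:5]; it illustrates the behavior and is not a description of how pandas is implemented internally.

```python
import pandas as pd

# Shortened, hypothetical copy of the sample data
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "PySpark", "Java"],
    "Fee": [22000, 25000, 23000, 24000, 26000, 30000, 35000],
})

default_preview = df.head()

# head() selects by position, like slicing the first rows with iloc
print(default_preview.equals(df.iloc[:5]))  # True
print(df.head(3).equals(df.iloc[:3]))       # True
```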
Specifying Number of Rows
Alternatively, you can specify the number of rows to return with the head() method by passing an integer argument. This allows you to preview a custom number of rows from the beginning of a DataFrame.
# Specifying the number of rows to return
print("First 3 rows of the DataFrame:\n", df.head(3))
# Output:
# First 3 rows of the DataFrame:
#    Courses    Fee  Discount  Duration
# 0    Spark  22000      1000    35days
# 1  PySpark  25000      2300    35days
# 2   Hadoop  23000      1000    40days
In the above example, passing 3 to head() returns only the first three rows of the DataFrame.
Displaying More Than 5 Rows
To display more than the default 5 rows using the head() method in Pandas, you can specify the desired number of rows by passing an integer argument to the method. This allows you to preview a larger portion of your DataFrame.
# Displaying the first 6 rows of the DataFrame
print("First 6 rows of the DataFrame:\n", df.head(6))
# Output:
# First 6 rows of the DataFrame:
#    Courses    Fee  Discount  Duration
# 0    Spark  22000      1000    35days
# 1  PySpark  25000      2300    35days
# 2   Hadoop  23000      1000    40days
# 3   Python  24000      1200    30days
# 4   Pandas  26000      2500    25days
# 5  PySpark  30000      2000    50days
By specifying an integer argument greater than 5, you can view more rows from the beginning of your DataFrame.
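A related edge case is asking for more rows than the DataFrame contains. As a small sketch with a hypothetical three-row DataFrame, head() returns all available rows and does not raise an error.

```python
import pandas as pd

# Hypothetical DataFrame with only 3 rows
df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Hadoop"],
                   "Fee": [22000, 25000, 23000]})

# Requesting 10 rows from a 3-row DataFrame returns all 3 rows
result = df.head(10)

print(len(result))        # 3
print(result.equals(df))  # True
```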
Using head() on a DataFrame with DateTime Index
Similarly, using the head() method on a DataFrame with a DateTime index works the same way as with any other DataFrame. This method will return the first n rows of the DataFrame, which is especially useful for time series data where you often need to inspect the initial entries.
import pandas as pd
# Creating a date range
dates = pd.date_range('2024-06-01', periods=7)
# Creating the DataFrame with DateTime index
data = {
    'Temperature': [22, 21, 19, 23, 20, 18, 21],
    'Humidity': [55, 60, 58, 57, 59, 61, 62]
}
df = pd.DataFrame(data, index=dates)
print("Create DataFrame with DateTime index:\n", df)
# Using head() method to display the first 4 rows
print("First 4 rows of the DataFrame:\n", df.head(4))
# Output:
# Create DataFrame with DateTime index:
#             Temperature  Humidity
# 2024-06-01           22        55
# 2024-06-02           21        60
# 2024-06-03           19        58
# 2024-06-04           23        57
# 2024-06-05           20        59
# 2024-06-06           18        61
# 2024-06-07           21        62
# First 4 rows of the DataFrame:
#             Temperature  Humidity
# 2024-06-01           22        55
# 2024-06-02           21        60
# 2024-06-03           19        58
# 2024-06-04           23        57
Using head() on a DataFrame with a DateTime index returns the first n rows while maintaining the DateTime index.
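Note that head() selects rows by position, not by date. The sketch below uses a hypothetical, deliberately unsorted index to show the difference: if the earliest dates are what you want, sorting the index first with sort_index() is one option.

```python
import pandas as pd

# Hypothetical, deliberately unsorted dates
dates = pd.to_datetime(["2024-06-03", "2024-06-01", "2024-06-02"])
df = pd.DataFrame({"Temperature": [19, 22, 21]}, index=dates)

by_position = df.head(2)            # first two rows as stored
earliest = df.sort_index().head(2)  # two earliest dates

# The index type is preserved in the result
print(isinstance(by_position.index, pd.DatetimeIndex))  # True
print(by_position["Temperature"].tolist())              # [19, 22]
print(earliest["Temperature"].tolist())                 # [22, 21]
```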
Using head() on a DataFrame with a MultiIndex
Finally, using the head() method on a DataFrame with a MultiIndex is similar to using it on a regular DataFrame. The head() method will return the first n rows of the DataFrame, including all levels of the MultiIndex.
import pandas as pd
# Creating a MultiIndex
technologies = [
    ["Spark", "Spark", "PySpark", "PySpark", "Python", "Pandas", "Pandas"],
    [22000, 25000, 23000, 24000, 26000, 30000, 35000],
]
index = pd.MultiIndex.from_arrays(technologies, names=('Courses', 'Fee'))
# Creating the DataFrame with MultiIndex
data = {
    'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200],
}
df = pd.DataFrame(data, index=index)
print("Create DataFrame with MultiIndex:\n", df)
# Using head() method to display the first 4 rows
print("First 4 rows of the DataFrame:\n", df.head(4))
# Output:
# Create DataFrame with MultiIndex:
#                Discount
# Courses Fee
# Spark   22000      1000
#         25000      2300
# PySpark 23000      1000
#         24000      1200
# Python  26000      2500
# Pandas  30000      2000
#         35000      2200
# First 4 rows of the DataFrame:
#                Discount
# Courses Fee
# Spark   22000      1000
#         25000      2300
# PySpark 23000      1000
#         24000      1200
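With a MultiIndex, head(n) counts individual rows, not top-level groups. If the goal is the first row within each course, one possible approach is groupby(level=...).head(n), sketched below on a shortened hypothetical version of the data.

```python
import pandas as pd

# Shortened, hypothetical version of the MultiIndex data
index = pd.MultiIndex.from_arrays(
    [["Spark", "Spark", "PySpark", "PySpark", "Python"],
     [22000, 25000, 23000, 24000, 26000]],
    names=("Courses", "Fee"),
)
df = pd.DataFrame({"Discount": [1000, 2300, 1000, 1200, 2500]}, index=index)

first_rows = df.head(3)                                # first 3 rows overall
first_per_course = df.groupby(level="Courses").head(1) # first row of each course

print(first_rows.index.nlevels)               # 2
print(first_rows["Discount"].tolist())        # [1000, 2300, 1000]
print(first_per_course["Discount"].tolist())  # [1000, 1000, 2500]
```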
FAQ on Pandas DataFrame head() Function
The head() function in Pandas is used to return the first n rows of a DataFrame or Series. By default, it returns the first 5 rows if no argument is specified.
The default number of rows returned by the head() method in Pandas is 5. This means that when you call head() on a DataFrame or Series without specifying the number of rows, it will by default display the first 5 rows of the object.
The head() function returns a DataFrame or Series that contains the first n rows of the original object.
You can use the head() function on a DataFrame with a DateTime index. The function will return the first n rows, preserving the DateTime index.
The head() function can be used on a DataFrame with a MultiIndex. It will return the first n rows, including all levels of the MultiIndex.
Conclusion
In this article, you have learned about the Pandas DataFrame head() method: its syntax, its parameters, and how to use it to return the first n rows of a DataFrame, with examples.
Happy Learning!!