Pandas head() function is used to get the top N rows from a DataFrame or the top N elements from a Series. When a negative number is passed, it returns all rows except the last N. This function is mainly used for quick inspection, to check that the object contains the right type of data.
When you want to extract only the top N rows after all your filtering and transformations, you can use the head() method, which is defined on pandas DataFrame and Series objects.
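As an illustration of calling head() at the end of a filter-and-sort chain, here is a minimal sketch. The `courses` DataFrame and the `top_two` name are made up for this example, and the filter threshold is arbitrary.

```python
import pandas as pd

# Hypothetical sample data used only for this sketch
courses = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java', 'C++', 'Hadoop'],
    'Fee': [22000, 25000, 23000, 22000, 30000],
})

# Filter rows, sort by fee (highest first), then keep the first 2 rows
top_two = (
    courses[courses['Fee'] > 22000]
    .sort_values('Fee', ascending=False)
    .head(2)
)
print(top_two)
```

Because head() runs last, it limits the already filtered and sorted result rather than the original data.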
Key Points –
- The head() method quickly returns the first N rows of a DataFrame or Series.
- head() exists on both Series and DataFrame.
- It returns the same type of object as the caller.
- When no parameter is passed, head() returns the top 5 rows by default.
- Passing a negative value to n returns all rows except the last n rows.
- If called on an empty DataFrame, it returns an empty DataFrame with the same columns.
- The head() method does not alter the original DataFrame or Series; it returns a new object holding the selected rows.
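Two of the points above, the empty-DataFrame case and the fact that the caller is left unchanged, can be checked with a short sketch. The names `empty_df` and `small_df` are hypothetical and exist only for this illustration.

```python
import pandas as pd

# An empty DataFrame that still carries column labels
empty_df = pd.DataFrame(columns=['Courses', 'Fee'])
empty_head = empty_df.head()
print(list(empty_head.columns))  # the column labels are preserved
print(len(empty_head))           # no rows

# head() leaves the caller untouched and returns the same type
small_df = pd.DataFrame({'Fee': [22000, 25000, 23000]})
first_two = small_df.head(2)
print(len(small_df), len(first_two))
print(type(first_two).__name__)
```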
head() Syntax
Following is the syntax of the head() method of the DataFrame and Series.
# Syntax of head() method
DataFrame.head(n=5)
Series.head(n=5)
Let's create a DataFrame from a dictionary.
# Create a pandas DataFrame
import pandas as pd
df = pd.DataFrame({
'Courses' :['Spark','Python','Java','C++','Hadoop','R','C#','AWS'],
'Fee' :[22000,25000,23000,22000,30000,22000,32000,40000],
'Duration':['30days','50days','30days','35days','40days','45days','50days','60days']
})
print(df)
# Output:
# Courses Fee Duration
# 0 Spark 22000 30days
# 1 Python 25000 50days
# 2 Java 23000 30days
# 3 C++ 22000 35days
# 4 Hadoop 30000 40days
# 5 R 22000 45days
# 6 C# 32000 50days
# 7 AWS 40000 60days
DataFrame.head() Example
Pandas DataFrame.head() method is used to get the top N rows of the DataFrame. When a positive number is used, it returns the top N rows.
For negative numbers, it returns all rows except the last N.
This function is mostly used for quick checks that the DataFrame contains the right data.
By default, when no value is passed for N, it returns the top 5 rows.
# Default return 5 rows
print(df.head())
# Output:
# Courses Fee Duration
# 0 Spark 22000 30days
# 1 Python 25000 50days
# 2 Java 23000 30days
# 3 C++ 22000 35days
# 4 Hadoop 30000 40days
To get the top 3 rows, pass 3 as the N parameter.
# Top N rows
print(df.head(3))
# Output:
# Courses Fee Duration
# 0 Spark 22000 30days
# 1 Python 25000 50days
# 2 Java 23000 30days
Use a negative number for N to get all rows except the last N. You can achieve the same result by slicing with the same negative value, for example df[:-3]. The example below ignores the last 3 records and returns the rest.
# Except last N rows
print(df.head(-3))
# Output:
# Courses Fee Duration
# 0 Spark 22000 30days
# 1 Python 25000 50days
# 2 Java 23000 30days
# 3 C++ 22000 35days
# 4 Hadoop 30000 40days
# 5 R 22000 45days
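The equivalence between a negative N and slicing can be checked directly. This sketch rebuilds part of the DataFrame above so that it runs on its own; the names `by_head`, `by_slice`, and `by_iloc` are introduced only for the comparison.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java', 'C++', 'Hadoop', 'R', 'C#', 'AWS'],
    'Fee': [22000, 25000, 23000, 22000, 30000, 22000, 32000, 40000],
})

n = 3
by_head = df.head(-n)    # all rows except the last n
by_slice = df[:-n]       # positional row slice
by_iloc = df.iloc[:-n]   # explicit positional indexing

# DataFrame.equals() compares values, index, and columns
print(by_head.equals(by_slice))
print(by_head.equals(by_iloc))
```

Of the two slicing forms, df.iloc[:-n] states the positional intent most explicitly, which can help readability when the index is not the default integer range.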
Pandas Series.head() Example
Pandas Series.head() function is used to get the top N elements. When a negative integer is used, it returns all elements except the last N. Since each column of a DataFrame is a Series, I will use one column from the above DataFrame to explain.
By default, when no value is passed for N, it returns the top 5 elements.
# head() example
print(df['Fee'].head())
# Output:
# 0 22000
# 1 25000
# 2 23000
# 3 22000
# 4 30000
# Name: Fee, dtype: int64
To get the top N elements from a Series, pass N to head().
# Top N elements from Series
print(df['Fee'].head(3))
# Output:
# 0 22000
# 1 25000
# 2 23000
# Name: Fee, dtype: int64
To get all elements except the last N, use a negative N value.
# Except last n elements
print(df['Fee'].head(-3))
# Output:
# 0 22000
# 1 25000
# 2 23000
# 3 22000
# 4 30000
# Name: Fee, dtype: int64
FAQ on Pandas head()
The head() function in Pandas is used to return the first N rows of a DataFrame or Series. By default, it returns the top 5 rows.
You can pass the number of rows you want to view as an argument to the head() function.
If no argument is provided to head(), it defaults to returning the first 5 rows of a DataFrame or Series.
head() can be used on both DataFrames and Series. For a Series, it returns the top N elements.
If the number of rows you request with head() exceeds the number of rows in the DataFrame, it returns the entire DataFrame without raising an error.
head() does not modify the DataFrame in place. It returns a new object containing the top rows. To keep the result, assign it to a new variable.
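The case where N is larger than the number of rows can be sketched as follows. The three-row `fees_df` DataFrame is a hypothetical example.

```python
import pandas as pd

# Hypothetical DataFrame with only 3 rows
fees_df = pd.DataFrame({'Fee': [22000, 25000, 23000]})

# Ask for more rows than exist; no error is raised
result = fees_df.head(10)
print(len(result))
print(result.equals(fees_df))
```

Here `result` holds all three rows, so callers do not need to check the length of the DataFrame before calling head().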
Conclusion
In this article, you have learned the syntax and usage of the head() function with examples. You have also learned that head() returns the top N elements and is available on both DataFrame and Series. It returns the same type of object as the caller.