In Pandas, the tail() method is used to return the last n rows of a DataFrame. By default, it returns the last 5 rows, but you can specify a different number of rows if desired.
In this article, I will explain the Pandas DataFrame tail() method, covering its syntax, parameters, and usage, and show how it returns a new DataFrame containing the specified number of rows from the end of the original DataFrame.
Key Points –
- The tail() method returns the last n rows of a DataFrame.
- By default, tail() returns the last 5 rows if no argument is provided.
- The method accepts an integer parameter n that specifies the number of rows to return.
- You can specify a different number of rows by passing an integer to the method.
- If a negative integer is passed, tail() returns all rows except the first |n| rows. For example, n=-2 skips the first 2 rows.
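The negative-n behavior from the last key point can be illustrated with a minimal sketch. The seven-row DataFrame here is a hypothetical one made up for this illustration, and the comparison with iloc is only a way to check the result.

```python
import pandas as pd

# Hypothetical seven-row frame used only for this illustration
df = pd.DataFrame({"A": [10, 20, 30, 40, 50, 60, 70]})

# n = -2 drops the first 2 rows and keeps everything after them
result = df.tail(-2)
print(result)

# The same rows can be selected positionally with iloc
assert result.equals(df.iloc[2:])
```

Here the result keeps the original index labels 2 through 6, since tail() does not reset the index.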
Pandas DataFrame tail() Introduction
The syntax of the tail() method is shown below.
# Syntax of Pandas dataframe tail()
DataFrame.tail(n=5)
Parameters of the DataFrame tail()
It accepts a single parameter.
n – (int, default 5) The number of rows to return from the end of the DataFrame.
Return Value
It returns the last n rows of the DataFrame or Series.
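As a quick check of the return value, the sketch below calls tail() on both a DataFrame and a Series. The small frame and its column names A and B are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": list("abcdef")})

last_rows = df.tail(2)        # DataFrame in, DataFrame out
last_vals = df["A"].tail(2)   # Series in, Series out

print(type(last_rows).__name__)
print(type(last_vals).__name__)

# The returned rows keep their original index labels
print(last_rows.index.tolist())
```

In both cases the result has the same type as the object tail() was called on, and the original df is left unchanged.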
Usage of Pandas DataFrame tail() Method
The tail() method in pandas is used to return the last n rows of a DataFrame or Series. By default, it returns the last 5 rows if no parameter is specified.
To run some examples of the Pandas DataFrame tail() method, let’s create a Pandas DataFrame using data from a dictionary.
import pandas as pd

# Build a sample DataFrame from a dictionary of columns
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "PySpark", "Java"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 30000, 35000],
    'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200],
    'Duration': ['35days', '40days', '45days', '50days', '30days', '25days', '45days']
}
df = pd.DataFrame(technologies)
print("Original DataFrame:\n", df)
Running this code prints the complete DataFrame with all seven rows.
The default usage of the tail() method in Pandas, without specifying the number of rows, returns the last 5 rows of the DataFrame.
# Using tail() method with default (last 5 rows)
print("Last 5 rows of the DataFrame (default usage):\n", df.tail())
In the above example, df.tail() without any arguments gives you the last 5 rows of the df DataFrame. This is useful for quickly inspecting the end of the DataFrame and seeing the most recent entries.
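One edge case of the default is worth a small sketch: when the DataFrame has fewer than 5 rows, tail() simply returns every row and does not raise an error. The three-row frame named small below is a hypothetical example.

```python
import pandas as pd

# Hypothetical frame with only 3 rows
small = pd.DataFrame({
    "Courses": ["Spark", "Pandas", "Java"],
    "Fee": [22000, 26000, 35000],
})

# Default n=5 is larger than the frame, so all 3 rows come back
result = small.tail()
print(result)
print(len(result))
```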
Specify Number of Rows (last 3 rows)
To specify the number of rows you want to retrieve using the tail() method in Pandas, you can pass an integer argument indicating how many rows from the end of the DataFrame you want to see.
# Using tail() method
# To get the last 3 rows
print("Last 3 rows of the DataFrame:\n", df.tail(3))
# Output:
# Last 3 rows of the DataFrame:
# Courses Fee Discount Duration
# 4 Pandas 26000 2500 30days
# 5 PySpark 30000 2000 25days
# 6 Java 35000 2200 45days
In the above example, df.tail(3) specifies that we want to retrieve the last 3 rows of the df DataFrame. This can be particularly useful for examining recent data entries or for validating changes made to the DataFrame.
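For readers who think in terms of positional slicing, the following sketch compares tail(n) with a negative iloc slice. It uses a hypothetical single-column frame; the comparison is an illustration and not a statement about how you must write the selection.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"Fee": [22000, 25000, 23000, 24000, 26000, 30000, 35000]})

# For positive n, tail(n) selects the same rows as a negative positional slice
assert df.tail(3).equals(df.iloc[-3:])

# n=0 is the one case where the two differ: tail(0) is empty,
# while iloc[-0:] is the same as iloc[0:] and returns every row
empty = df.tail(0)
everything = df.iloc[-0:]
print(len(empty))
print(len(everything))
```

This is one reason tail() can be the safer choice when n comes from a variable that might be zero.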
Tail with Single Column DataFrame
The tail() method works on a single-column DataFrame just as it does on a multi-column DataFrame.
import pandas as pd
# Creating a single-column DataFrame
df = pd.DataFrame({'A': [10, 20, 30, 40, 50, 60, 70]})
# Using tail() method to get the last 3 rows
print("Last 3 rows of the single-column DataFrame:\n", df.tail(3))
# Output:
# Last 3 rows of the single-column DataFrame:
# A
# 4 50
# 5 60
# 6 70
In the above example, we create a single-column DataFrame named df with one column A. We use the tail(3) method to retrieve the last 3 rows of the DataFrame.
Tail with Multi-Index DataFrame
To use the tail() method with a Multi-Index DataFrame, you need to first create a DataFrame with multiple levels of indexing. Then, you can call the tail() method just as you would with a standard DataFrame.
import pandas as pd
# Creating a MultiIndex
technologies = [
    ["Spark", "Spark", "PySpark", "PySpark", "Python", "Pandas", "Pandas"],
    [22000, 25000, 23000, 24000, 26000, 30000, 35000],
]
index = pd.MultiIndex.from_arrays(technologies, names=('Courses', 'Fee'))
# Creating the DataFrame with MultiIndex
data = {
    'Discount': [1000, 2300, 1000, 1200, 2500, 2000, 2200],
}
df = pd.DataFrame(data, index=index)
print("Create DataFrame with MultiIndex:\n", df)
# Using tail() method to display the last 4 rows
print("Last 4 rows of the DataFrame:\n", df.tail(4))
# Output:
# Create DataFrame with MultiIndex:
# Discount
# Courses Fee
# Spark 22000 1000
# 25000 2300
# PySpark 23000 1000
# 24000 1200
# Python 26000 2500
# Pandas 30000 2000
# 35000 2200
# Last 4 rows of the DataFrame:
# Discount
# Courses Fee
# PySpark 24000 1200
# Python 26000 2500
# Pandas 30000 2000
# 35000 2200
In the above example, we create a Multi-Index DataFrame named df with two levels of indices, Courses and Fee. We use the tail() method to retrieve the last 4 rows of the DataFrame.
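Note that tail() is positional, so on a Multi-Index DataFrame it can cut through the middle of a group, as it does with PySpark above. If the goal is the last row within each group instead, one possible approach is to combine groupby() with its own tail() method. The sketch below rebuilds the same data and is only one way to do this.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [
        ["Spark", "Spark", "PySpark", "PySpark", "Python", "Pandas", "Pandas"],
        [22000, 25000, 23000, 24000, 26000, 30000, 35000],
    ],
    names=("Courses", "Fee"),
)
df = pd.DataFrame({"Discount": [1000, 2300, 1000, 1200, 2500, 2000, 2200]}, index=index)

# Plain tail(4) counts rows from the end, regardless of the index levels
last_four = df.tail(4)

# Last row within each Courses group, kept in the original row order
last_per_course = df.groupby(level="Courses").tail(1)
print(last_per_course)
```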
Using tail() on a DataFrame with DateTime Index
Similarly, you can use the tail() method on a DataFrame with a DateTime index. The method still selects rows by position, so when the index is sorted chronologically the last few rows are also the most recent dates.
import pandas as pd
# Creating a DateTime index
dates = pd.date_range('20240101', periods=10)
# Creating a DataFrame with DateTime index
data = {
    'Sales': [250, 300, 450, 500, 650, 700, 850, 900, 1050, 1100],
    'Revenue': [2000, 2100, 2500, 2700, 3000, 3200, 3500, 3700, 4000, 4200]
}
df_datetime = pd.DataFrame(data, index=dates)
# Using tail() method
# To get the last 3 rows
print("Last 3 rows of the DataFrame with DateTime index:\n", df_datetime.tail(3))
# Output:
# Last 3 rows of the DataFrame with DateTime index:
# Sales Revenue
# 2024-01-08 900 3700
# 2024-01-09 1050 4000
# 2024-01-10 1100 4200
In the above example, a DateTime index is created using pd.date_range() with 10 dates starting from 2024-01-01. A DataFrame named df_datetime is then created with two columns, Sales and Revenue, utilizing this DateTime index. Finally, the tail(3) method retrieves the last 3 rows of the DataFrame.
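The example above works because pd.date_range() produces dates in ascending order. When the dates are not stored in order, tail() still returns the last rows by position, which may not be the latest dates. A minimal sketch, assuming a hypothetical frame named df_unsorted with deliberately shuffled dates:

```python
import pandas as pd

# Hypothetical data whose dates are deliberately out of order
dates = pd.to_datetime(
    ["2024-01-03", "2024-01-01", "2024-01-05", "2024-01-02", "2024-01-04"]
)
df_unsorted = pd.DataFrame({"Sales": [450, 250, 650, 300, 500]}, index=dates)

# tail() is purely positional, so this returns the last 2 rows as stored
as_stored = df_unsorted.tail(2)
print(as_stored)

# Sorting the index first makes tail() return the 2 most recent dates
latest = df_unsorted.sort_index().tail(2)
print(latest)
```

Sorting with sort_index() before calling tail() is a simple way to make sure the result reflects the most recent dates.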
Frequently Asked Questions on Pandas DataFrame tail() Method
The tail() method returns the last n rows of a DataFrame. By default, it returns the last 5 rows if no argument is specified.
You can specify the number of rows you want by passing an integer to the method. For example, df.tail(3) will return the last 3 rows of the DataFrame df.
The tail() method works seamlessly with DataFrames that have a DateTime index, allowing you to inspect the last few rows of time series data.
The head() method returns the first n rows of a DataFrame, while the tail() method returns the last n rows. By default, both methods return 5 rows if no argument is specified.
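Since head() and tail() mirror each other, they can be combined to preview both ends of a DataFrame at once. The sketch below uses pd.concat() for this; the seven-row frame is a hypothetical example and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"A": [10, 20, 30, 40, 50, 60, 70]})

first_two = df.head(2)
last_two = df.tail(2)

# Stack both ends into one preview; original index labels are preserved
preview = pd.concat([first_two, last_two])
print(preview)
```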
The tail() method can also be used on a Pandas Series to return the last n elements of the Series.
Conclusion
In this article, I have explained the Pandas DataFrame tail() method, including its syntax, parameters, and usage. The return value of the tail() method in Pandas is a subset of the original DataFrame or Series, containing the last n rows or elements.
Happy Learning!!