The pandas DataFrame quantile() function returns the values at the given quantile over the requested axis. In other words, DataFrame.quantile() calculates the quantile of the values along a given axis and returns a Series or DataFrame.
To compute a quantile, this function arranges the data in ascending order. With the default linear interpolation, the zero-based position of the result is q*(n-1), where q is the quantile and n is the total number of elements; when that position falls between two elements, the result is interpolated between them. In this article, I will explain the pandas DataFrame quantile() function that returns a Series or DataFrame.
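As a rough illustration of the position rule above, here is a minimal pure-Python sketch of linear interpolation. The helper name linear_quantile is hypothetical (it is not part of pandas), and the sketch assumes the default 'linear' behavior with a zero-based position of q*(n-1).

```python
import math

def linear_quantile(values, q):
    """Hypothetical helper: quantile of a list using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    data = sorted(values)                   # arrange in ascending order
    pos = q * (len(data) - 1)               # zero-based fractional position
    lower = math.floor(pos)                 # index at or just below pos
    upper = min(lower + 1, len(data) - 1)   # next index, clamped to the end
    frac = pos - lower                      # how far pos sits between the two
    return data[lower] + frac * (data[upper] - data[lower])

ages = [22, 28, 34]
result = linear_quantile(ages, 0.6)
print(round(result, 2))  # 29.2
```

For the three ages, the position is 0.6*2 = 1.2, so the result lies 20% of the way from 28 to 34.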
Key Points –
- The quantile() function calculates the value at a given quantile, that is, the value below which a given fraction of the data falls.
- Quantile values must be between 0 and 1, where 0 represents the minimum, 0.5 the median, and 1 the maximum value.
- The quantile() function applies primarily to numerical data. Whether non-numeric columns are skipped depends on the numeric_only parameter, whose default changed from True to False in pandas 2.0.
- The axis parameter specifies whether quantiles are calculated down the rows, giving one result per column (axis=0, default), or across the columns, giving one result per row (axis=1).
- If the exact quantile falls between two data points, the interpolation parameter determines how the result is calculated (options include linear, lower, higher, nearest, and midpoint).
- Calculating quantiles can be computationally intensive for very large datasets, especially if done repeatedly or for multiple quantiles.
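To make the interpolation options more concrete, the following sketch compares them on a small, made-up column (the column name x and its values are assumptions for illustration). With four values, q=0.4 lands at position 1.2, between the values 2 and 3.

```python
import pandas as pd

# Hypothetical single-column data; q=0.4 falls between 2 and 3
df = pd.DataFrame({"x": [1, 2, 3, 4]})

methods = ["linear", "lower", "higher", "nearest", "midpoint"]

# Compute the same quantile once per interpolation option
results = {m: float(df.quantile(0.4, interpolation=m)["x"]) for m in methods}

for method, value in results.items():
    print(method, value)
```

Here lower picks 2, higher picks 3, nearest picks 2 (the closer neighbor), midpoint gives 2.5, and linear gives a value of about 2.2.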
Quick Examples of quantile() Function
If you are in a hurry, below are some quick examples of the pandas DataFrame quantile() function.
# Quick examples of quantile() function
# numeric_only=True skips non-numeric columns
# (the default is False since pandas 2.0)

# Example 1: Get the 0.6 quantile of every numeric column
df2 = df.quantile(0.6, numeric_only=True)

# Example 2: Get several quantiles along the index axis
df2 = df.quantile([0.25, 0.5, 0.75], axis=0, numeric_only=True)

# Example 3: Get the 0.4 quantile along the index axis (axis=0)
df2 = df.quantile(0.4, axis=0, numeric_only=True)

# Example 4: Get the 0.5 quantile along the columns axis (axis=1)
df2 = df.quantile(0.5, axis=1, numeric_only=True)
Syntax of Pandas DataFrame.quantile()
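Following is the syntax of the pandas DataFrame.quantile() function.
# Syntax of DataFrame.quantile()
# Note: numeric_only defaults to False since pandas 2.0 (it was True in earlier versions)
DataFrame.quantile(q=0.5, axis=0, numeric_only=False, interpolation='linear')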
Parameters of the quantile() Function
Following are the parameters of the quantile() function.
- q – A float or array-like, default 0.5 (the 50% quantile). The quantile(s) to compute, where 0 <= q <= 1.
- axis – The axis along which quantiles are computed. With axis=0 (or 'index') the quantile is computed down the rows, giving one result per column; with axis=1 (or 'columns') it is computed across the columns, giving one result per row.
- numeric_only – A bool (True or False). The default is False since pandas 2.0 and was True in earlier versions. If False, the quantile of datetime and timedelta data is computed as well.
- interpolation – {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}: This optional parameter specifies the interpolation method to use when the desired quantile lies between two data points. The default is ‘linear’.
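The effect of numeric_only can be sketched with a small DataFrame that mixes a numeric and a datetime column. The column names score and when and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical data with one numeric and one datetime column
events = pd.DataFrame({
    "score": [1, 2, 3],
    "when": pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-05"]),
})

# numeric_only=True keeps only the numeric column
numeric_median = events.quantile(0.5, numeric_only=True)

# numeric_only=False also computes the quantile of the datetime column
all_median = events.quantile(0.5, numeric_only=False)

print(numeric_median)
print(all_median)
```

The first result contains only score, while the second also reports the middle date for when.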
Return Value of quantile()
It returns a Series when q is a single float and a DataFrame when q is array-like.
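A quick way to see the two return types is to check them directly. This is a minimal sketch using a made-up, all-numeric DataFrame.

```python
import pandas as pd

# Hypothetical all-numeric data
df = pd.DataFrame({"Age": [22, 28, 34], "Height": [140, 145, 130]})

single = df.quantile(0.5)               # scalar q
multiple = df.quantile([0.25, 0.75])    # list-like q

print(type(single).__name__)    # Series
print(type(multiple).__name__)  # DataFrame
```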
Create Pandas DataFrame
Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named NumPy, which provides scientific computing in Python. A pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, floats, None, Python objects, etc.). You can think of it as an Excel spreadsheet or SQL table.
Let's create a pandas DataFrame from a list of tuples, where each tuple holds the values for one student and the column names are 'Student Names', 'Age', 'Height', and 'Weight'.
import pandas as pd
# Create a DataFrame from a list of (name, age, height, weight) tuples
students = [
    ("Jenny", 22, 140, 40),
    ("Charles", 28, 145, 50),
    ("Veena", 34, 130, 45)
]
df = pd.DataFrame(students, columns=["Student Names", "Age", "Height", "Weight"])
print(df)
Yields below output.
# Output:
  Student Names  Age  Height  Weight
0         Jenny   22     140      40
1       Charles   28     145      50
2         Veena   34     130      45
Use quantile() Function
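By using the quantile() function, let's calculate the quantile at 0.6 of the pandas DataFrame. With numeric_only=True, this calculates the quantile of every numeric column and excludes the string column. (Since pandas 2.0 the default is numeric_only=False, which raises an error for the non-numeric 'Student Names' column.)
# Use quantile() function
df2 = df.quantile(0.6, numeric_only=True)
print(df2)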
Yields below output.
# Output:
Age 29.2
Height 141.0
Weight 46.0
Name: 0.6, dtype: float64
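As an alternative way to exclude the string column, you could select the numeric columns yourself before calling quantile(). The sketch below rebuilds the same student data and uses select_dtypes(); the variable names students and numeric_df are chosen here for illustration.

```python
import pandas as pd

students = [
    ("Jenny", 22, 140, 40),
    ("Charles", 28, 145, 50),
    ("Veena", 34, 130, 45),
]
df = pd.DataFrame(students, columns=["Student Names", "Age", "Height", "Weight"])

# Keep only the numeric columns, then compute the quantile
numeric_df = df.select_dtypes(include="number")
result = numeric_df.quantile(0.6)
print(result)
```

This gives the same values as passing numeric_only=True, and it makes the column selection explicit.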
We can also get the (0.25, 0.5, 0.75) quantiles along the index axis using the quantile() function.
# Using quantile() function
# Get quantiles along the index axis
df2 = df.quantile([0.25, 0.5, 0.75], axis=0, numeric_only=True)
print(df2)
Yields below output.
# Output:
       Age  Height  Weight
0.25  25.0   135.0    42.5
0.50  28.0   140.0    45.0
0.75  31.0   142.5    47.5
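Because the result is indexed by the quantile values, individual entries can be looked up with .loc. The sketch below rebuilds the same student data and, as one possible use, subtracts the 0.25 row from the 0.75 row to get the interquartile range; the names quartiles and iqr are chosen here for illustration.

```python
import pandas as pd

students = [
    ("Jenny", 22, 140, 40),
    ("Charles", 28, 145, 50),
    ("Veena", 34, 130, 45),
]
df = pd.DataFrame(students, columns=["Student Names", "Age", "Height", "Weight"])

quartiles = df.quantile([0.25, 0.5, 0.75], numeric_only=True)

# Look up a single value by quantile label and column name
median_age = quartiles.loc[0.5, "Age"]

# Difference between the 0.75 and 0.25 rows (interquartile range)
iqr = quartiles.loc[0.75] - quartiles.loc[0.25]

print(median_age)
print(iqr)
```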
Get the Quantile Along the Axis = 0
Let's get the quantile at 0.4 using the df.quantile() function. We pass 0.4 as the first parameter and set the axis parameter to 0 so that the quantile is calculated down the rows, giving one value for each numeric column.
# Using quantile() function
# Get the quantiles along the index axis (axis=0)
df2 = df.quantile(0.4, axis=0, numeric_only=True)
print(df2)
Yields below output.
# Output:
Age 26.8
Height 138.0
Weight 44.0
Name: 0.4, dtype: float64
Get the Quantile Along the Axis = 1
Now let's calculate the quantile at 0.5 using the DataFrame.quantile() function over the columns axis (axis=1), which gives one value per row. In the example below, at index 0 the quantile of the three values (22, 140, 40) is 40.0, and at index 1 the quantile of the three values (28, 145, 50) is 50.0.
# Using quantile() function
# Get the quantiles along the columns axis (axis=1)
df2 = df.quantile(0.5, axis=1, numeric_only=True)
print(df2)
Yields below output.
# Output:
0 40.0
1 50.0
2 45.0
Name: 0.5, dtype: float64
FAQ on Pandas DataFrame quantile() Function
The quantile() function calculates the value at a given quantile along the specified axis of a DataFrame. For instance, you can find the median (50th percentile) or any other percentile value.
To calculate multiple quantiles at once in pandas, pass a list of quantiles to the quantile() function. Called on a DataFrame, it returns a DataFrame with one row per quantile; called on a Series, it returns a Series indexed by the quantiles.
With numeric_only=True, only numeric columns are used. With numeric_only=False, which is the default since pandas 2.0, an error occurs if the DataFrame contains non-numeric data such as strings.
You can calculate multiple quantiles in a single call by passing a list or an array of quantiles to the q parameter. This returns a DataFrame whose index holds the requested quantiles; its columns are the original columns for axis=0 or the original row labels for axis=1.
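The shape of the result for axis=1 can be checked with a short sketch that reuses the student data from this article. It is only an illustration of the layout described above.

```python
import pandas as pd

students = [
    ("Jenny", 22, 140, 40),
    ("Charles", 28, 145, 50),
    ("Veena", 34, 130, 45),
]
df = pd.DataFrame(students, columns=["Student Names", "Age", "Height", "Weight"])

# Several quantiles per row: the index holds q, the columns hold the row labels
result = df.quantile([0.25, 0.75], axis=1, numeric_only=True)

print(result)
print(type(result).__name__)  # DataFrame
```

The result has two rows (one per quantile) and three columns labeled 0, 1, and 2 (one per original row).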
To calculate the 95th percentile of data in a pandas DataFrame or Series, use the quantile() function and pass 0.95 as the argument.
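For example, a minimal sketch using the ages from this article as a Series:

```python
import pandas as pd

ages = pd.Series([22, 28, 34], name="Age")

# 95th percentile with the default linear interpolation
p95 = ages.quantile(0.95)
print(round(p95, 2))  # 33.4
```

With three values, the position is 0.95*2 = 1.9, so the result lies 90% of the way from 28 to 34.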
Conclusion
In this article, you have learned how to use the pandas DataFrame.quantile() function with several examples, along with its syntax and parameters.
Happy Learning !!