  • Post last modified:December 11, 2024
Pandas DataFrame quantile() Function

The pandas DataFrame.quantile() function returns the values at the given quantile(s) over the requested axis. Passing a single quantile returns a Series, and passing a list of quantiles returns a DataFrame.


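To compute a quantile, the function sorts the data in ascending order and locates the position q*(n-1), counted from zero, where q is the quantile and n is the number of elements. When that position falls between two values, the default linear interpolation blends them. For example, for the three values 22, 28, 34 and q = 0.6, the position is 0.6*2 = 1.2, so the result is 28 + 0.2*(34 - 28) = 29.2. In this article, I will explain the pandas DataFrame quantile() function with examples.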
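The position rule can be checked by hand. The sketch below assumes pandas' default linear interpolation, where the zero-based position in the sorted data is q*(n-1); the helper name linear_quantile is hypothetical and only serves the illustration.

```python
import math
import pandas as pd

def linear_quantile(values, q):
    """Hypothetical helper: linear interpolation at zero-based position q * (n - 1)."""
    data = sorted(values)
    pos = q * (len(data) - 1)   # fractional position in the sorted data
    lo = math.floor(pos)        # index of the value just below the position
    hi = math.ceil(pos)         # index of the value just above the position
    frac = pos - lo             # how far the position lies between the two
    return data[lo] + (data[hi] - data[lo]) * frac

ages = [22, 28, 34]
by_hand = linear_quantile(ages, 0.6)
by_pandas = pd.Series(ages).quantile(0.6)

# Both values are approximately 29.2
print(by_hand, by_pandas)
```

The hand calculation and pandas agree up to floating-point rounding, which is why the comparison is best done with a tolerance rather than with ==.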

Key Points –

  • The quantile() function is used to calculate the value at a given quantile, which is a specified percentile of the data distribution.
  • Quantile values must be between 0 and 1, where 0 represents the minimum, 0.5 the median, and 1 the maximum value.
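  • The quantile() function is primarily applicable to numerical data. Pass numeric_only=True to skip non-numeric columns; this was the default before pandas 2.0, while newer versions default to numeric_only=False.
  • The axis parameter controls the direction of the calculation: axis=0 (the default) returns one quantile per column, and axis=1 returns one quantile per row.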
  • If the exact quantile falls between two data points, the interpolation parameter determines how the result is calculated (options include linear, lower, higher, nearest, and midpoint).
  • Calculating quantiles can be computationally intensive for very large datasets, especially if done repeatedly or for multiple quantiles.

Quick Examples of quantile() Function

If you are in a hurry, below are some quick examples of pandas DataFrame quantile() function.


# Quick examples of quantile() function

# Example 1: Get the 0.6 quantile of each numeric column
df2 = df.quantile(0.6, numeric_only=True)

# Example 2: Get several quantiles at once
# along the index axis (one value per column)
df2 = df.quantile([0.25, 0.5, 0.75], axis=0, numeric_only=True)

# Example 3: Get the 0.4 quantile
# along the index axis (axis=0)
df2 = df.quantile(0.4, axis=0, numeric_only=True)

# Example 4: Get the 0.5 quantile
# along the column axis (axis=1, one value per row)
df2 = df.quantile(0.5, axis=1, numeric_only=True)

Syntax of Pandas DataFrame.quantile()

Following is the syntax of the Pandas DataFrame.quantile().


# Syntax of DataFrame.quantile()
# (numeric_only defaults to False since pandas 2.0; it was True before)
DataFrame.quantile(q=0.5, axis=0, numeric_only=False, interpolation='linear', method='single')

Parameters of the quantile() Function

Following are the parameters of the quantile() function.

  • q – A float or array-like, default 0.5 (the 50% quantile). Each value must satisfy 0 <= q <= 1.
  • axis – The axis to compute along. With axis=0 (the default) the quantile is computed for each column; with axis=1 it is computed for each row.
  • numeric_only – A bool. When True, only numeric columns are included. When False, datetime and timedelta columns are included as well. The default is False since pandas 2.0; earlier versions defaulted to True.
  • interpolation – {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}, default ‘linear’. This optional parameter specifies which value to return when the requested quantile lies between two data points.
  • method – {‘single’, ‘table’}, default ‘single’. Whether to compute the quantiles column by column (‘single’) or over all columns at once (‘table’).
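To make the interpolation options concrete, here is a minimal sketch that applies each one to the same three values at q = 0.6, where the position 1.2 falls between 28 and 34. The variable names are illustrative only.

```python
import pandas as pd

ages = pd.Series([22, 28, 34])

# Compute the 0.6 quantile once per interpolation option
results = {
    option: float(ages.quantile(0.6, interpolation=option))
    for option in ["linear", "lower", "higher", "nearest", "midpoint"]
}

for option, value in results.items():
    print(option, value)
```

Here lower and nearest both pick 28, higher picks 34, midpoint averages the two neighbours to 31, and linear lands at roughly 29.2.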

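Return Value of quantile()

It returns a Series when q is a single float and a DataFrame when q is a list or array. In the Series, the index holds the column labels (or the row labels for axis=1); in the DataFrame, the index holds the requested quantiles.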
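The return type depends on the shape of q. The following sketch, using a small all-numeric DataFrame made up for illustration, shows both cases.

```python
import pandas as pd

df = pd.DataFrame({"Age": [22, 28, 34], "Height": [140, 145, 130]})

single = df.quantile(0.5)             # scalar q: one value per column
several = df.quantile([0.25, 0.75])   # list q: one row per quantile

print(type(single).__name__)
print(type(several).__name__)
print(several.index.tolist())
```

With a scalar q the result is a Series named after the quantile; with a list the result is a DataFrame whose index is the list of quantiles.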

Create Pandas DataFrame

Python pandas is widely used for data science, data analysis, and machine learning applications. It is built on top of NumPy, which provides scientific computing in Python. A pandas DataFrame is a 2-dimensional labeled data structure with rows and columns, where the columns can hold different types such as integers, strings, floats, or Python objects. You can think of it as an Excel spreadsheet or a SQL table.

Let's create a pandas DataFrame from a list of tuples, where each tuple holds one student's values and the column names are 'Student Names', 'Age', 'Height', and 'Weight'.


import pandas as pd

# Create a DataFrame from a list of (name, age, height, weight) tuples
students = [
    ("Jenny", 22, 140, 40),
    ("Charles", 28, 145, 50),
    ("Veena", 34, 130, 45),
]
df = pd.DataFrame(students, columns=["Student Names", "Age", "Height", "Weight"])
print(df)

Yields below output.


# Output:
  Student Names  Age  Height  Weight
0         Jenny   22     140      40
1       Charles   28     145      50
2         Veena   34     130      45

Use quantile() Function

Using the quantile() function, let's calculate the 0.6 quantile of the pandas DataFrame. Passing numeric_only=True calculates the quantile of every numeric column and excludes the string column; since pandas 2.0 this argument must be given explicitly, because the default is now False.


# Use quantile() function
df2 = df.quantile(0.6, numeric_only=True)
print(df2)

Yields below output.


# Output:
Age        29.2
Height    141.0
Weight     46.0
Name: 0.6, dtype: float64

We can also get the (0.25, 0.5, 0.75) quantiles along the index axis, using the quantile() function.


# Using quantile() function
# Get several quantiles along the index axis
df2 = df.quantile([0.25, 0.5, 0.75], axis=0, numeric_only=True)
print(df2)

Yields below output.


# Output:
       Age  Height  Weight
0.25  25.0   135.0    42.5
0.50  28.0   140.0    45.0
0.75  31.0   142.5    47.5

Get the Quantile Along the Axis = 0

Now get the 0.4 quantile using the df.quantile() function. We pass 0.4 as the first parameter and set the axis parameter to 0, so that one quantile is calculated for each column.


# Using quantile() function
# Get the quantiles along the index axis (axis=0)
df2 = df.quantile(0.4, axis=0, numeric_only=True)
print(df2)

Yields below output.


# Output:
Age        26.8
Height    138.0
Weight     44.0
Name: 0.4, dtype: float64

Get the Quantile Along the Axis = 1

Next, calculate the 0.5 quantile over the column axis using the DataFrame.quantile() function, which returns one value per row. In the example below, the quantile of the three numeric values at index 0 (22, 140, 40) is 40.0, and at index 1 (28, 145, 50) it is 50.0.


# Using quantile() function
# Get the quantiles along the column axis (axis=1)
df2 = df.quantile(0.5, axis=1, numeric_only=True)
print(df2)

Yields below output.


# Output:
0    40.0
1    50.0
2    45.0
Name: 0.5, dtype: float64
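Since the 0.5 quantile is the median, the row-wise result can be cross-checked against median(). This is a small sketch that rebuilds only the numeric columns of the example data, so no numeric_only argument is needed.

```python
import pandas as pd

df_num = pd.DataFrame({
    "Age": [22, 28, 34],
    "Height": [140, 145, 130],
    "Weight": [40, 50, 45],
})

row_quantile = df_num.quantile(0.5, axis=1)   # one value per row
row_median = df_num.median(axis=1)            # same statistic via median()

print(row_quantile.tolist())
print(row_median.tolist())
```

The two Series hold the same values; they differ only in their name, which is 0.5 for the quantile() result.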

FAQ on Pandas DataFrame quantile() Function

What does the quantile() function do?

The quantile() function calculates the value at a given quantile along the specified axis of a DataFrame. For instance, you can find the median (50th percentile) or any other percentile value.

How to calculate multiple quantiles at once?

To calculate multiple quantiles at once in Pandas, you can use the quantile() function and pass a list of quantiles to it. The function returns the specified quantiles for the entire DataFrame or a Series, depending on the context.

What happens if the DataFrame contains non-numeric columns?

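With numeric_only=True, only numeric columns are used and the rest are skipped. With numeric_only=False, which is the default since pandas 2.0, datetime and timedelta columns are included too, but columns of other types such as strings raise an error.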
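As an illustration of the numeric_only behavior, the sketch below uses a small made-up DataFrame with one numeric, one datetime, and one timedelta column, similar to the example in the pandas documentation. The data is hypothetical and not part of the student DataFrame used above.

```python
import pandas as pd

# Hypothetical sample data: numeric, datetime, and timedelta columns
mixed = pd.DataFrame({
    "A": [1, 2],
    "B": [pd.Timestamp("2010"), pd.Timestamp("2011")],
    "C": [pd.Timedelta("1 days"), pd.Timedelta("2 days")],
})

# numeric_only=False keeps the datetime and timedelta columns
all_columns = mixed.quantile(0.5, numeric_only=False)

# numeric_only=True keeps only column A
numeric_columns = mixed.quantile(0.5, numeric_only=True)

print(all_columns)
print(numeric_columns)
```

The median of the two timestamps is the midpoint between them, and the median of the two timedeltas is one and a half days.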

How is interpolation used in quantile()?

The interpolation parameter decides which value is returned when the requested quantile falls between two data points i and j. With linear (the default), the result is interpolated between the two; lower returns i, higher returns j, nearest returns whichever of the two is closer, and midpoint returns (i + j) / 2.

How do I calculate the 95th percentile?

To calculate the 95th percentile of data in a Pandas DataFrame or Series, you can use the quantile() function and pass 0.95 as the argument.
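A minimal sketch of the 95th percentile, using the numeric columns of this article's example data. On a single column the result is a scalar, and on the DataFrame it is one value per column.

```python
import pandas as pd

df_num = pd.DataFrame({
    "Age": [22, 28, 34],
    "Height": [140, 145, 130],
    "Weight": [40, 50, 45],
})

p95_age = df_num["Age"].quantile(0.95)   # scalar for one column
p95_all = df_num.quantile(0.95)          # Series with one value per column

# Approximately 33.4 for Age, 144.5 for Height, and 49.5 for Weight
print(p95_age)
print(p95_all)
```

With only three rows, the 95th percentile sits close to the maximum of each column, because the position 0.95*(3-1) = 1.9 is nearly at the last sorted value.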

Conclusion

In this article, you have learned how to use the pandas DataFrame.quantile() function, including its syntax and parameters, and how to calculate quantiles along either axis with examples.

Happy Learning !!
