  • Post last modified:March 27, 2024
Pandas Series.quantile() Function

In Pandas, the Series.quantile() function computes the quantiles of a Series. Quantiles are cut points that divide sorted data into intervals holding equal shares of the values. Quartiles are the best-known case: three cut points that split the data into four parts, each holding 25% of the values. With this function you can calculate the following quartiles.

  1. First Quartile (Q1): The value below which 25% of the data falls.
  2. Second Quartile (Q2): The median, or the value below which 50% of the data falls.
  3. Third Quartile (Q3): The value below which 75% of the data falls.

In this article, I will explain the Series.quantile() function, its syntax and parameters, and how to use it to calculate various percentiles of a Series, such as the median (50th percentile), quartiles (25th and 75th percentiles), and deciles (10th to 90th percentiles).

Key Points –

  • Series.quantile() calculates the quantiles of a Series, which divide the data into intervals with equal probabilities.
  • The q parameter specifies the quantile(s) to compute. It defaults to 0.5, which is the median (50th percentile).
  • The interpolation parameter determines the method used for interpolation when the desired quantile lies between two data points. Options include linear, lower, higher, midpoint, and nearest.
  • It returns a float when q is a single number and a Series when q is array-like.
  • It offers flexibility in specifying the quantiles to compute, allowing users to explore different aspects of the distribution such as median, quartiles, and custom quantiles without complex manual calculations.

Syntax of Pandas Series.quantile() Function

Following is the syntax of the pandas Series.quantile() function.


# Syntax of Series.quantile() function
Series.quantile(q=0.5, interpolation='linear')

Parameters of the Series.quantile()

Following are the parameters of the quantile() function.

  • q – float or array-like, default 0.5 (median).
    • Specifies the quantile(s) to compute. It can be a single float or an array-like object of floats, and each value must be between 0 and 1 inclusive.
  • interpolation – {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}, optional
    • Specifies the interpolation method to use when the desired quantile lies between two data points. The default is ‘linear’.
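To make the interpolation options concrete, here is a minimal sketch that evaluates the same quantile with each method. The sample Series and the q=0.25 choice are only illustrative; for this data, q=0.25 lands between the values 4 and 6, so the methods disagree.

```python
import pandas as pd

series = pd.Series([2, 4, 6, 8, 10, 12])

# q=0.25 falls between the sorted values 4 and 6,
# so each interpolation method picks a different answer.
methods = ["linear", "lower", "higher", "midpoint", "nearest"]
results = {m: float(series.quantile(0.25, interpolation=m)) for m in methods}
print(results)
# {'linear': 4.5, 'lower': 4.0, 'higher': 6.0, 'midpoint': 5.0, 'nearest': 4.0}
```

When the requested quantile lands exactly on a data point, all five methods return that point.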

Return Value

The Series.quantile() function in pandas returns the computed quantile(s). If q is a single float, it returns a float. If q is array-like, it returns a Series whose index holds the requested quantile values.
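As a quick illustration of how the return type follows q, the sketch below calls the function once with a scalar and once with a list. The variable names are illustrative only.

```python
import pandas as pd

series = pd.Series([2, 4, 6, 8, 10, 12])

single = series.quantile(0.5)            # scalar q gives a scalar result
several = series.quantile([0.25, 0.5])   # list q gives a Series indexed by q

print(isinstance(single, float))         # True
print(isinstance(several, pd.Series))    # True
print(several.index.tolist())            # [0.25, 0.5]
```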

Get the Quantile to Calculate the Median (50th percentile)

To calculate the median (which corresponds to the 50th percentile) of a pandas Series, you can use the quantile() function with q=0.5.

Let’s create the Series using Python lists.


import pandas as pd

# Create a Series
series = pd.Series([2, 4, 6, 8, 10, 12])
print("Original Series:\n",series)

This prints the six values with their default integer index (0 to 5) and the dtype int64.

Let’s apply the quantile() function to this Series. Called without arguments, it uses the default q=0.5 and returns the median. For an even number of elements, the median is the average of the two middle values.


# Calculate median of the Series
median = series.quantile()
print("Calculate median:", median)

In the above example, the median of the Series [2, 4, 6, 8, 10, 12] is 7.0. This means that 50% of the data falls below the value of 7, and 50% falls above it. The middle values are at positions 3 and 4 (6 and 8), so Q2 is (6 + 8) / 2 = 7.


Calculate Lower Quartile (25th percentile)

To calculate the lower quartile, also known as the first quartile or 25th percentile, of a pandas Series, pass 0.25 to the quantile() function.


# Calculate lower quartile
lower_quartile = series.quantile(0.25)
print("Lower Quartile (25th percentile):", lower_quartile)

# Output:
# Lower Quartile (25th percentile): 4.5

In the above example, pandas finds the first quartile with its default linear interpolation as follows.

  1. Count the elements in the Series: n = 6.
  2. Sort them in ascending order: [2, 4, 6, 8, 10, 12].
  3. Compute the fractional position (n - 1) * q on a zero-based index: (6 - 1) * 0.25 = 1.25. This position lies between index 1 (value 4) and index 2 (value 6).
  4. Interpolate a quarter of the way from 4 to 6: 4 + 0.25 * (6 - 4) = 4.5. So Q1 is 4.5.

Note: Some textbooks use a different rule. If n * (1 / 4) is an integer, the first quartile is the mean of the numbers at positions n * (1 / 4) and n * (1 / 4) + 1. If n * (1 / 4) is not an integer, round it up and take the number at that position. For this Series that rule gives 4 (6 * 1/4 = 1.5, rounded up to position 2), which differs from the pandas default of 4.5. Passing interpolation='lower' returns 4 for this Series.
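As a cross-check on the arithmetic, here is a minimal sketch of linear interpolation at the fractional position (n - 1) * q, which is how the default interpolation='linear' behaves. The helper linear_quantile is a hypothetical name written for this illustration and is not part of pandas.

```python
import math
import pandas as pd

series = pd.Series([2, 4, 6, 8, 10, 12])

def linear_quantile(values, q):
    """Hypothetical helper: quantile by linear interpolation between neighbours."""
    data = sorted(values)
    pos = (len(data) - 1) * q   # fractional, zero-based position
    lo = math.floor(pos)        # index of the neighbour below
    hi = math.ceil(pos)         # index of the neighbour above
    frac = pos - lo             # how far to move from lo towards hi
    return data[lo] + (data[hi] - data[lo]) * frac

manual_q1 = linear_quantile(series.tolist(), 0.25)
print(manual_q1, series.quantile(0.25))   # 4.5 4.5
```

For this Series, pandas returns 4.5 with the default settings, matching the manual calculation.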

Calculate Upper Quartile (75th percentile)

To calculate the upper quartile (75th percentile) of a pandas Series, pass 0.75 to the quantile() function.


# Calculate upper quartile
upper_quartile = series.quantile(0.75)
print("Upper Quartile (75th percentile):", upper_quartile)

# Output:
# Upper Quartile (75th percentile): 9.5

In the above example, pandas finds the upper quartile with its default linear interpolation as follows.

  1. Count the elements in the Series: n = 6.
  2. Sort them in ascending order: [2, 4, 6, 8, 10, 12].
  3. Compute the fractional position (n - 1) * q on a zero-based index: (6 - 1) * 0.75 = 3.75. This position lies between index 3 (value 8) and index 4 (value 10).
  4. Interpolate three quarters of the way from 8 to 10: 8 + 0.75 * (10 - 8) = 9.5. So, the upper quartile of the series [2, 4, 6, 8, 10, 12] is 9.5.

Note: Some textbooks use a different rule. If n * (3 / 4) is an integer, the third quartile is the mean of the numbers at positions n * (3 / 4) and n * (3 / 4) + 1. If n * (3 / 4) is not an integer, round it up and take the number at that position. For this Series that rule gives 10 (6 * 3/4 = 4.5, rounded up to position 5), which differs from the pandas default of 9.5. Passing interpolation='higher' returns 10 for this Series.

Calculate Deciles using Series quantile() Function

You can use the quantile() function with the desired quantile values to calculate deciles. Deciles divide the data into ten equal parts, each containing 10% of the data.


# Calculate deciles
# float() keeps the printed list plain; NumPy 2.x otherwise shows np.float64(...)
deciles = [float(series.quantile(i/10)) for i in range(1, 10)]
print("Calculate deciles:", deciles)

# Output:
# Calculate deciles: [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

So, the deciles for the series [2, 4, 6, 8, 10, 12] are [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]. Each value represents the corresponding decile, starting from the 10th percentile up to the 90th percentile.
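The deciles can also be requested in a single call by passing all nine levels as a list, which returns a Series indexed by those levels. A minimal sketch, where decile_levels is an illustrative name:

```python
import pandas as pd

series = pd.Series([2, 4, 6, 8, 10, 12])

# Nine cut points: 0.1, 0.2, ..., 0.9
decile_levels = [i / 10 for i in range(1, 10)]

# One call returns all deciles as a Series indexed by level
deciles = series.quantile(decile_levels)
print(deciles)
```

This avoids calling quantile() once per level and keeps each result labelled with its quantile.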

Calculate Custom Quantiles

Similarly, you can calculate custom quantiles by passing an array-like object containing the desired quantile values to the quantile() function. It returns the value at each specified quantile.


# Calculate custom quantiles
custom_quantiles = series.quantile([0.1, 0.3, 0.7, 0.9])
print("Custom Quantiles:\n", custom_quantiles)

# Output:
# Custom Quantiles:
# 0.1     3.0
# 0.3     5.0
# 0.7     9.0
# 0.9    11.0
# dtype: float64

Frequently Asked Questions on Pandas Series.quantile() Function

What is the purpose of the Series.quantile() function in pandas?

The Series.quantile() function in pandas calculates quantiles of a Series, which divide the data into intervals with equal probabilities. It lets you compute any percentile or quantile of the data in a single call.

How do I calculate the median using Series.quantile()?

You can calculate the median using the Series.quantile() function by specifying the quantile value of 0.5, which corresponds to the median (50th percentile).
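A short check, using the same sample Series as the rest of the article, that the 0.5 quantile agrees with Series.median():

```python
import pandas as pd

series = pd.Series([2, 4, 6, 8, 10, 12])

median_via_quantile = series.quantile(0.5)
median_direct = series.median()

print(median_via_quantile, median_direct)   # 7.0 7.0
```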

Can I calculate quartiles using Series.quantile()?

You can calculate quartiles using the Series.quantile() function in pandas. Quartiles are specific percentiles that divide a dataset into four equal parts. The lower quartile (Q1) corresponds to the 25th percentile, the median (Q2) corresponds to the 50th percentile, and the upper quartile (Q3) corresponds to the 75th percentile.
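For instance, all three quartiles can be requested in one call. This is a minimal sketch with the default linear interpolation, and the names q1, q2, and q3 are illustrative:

```python
import pandas as pd

series = pd.Series([2, 4, 6, 8, 10, 12])

# Request Q1, Q2 (median), and Q3 together
quartiles = series.quantile([0.25, 0.5, 0.75])
q1, q2, q3 = quartiles.tolist()

print(q1, q2, q3)   # 4.5 7.0 9.5
```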

How do I calculate custom quantiles?

To calculate custom quantiles using the Series.quantile() function in pandas, you need to specify an array-like object containing the desired quantile values.

What does the interpolation parameter do in Series.quantile()?

The interpolation parameter in Series.quantile() specifies the method used for interpolation when the desired quantile lies between two data points. Options include ‘linear’, ‘lower’, ‘higher’, ‘midpoint’, and ‘nearest’. It helps in handling scenarios where the quantile value falls between two data points.

Conclusion

In this article, I have explained, with examples, how the quantile() function in Pandas calculates various percentiles of a Series, including the median, quartiles, deciles, and any other custom percentiles.

Happy Learning!!

Malli

Malli is an experienced technical writer with a passion for translating complex Python concepts into clear, concise, and user-friendly articles. Over the years, he has written hundreds of articles on Pandas, NumPy, and Python, and takes pride in his ability to bridge the gap between technical experts and end-users.