  • Post category:NumPy / Python
  • Post last modified:March 27, 2024
NumPy percentile() Function

The NumPy percentile() function computes the q-th percentile of the array elements along the specified axis. In statistics, a percentile is the value below which a given percentage of the values in a data set fall.

In this article, I will explain the syntax of the NumPy percentile() function and use it to compute percentile values for 1-dimensional and 2-dimensional arrays, with examples of its main parameters.

1. Quick Examples of NumPy percentile() Function

If you are in a hurry, below are some quick examples of how to use NumPy percentile() function.


# Quick examples of numpy percentile() function
import numpy as np

# Example 1: Get the 50th percentile of 1-D array
arr = np.array([2, 3, 5, 8, 9,4])
arr2 = np.percentile(arr, 50)

# Example 2: Get the 75th percentile of 1-D array  
arr2 = np.percentile(arr, 75)

# Example 3: Get the 50th percentile of 2-D array  
arr = np.array([[6, 8, 4],[ 9, 5, 7]])             
arr2 = np.percentile(arr, 50)

# Example 4: Get the percentile along the axis = 0               
arr2 = np.percentile(arr, 75, axis=0)

# Example 5: Get the percentile along the axis = 1                 
arr2 = np.percentile(arr, 75, axis=1)

# Example 6: Get the percentile along axis=1 with keepdims=True
arr2 = np.percentile(arr, 75, axis=1, keepdims=True)

2. Syntax of NumPy percentile()

Following is the syntax of the numpy.percentile() function.


# Syntax of numpy.percentile() 
numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False)

2.1 Parameters of percentile()

The percentile() function allows the following parameters.

  • a – array_like. Input array, or an object that can be converted to an array, for which you want to compute the percentile.
  • q – array_like of float. Percentile or sequence of percentiles to compute; each value must be between 0 and 100 inclusive.
  • axis – Axis or axes along which the percentile is computed. By default (None), the percentile is computed over the flattened array. For a 2-D array, axis=0 returns one value per column and axis=1 returns one value per row.
  • out – Alternative output array in which to place the result. If not specified, a new array is created.
  • overwrite_input – bool, default False. If True, NumPy is allowed to modify the input array during intermediate calculations to save memory, so the contents of the input should not be relied on after the call.
  • method – str, default 'linear'. The method used to estimate the percentile when it lies between two data points. Before NumPy 1.22 this parameter was named interpolation.
  • keepdims – If this is set to True, the axes which are reduced are left in the result as dimensions with size one.
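
As a rough illustration of the parameters above, the sketch below passes a sequence of percentiles in one call and then tries a few interpolation choices. It assumes NumPy 1.22 or later, where the interpolation choice is passed as the method keyword; the variable names are only for illustration.

```python
import numpy as np

arr = np.array([2, 3, 5, 8, 9, 4])

# A sequence of percentiles returns one value per requested percentile
quartiles = np.percentile(arr, [25, 50, 75])
print(quartiles.tolist())  # [3.25, 4.5, 7.25]

# The 75th percentile falls between the sorted values 5 and 8;
# method decides how that gap is resolved (assumes NumPy >= 1.22)
lower = np.percentile(arr, 75, method="lower")        # take the value below
higher = np.percentile(arr, 75, method="higher")      # take the value above
midpoint = np.percentile(arr, 75, method="midpoint")  # average of the two
print(lower, higher, midpoint)
```

The non-default methods are mostly useful when the result has to be an actual element of the data rather than an interpolated value.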

2.2 Return Value of percentile()

It returns a scalar when a single percentile is requested over the whole array, and otherwise an array holding the percentile values along the specified axis.

3. Usage of NumPy percentile() Function

In statistics, a percentile describes how a score compares to other scores in the same set. There is no universal definition, but a percentile is commonly expressed as the percentage of values in a data set that fall below a given value. As a general rule, if a value is at the nth percentile, it is greater than n percent of the values.

For example, a student who scores at the 90th percentile did better than 90% of the students who took the test; only 10% scored higher.

To get the 50th percentile of a 1-D array using NumPy, you can use the numpy.percentile() function. For example, arr is your 1-D array, and np.percentile(arr, 50) calculates the 50th percentile (which is also the median) of the array. The result will be printed, and it represents the value below which 50% of the data falls.


# Import numpy
import numpy as np

# Create a 1-D array
arr = np.array([2, 3, 5, 8, 9, 4])
print("Original array:",arr)

# Get the 50th percentile of 1-D array
arr2 = np.percentile(arr, 50)
print("50th Percentile:",arr2)

# Output:
# Original array: [2 3 5 8 9 4]
# 50th Percentile: 4.5

This result indicates that the median of the array is 4.5, the midpoint between the two middle values (4 and 5) of the sorted array, so half of the values lie below it and half above.
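
Since the 50th percentile is the median, a quick way to check the result is to compare it with np.median(). This is only a small sanity check on the same array:

```python
import numpy as np

arr = np.array([2, 3, 5, 8, 9, 4])

p50 = np.percentile(arr, 50)  # 50th percentile
med = np.median(arr)          # median of the same data
print(p50, med)               # 4.5 4.5
```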

Similarly, to get the 75th percentile of a 1-D array, pass 75 as the second argument. Here np.percentile(arr, 75) calculates the 75th percentile of the same 1-D array arr, which is the value below which roughly 75% of the data falls.


# Get the 75th percentile of 1-D array  
arr2 = np.percentile(arr, 75)
print("75th Percentile:",arr2)

# Output:
# 75th Percentile: 7.25

This result indicates that 7.25 is the 75th percentile of the array. With the default linear method, NumPy interpolates between the sorted values 5 and 8, so about three quarters of the data lies below 7.25.
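
To see where this value comes from, here is a minimal sketch of linear interpolation between ranks, which is how the default behaviour can be reproduced by hand. The helper linear_percentile is hypothetical and written only for this illustration; it is not part of NumPy.

```python
import numpy as np

def linear_percentile(values, q):
    """Hypothetical helper: percentile by linear interpolation between ranks."""
    data = np.sort(np.asarray(values, dtype=float))
    rank = q / 100 * (len(data) - 1)   # fractional position in the sorted data
    lo = int(np.floor(rank))           # index of the value just below
    hi = min(lo + 1, len(data) - 1)    # index of the value just above
    frac = rank - lo                   # how far between the two values
    return data[lo] + frac * (data[hi] - data[lo])

arr = np.array([2, 3, 5, 8, 9, 4])

# Sorted data is [2, 3, 4, 5, 8, 9]; rank = 0.75 * 5 = 3.75,
# so the result lies 75% of the way from 5 to 8
manual = linear_percentile(arr, 75)
print(manual, np.percentile(arr, 75))
```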

4. Get the Percentile Value of 2-D Array

To get the 50th percentile of a 2-D array using NumPy, you can use the numpy.percentile() function. For instance, arr is your 2-D array, and np.percentile(arr, 50) calculates the 50th percentile of the entire 2-D array. The result will be printed, and it represents the value below which 50% of the data falls when considering all the values in the array.


# Create 2-D array
arr = np.array([[6, 8, 4],[ 9, 5, 7]])
print("Original 2D array:\n",arr)

# Get the 50th percentile of 2-D array               
arr2 = np.percentile(arr, 50)
print("50th percentile of 2-D array:\n",arr2)

# Output:
# Original 2D array:
#  [[6 8 4]
#  [9 5 7]]
# 50th percentile of 2-D array:
#  6.5
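
Because no axis is given, the 2-D array is treated as if it were flattened. The short check below compares the call with an explicit arr.ravel(), just to make that behaviour visible:

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

whole = np.percentile(arr, 50)         # axis=None: uses all six values
flat = np.percentile(arr.ravel(), 50)  # same values as a 1-D array
print(whole, flat)                     # 6.5 6.5
```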

5. Get the Percentile along the Axis

If you want to compute percentiles along a specific axis of a 2-D array, pass the axis parameter to numpy.percentile(). In the example below:

  • arr2 holds the 75th percentile along axis 0, computed down the rows. The result is an array containing the 75th percentile of each column.
  • arr3 holds the 75th percentile along axis 1, computed across the columns. The result is an array containing the 75th percentile of each row.

Choose the axis according to whether you need one value per column (axis=0) or one value per row (axis=1).


# Get the percentile along the axis = 0               
arr2 = np.percentile(arr, 75, axis=0)
print("75th Percentile along axis 0 (rows):\n",arr2)

# Output:
# 75th Percentile along axis 0 (rows):
#  [8.25 7.25 6.25]

# Get the percentile along the axis = 1                 
arr3 = np.percentile(arr, 75, axis=1)
print("75th Percentile along axis 1 (columns):\n",arr3)

# Output:
# 75th Percentile along axis 1 (columns):
#  [7. 8.]
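
One way to convince yourself which direction each axis reduces is to compare the result with an explicit loop and to look at the output shapes. The sketch below does both; the loop is only for illustration and is not how you would normally compute this.

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

# axis=0 gives one value per column; compare with a column-by-column loop
per_column = np.percentile(arr, 75, axis=0)
looped = [np.percentile(arr[:, j], 75) for j in range(arr.shape[1])]
print(per_column.tolist(), [float(v) for v in looped])

# With a sequence of percentiles, the first output axis indexes the percentiles
both_axis0 = np.percentile(arr, [25, 75], axis=0)
both_axis1 = np.percentile(arr, [25, 75], axis=1)
print(both_axis0.shape)  # (2, 3): 2 percentiles x 3 columns
print(both_axis1.shape)  # (2, 2): 2 percentiles x 2 rows
```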

6. Use axis=1 and keepdims = True

You can also combine axis with the keepdims argument, which keeps the reduced axis in the result as a dimension of size one.

The following example uses numpy.percentile() to calculate the 75th percentile along axis 1 of the 2-D array arr with keepdims=True. The result, stored in arr2, has shape (2, 1) instead of (2,), so it has the same number of dimensions as arr.


# Get the percentile along axis=1 with keepdims=True
arr2 = np.percentile(arr, 75, axis=1, keepdims=True)
print("75th Percentile along axis 1 with keepdims=True:\n",arr2)

# Output:
# 75th Percentile along axis 1 with keepdims=True:
#  [[7.]
#  [8.]]
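
Keeping the reduced axis is mainly useful for broadcasting the result back against the original array. As a small sketch, the code below compares every element with the 75th percentile of its own row; the names row_p75, centered, and above are only illustrative.

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

# Shape (2, 1): one percentile per row, kept as a column
row_p75 = np.percentile(arr, 75, axis=1, keepdims=True)
print(row_p75.shape)  # (2, 1)

# The (2, 1) result broadcasts across the columns of the (2, 3) input
centered = arr - row_p75  # distance of each element from its row's 75th percentile
above = arr > row_p75     # True where an element exceeds it
print(centered.tolist())  # [[-1.0, 1.0, -3.0], [1.0, -3.0, -1.0]]
print(above.tolist())     # [[False, True, False], [True, False, False]]
```

Without keepdims=True the result would have shape (2,), which does not line up with the rows of a (2, 3) array when broadcasting.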

What does the numpy.percentile() function do?

The numpy.percentile() function calculates the q-th percentile of the data along a specified axis. Percentiles are statistical measures that indicate the relative standing of a particular value within a dataset. The result represents the value below which a given percentage of observations in a group of observations falls.

How is the percentile specified in the function?

The q parameter in the function specifies the percentile to be computed, and it must be between 0 and 100, inclusive.
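
As a quick check of the limits, the sketch below shows that the 0th and 100th percentiles are the minimum and maximum of the data, and that a value outside the 0 to 100 range raises a ValueError:

```python
import numpy as np

arr = np.array([2, 3, 5, 8, 9, 4])

lowest = np.percentile(arr, 0)     # same as arr.min()
highest = np.percentile(arr, 100)  # same as arr.max()
print(lowest, highest)             # 2.0 9.0

# A percentile outside [0, 100] is rejected
try:
    np.percentile(arr, 101)
    out_of_range_rejected = False
except ValueError as err:
    out_of_range_rejected = True
    print("ValueError:", err)
```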

How can I calculate percentiles along a specific axis?

You can use the axis parameter to specify the axis along which the percentiles are computed. For example, axis=0 calculates percentiles along columns, and axis=1 calculates percentiles along rows.

What does the keepdims parameter do?

The keepdims parameter, when set to True, retains the reduced dimensions with size 1 in the output. This is useful when you want the result to have the same number of dimensions as the input array, so that it broadcasts correctly against it.

What is the difference between median and percentile?

The median is a specific percentile, namely the 50th percentile. It represents the middle value of a dataset. Percentiles, in general, represent the value below which a certain percentage of observations fall. The median is the value below which 50% of the observations fall.

Can I use the numpy.percentile() function with a 2-D array?

Yes. The numpy.percentile() function handles arrays of any number of dimensions, and you can use the axis parameter to specify the axis along which the percentiles are calculated.

Conclusion

In this article, I have explained how to use the NumPy percentile() function to get percentile values for 1-dimensional and 2-dimensional arrays, along with its main parameters.

Happy Learning!!
