How to Use median() in NumPy?

  • Post category:NumPy / Python
  • Post last modified:November 14, 2023

The NumPy median() function computes the median of a given NumPy array along a specified axis or along multiple axes. The median is the value that separates the higher half of a data sample from the lower half; in other words, it is the middle value once the values are sorted in ascending order.

The numpy.median() function is used to calculate the median of single-dimensional as well as multi-dimensional arrays. In this article, I will explain how to use the NumPy median() function in Python to return the median of the array elements.

1. Quick Examples of median() Function

If you are in a hurry, below are some quick examples of how to use the NumPy median() function.


# Quick examples of median() function
import numpy as np

# Example 1: Compute the median value of 1D array
arr = np.array([12, 7, 15, 8, 9, 5, 3])
median_value = np.median(arr)

# Example 2: Get the median value of 2D array
arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19], [32, 24, 19, 28]])
median_value = np.median(arr)

# Example 3: Use numpy median() along axis 0
# Get the median value of each column
arr1 = np.median(arr, axis=0)

# Example 4: Use numpy median() along axis 1
# Get the median value of each row
arr2 = np.median(arr, axis=1)

# Example 5: Use numpy.median() function
# with the out parameter
median = np.zeros(arr.shape[0])
arr2 = np.median(arr, axis=1, out=median)

2. Syntax of median()

Following is the syntax of the numpy.median() function.


# Syntax of numpy.median()
numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)

2.1 Parameters of median()

  • a – The input array, or an object that can be converted to an array, holding the data to be used in the calculation. The examples in this article pass it as arr.
  • axis – The axis or axes along which medians are computed. By default (axis=None), the median is computed over the flattened array. For a 2D array, axis=0 reduces down the rows and returns one median per column, while axis=1 reduces across the columns and returns one median per row.
  • out – (Optional) An alternative output array in which to place the result. It must have the same shape as the expected output. If not specified, a new array is created.
  • overwrite_input – (Optional) A boolean. If True, median() is allowed to use the memory of the input array a for its calculations, so the input may be modified (for example, partially sorted) by the call. The default is False.
  • keepdims – (Optional) A boolean. If True, the axes that are reduced are left in the result as dimensions with size one. The default is False.
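As a rough illustration of the keepdims and overwrite_input parameters, the sketch below reuses the 3x4 array from this article. The variable names (row_medians_kept, centred, scratch) are chosen here for illustration only, and copying the input before using overwrite_input=True is just one cautious way to use it.

```python
import numpy as np

arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19], [32, 24, 19, 28]])

# Without keepdims the reduced axis disappears: shape (3,)
row_medians = np.median(arr, axis=1)

# With keepdims=True the reduced axis stays with size one: shape (3, 1)
row_medians_kept = np.median(arr, axis=1, keepdims=True)

# The (3, 1) result broadcasts against the (3, 4) input,
# for example to centre each row on its own median
centred = arr - row_medians_kept

# overwrite_input=True allows median() to reorder the input in place,
# so work on a copy if the original order still matters
scratch = arr.copy()
median_all = np.median(scratch, axis=None, overwrite_input=True)

print(row_medians.shape)
print(row_medians_kept.shape)
print(median_all)
```

Keeping the reduced axis is mostly useful when the result has to be combined with the original array afterwards, as in the centring step.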

2.2 Return Value of median()

This function returns the median of the array or an array with medians along the specified axis.
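The two kinds of return value can be checked with a short sketch like the one below. It assumes the integer 2D array used throughout this article; the names overall and per_column are illustrative.

```python
import numpy as np

arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19], [32, 24, 19, 28]])

# No axis: the whole array is reduced to a single NumPy float scalar
overall = np.median(arr)

# With an axis: the result is an array with one median per column
per_column = np.median(arr, axis=0)

print(type(overall).__name__, overall)
print(type(per_column).__name__, per_column.dtype, per_column.shape)
```

Note that the result is a float even though the input holds integers, because the median of an even number of values may fall between two of them.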

3. Usage of NumPy median() Function

The numpy.median() function calculates the median value along a specified axis of single-dimensional and multi-dimensional arrays. The median is the middle value of a dataset, which makes it a measure of central tendency that is less sensitive to outliers than the mean.
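To see why the median is less sensitive to outliers than the mean, consider the small sketch below. The data with the extreme value 1500 is made up for illustration and does not come from a real dataset.

```python
import numpy as np

# Seven ordinary values, and the same values with one extreme outlier
clean = np.array([3, 5, 7, 8, 9, 12, 15])
with_outlier = np.array([3, 5, 7, 8, 9, 12, 1500])

# The mean is pulled far upward by the single outlier
mean_clean = np.mean(clean)
mean_outlier = np.mean(with_outlier)

# The median stays at the middle value in both cases
median_clean = np.median(clean)
median_outlier = np.median(with_outlier)

print("mean:  ", mean_clean, "->", mean_outlier)
print("median:", median_clean, "->", median_outlier)
```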

The median value is calculated with the following steps:

  • Take the given data points.
  • Arrange data in ascending order.
  • If the total number of terms is odd, then the median value is equal to the middle term.
  • If the total number of terms is even, then the median value is equal to the average of the terms in the middle.
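The steps above can be sketched in plain Python as follows. The helper manual_median is a hypothetical function written for this illustration, not part of NumPy, and it uses a full sort for clarity rather than whatever strategy NumPy uses internally.

```python
import numpy as np

def manual_median(values):
    """Illustrative helper (not part of NumPy): median via a full sort."""
    ordered = sorted(values)      # arrange data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the median is the middle term
        return float(ordered[mid])
    # Even count: the median is the average of the two middle terms
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [12, 7, 15, 8, 9, 5, 3]
even_data = [5, 9, 7, 11]

print(manual_median(odd_data), np.median(odd_data))
print(manual_median(even_data), np.median(even_data))
```

Both calls should agree with np.median() on these inputs, which makes the helper a convenient cross-check when learning the function.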

3.1 Get the Median Value of 1-D NumPy Array

To get the median value of a one-dimensional NumPy array, use the numpy.median() function. First, create the 1-D NumPy array, then pass it as input to median().

In the example below, arr holds one-dimensional data, and np.median(arr) calculates its median, which is then printed to the console. Conceptually, the function orders the elements and returns the middle value.


# Import NumPy Module
import numpy as np

# Create a NumPy array
arr = np.array([12, 7, 15, 8, 9, 5, 3])
print("Original array:",arr)

# Compute the median value of 1D array
median_value = np.median(arr)
print("Median Value:", median_value)

This prints a median value of 8.0, the fourth of the seven values once they are sorted as 3, 5, 7, 8, 9, 12, 15.

4. Get the Median value of the Multi-Dimensional Array

You can create a 2D NumPy array and then pass it to np.median() to calculate the median value of the entire array. The program below initializes a 2D array arr, calculates its median with np.median(), and prints the result to the console.

When median() is applied to a multi-dimensional NumPy array without an axis, it returns the median of all elements, because by default the median is computed over the flattened array. In the following example the array has 12 elements, so the two middle values of the sorted data are 14 and 15, and the function returns their average, 14.5.


# Create 2D numpy array
arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19],[32, 24, 19, 28]])
print("Original 2D array:\n",arr)

# Get the median value of 2D array
median_value = np.median(arr) 
print("Median value of the 2D array:\n", median_value)

This prints 14.5 as the median value of the 2D array.

Keep in mind that np.median(arr) calculates the median across all elements in the array, treating it as a flattened 1D array. To compute the median along a specific axis (for example, per column or per row), use the axis parameter, as in the quick examples above and in section 5.
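The flattening behaviour can be verified by hand with a sketch like the one below, which sorts all 12 elements and picks the two middle ones. The names flat_sorted and middle_pair are illustrative.

```python
import numpy as np

arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19], [32, 24, 19, 28]])

# axis=None makes np.sort flatten the array before sorting
flat_sorted = np.sort(arr, axis=None)

# 12 values, so the two middle ones sit at indices 5 and 6
middle_pair = flat_sorted[5:7]

print(flat_sorted)
print(middle_pair, middle_pair.mean())

# The default call and an explicit flatten give the same answer
print(np.median(arr) == np.median(arr.ravel()))
```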

5. Get the Median Value of an Array Along an Axis

If you want to calculate the median value along a specific axis in a 2D NumPy array, use the axis parameter of numpy.median(). np.median(arr, axis=0) calculates the median of each column (reducing along axis 0), and np.median(arr, axis=1) calculates the median of each row (reducing along axis 1). The following example demonstrates both.


# Use numpy median() along axis 0
# Get the median value of each column
arr1 = np.median(arr, axis=0)
print("Median along axis 0 (columns):\n", arr1)

# Output:
# Median along axis 0 (columns):
#  [ 8. 14. 15. 19.]

# Use numpy median() along axis 1
# Get the median value of each row
arr2 = np.median(arr, axis=1)
print("Median along axis 1 (rows):\n", arr2)

# Output:
# Median along axis 1 (rows):
#  [ 8.  14.5 26. ]
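The axis parameter also accepts a tuple when the median should be taken over multiple axes at once. The sketch below uses a made-up 3D array (cube, built with np.arange) purely for illustration, and compares the result with an equivalent reshape.

```python
import numpy as np

# A 2 x 3 x 4 array holding the values 0..23
cube = np.arange(24).reshape(2, 3, 4)

# Reduce over the first two axes together, leaving the last axis: shape (4,)
over_two_axes = np.median(cube, axis=(0, 1))

# Equivalent view: merge the first two axes, then reduce along axis 0
merged = np.median(cube.reshape(-1, 4), axis=0)

print(over_two_axes)
print(np.array_equal(over_two_axes, merged))
```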

6. Use the out Parameter

The out parameter of numpy.median() specifies an output array in which the result is placed. This array must have the shape of the expected result; when calculating the median along the rows (axis=1) of the 2D array used here, that means one value per row.

In this example, the median array is pre-allocated with one element per row of the original array, so it can be passed as out. After the call, arr2 contains the median values along axis 1 (rows) of the original 2D array.


# Create 2D numpy array
arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19],[32, 24, 19, 28]])
print("Original 2D array:\n",arr)

# Use numpy.median() function with the out parameter
# Pre-allocate one output slot per row
median = np.zeros(arr.shape[0])
arr2 = np.median(arr, axis=1, out=median)
print("Median value:\n", arr2)

# Output:
# Median value:
#  [ 8.  14.5 26. ]
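A small sketch can confirm that the result really lands in the pre-allocated array. The name buffer is illustrative, and np.empty is used here on the assumption that every element gets overwritten by the call.

```python
import numpy as np

arr = np.array([[5, 9, 7, 11], [8, 14, 15, 19], [32, 24, 19, 28]])

# Pre-allocate a float buffer with one slot per row
buffer = np.empty(arr.shape[0], dtype=np.float64)

# The medians are written into buffer instead of a fresh array
result = np.median(arr, axis=1, out=buffer)

print(buffer)
# The returned value refers to the same memory as the buffer
print(np.shares_memory(result, buffer))
```

Reusing one buffer this way can avoid repeated allocations when the same reduction runs many times, for example inside a loop.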

Frequently Asked Questions

How do I calculate the median of a 1D NumPy array?

Calculating the median of a 1D NumPy array is straightforward: pass the array to the numpy.median() function. For example, if data is a 1D array, np.median(data) returns its median value.

How can I calculate the median along a specific axis in a 2D array?

To calculate the median along a specific axis in a 2D NumPy array, you can use the axis parameter of the numpy.median() function. The axis parameter allows you to specify the axis along which the median should be calculated.

How do I use the out parameter with numpy.median()?

The out parameter in NumPy functions allows you to specify an array where the result should be placed. When using numpy.median(), you can utilize the out parameter to store the result in a pre-allocated array.

How do I handle NaN values when calculating the median?

To handle NaN (Not a Number) values when calculating the median in NumPy, you can use the numpy.nanmedian() function. This function calculates the median while ignoring any NaN values in the input array.
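The difference between the two functions can be seen in a short sketch. The readings array with a single NaN is made up for illustration.

```python
import numpy as np

# Made-up data with one missing value
readings = np.array([12.0, np.nan, 7.0, 15.0, 8.0])

# np.median propagates the NaN into the result
plain = np.median(readings)

# np.nanmedian ignores the NaN and uses the remaining four values
ignoring_nan = np.nanmedian(readings)

print(plain)
print(ignoring_nan)
```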

Can I calculate the median of a flattened array?

You can calculate the median of a flattened array using the numpy.median() function. By default, numpy.median() operates on the flattened array, treating it as a one-dimensional array.

Can I calculate the median of a masked array?

You can calculate the median of a masked array using the numpy.ma.median() function. Masked arrays are arrays in which certain elements are marked as invalid or not to be used in calculations.
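As a minimal sketch, the example below masks one value that is assumed to be invalid (999 is a made-up sentinel) and computes the median of the rest. The names readings and masked_median are illustrative.

```python
import numpy as np

# Made-up data where 999 marks an invalid reading
readings = np.ma.masked_array(
    [12, 7, 999, 8, 9],
    mask=[False, False, True, False, False],
)

# Only the unmasked values 7, 8, 9, 12 take part in the calculation
masked_median = float(np.ma.median(readings))

print(masked_median)
```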

Conclusion

In this article, I have explained how to use the numpy.median() function to compute the median value of an array, either over all elements or along a specified axis. The median is the value that separates the higher half of a data sample from the lower half; in other words, it is the middle value once the values are sorted in ascending order.

Happy Learning!!

Malli

Malli is an experienced technical writer with a passion for translating complex Python concepts into clear, concise, and user-friendly articles. Over the years, he has written hundreds of articles on Pandas, NumPy, and Python, and takes pride in his ability to bridge the gap between technical experts and end-users.