NumPy nanmean() - Get Mean Ignoring NaN Values

The Python NumPy nanmean() function computes the arithmetic mean (average) of an array along a specified axis while ignoring NaN (Not a Number) values. If the array contains NaN values, you can still compute the average of the remaining values without the NaNs influencing the result. By default the mean is taken over the flattened array; otherwise it is taken over the specified axis.

1. Quick Examples of NumPy nanmean() Function

If you are in a hurry, below are some quick examples of how to use the NumPy nanmean() function in Python.


# Quick examples of numpy nanmean() function

# Import numpy
import numpy as np

# Example 1: Get the nanmean of 2D array
arr = np.array([[8, np.nan, 16], [6, 12, 25]])
arr2 = np.nanmean(arr)

# Example 2: Use mean() on the same array
# Without nanmean(), the result is nan
arr2 = np.mean(arr)

# Example 3: Calculate the nanmean along axis 0 (column-wise)
# Ignoring NaN values
arr = np.array([[24, 32, 85],[57, np.nan, 16],[8, 17, np.nan],[43, 78, 39]])
arr2 = np.nanmean(arr, axis=0)

# Example 4: Calculate the mean along axis 0 (columns)
arr2 = np.mean(arr, axis=0)

# Example 5: Calculate the nanmean along axis 1 (row-wise)
# Ignoring NaN values
arr2 = np.nanmean(arr, axis=1)

# Example 6: Calculate the mean along axis 1 (rows)
arr2 = np.mean(arr, axis = 1)

2. Syntax of numpy.nanmean() Function

Following is the syntax of numpy.nanmean() function.


# Syntax of numpy.nanmean()
numpy.nanmean(arr, axis=None, dtype=None, out=None, keepdims=<no value>)

2.1 Parameters of numpy.nanmean()

Following are the parameters of the nanmean() function.

arr – The input array for which you want to calculate the mean. In the NumPy signature this parameter is named a, so pass it positionally.
axis (optional) – Axis or axes along which the means are computed, given as an integer or a tuple of integers. The default (None) computes the mean of the flattened array. For a 2-D array, axis=0 computes column-wise means and axis=1 computes row-wise means.
dtype (optional) – The data type used to compute the mean. For integer inputs the default is float64; for floating-point inputs it is the same as the input dtype.
out (optional) – Alternate output array in which to place the result. It must have the same shape as the expected output.
keepdims (optional) – If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
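The optional parameters above can be illustrated with a minimal sketch. The sample values and variable names (row_means, centered, means32, buf) are made up for this illustration and are not part of the original examples.

```python
import numpy as np

# Sample data (hypothetical values chosen for this illustration)
arr = np.array([[1.0, np.nan, 3.0],
                [4.0, 5.0, np.nan]])

# keepdims=True keeps the reduced axis as a dimension of size one
row_means = np.nanmean(arr, axis=1, keepdims=True)
print(row_means.shape)   # (2, 1)

# The (2, 1) result broadcasts against the (2, 3) input,
# so each row can be centered on its own NaN-ignoring mean
centered = arr - row_means
print(centered)

# dtype selects the type used for the computation and the result
means32 = np.nanmean(arr, axis=0, dtype=np.float32)
print(means32.dtype)     # float32

# out places the result in an existing array of the right shape
buf = np.empty(3)
np.nanmean(arr, axis=0, out=buf)
print(buf)
```

Here keepdims is what makes the subtraction line up row by row; without it the result would have shape (2,) and would not broadcast against the rows of arr.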

2.2 Return Value

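The numpy.nanmean() function returns the arithmetic mean or average of the input array, excluding NaN values. The result is a scalar when no axis is given, and an array of means when an axis is specified. If the array (or a slice along the given axis) contains only NaN values, NaN is returned for it and NumPy issues a RuntimeWarning.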
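The all-NaN case can be checked with a short sketch. The array all_nan and the use of the standard warnings module to capture the warning are choices made for this illustration only.

```python
import warnings
import numpy as np

# An array that contains nothing but NaN (hypothetical example data)
all_nan = np.array([np.nan, np.nan])

# Record warnings instead of printing them, so they can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = np.nanmean(all_nan)

print(result)                                   # nan
print([w.category.__name__ for w in caught])    # warning categories raised
```

If you expect all-NaN slices in your data, checking the result with np.isnan() afterwards is a simple way to handle them.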

3. Usage of NumPy nanmean() Function

The NumPy nanmean() function calculates the arithmetic mean or average of an array, excluding NaN values. By default, the mean is taken over the flattened array, but you can specify the axis for multi-dimensional arrays. It is particularly useful when working with datasets that may contain missing or undefined values.

The numpy.nanmean() function takes much the same arguments as the numpy.mean() function. Use numpy.nanmean() to calculate the mean of a NumPy array that contains NaN values. If you do not specify the axis, it returns the mean of all the non-NaN values in the array.


# Import numpy
import numpy as np

# Create 2D array with NaN values
arr = np.array([[8, np.nan, 16], [6, 12, 25]])
print("Original 2D array:\n",arr)

# Get the nanmean of 2-D array
arr2 = np.nanmean(arr)
print("Mean (ignoring NaN values):\n", arr2)

# Output:
# Original 2D array:
#  [[ 8. nan 16.]
#  [ 6. 12. 25.]]
# Mean (ignoring NaN values):
#  13.4

The mean of the array, excluding the NaN value, is (8 + 16 + 6 + 12 + 25) / 5 = 13.4. The NaN value is left out of both the sum and the count. If you calculated the mean with the regular np.mean() function instead, the single NaN would make the whole result nan, whereas np.nanmean() excludes it.
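To make the "ignore NaN" step concrete, the same result can be reproduced by hand with a boolean mask. This is only a sketch of the idea, not a description of how NumPy implements nanmean() internally; the names valid and manual_mean are introduced here for illustration.

```python
import numpy as np

arr = np.array([[8, np.nan, 16], [6, 12, 25]])

# Boolean mask that is True wherever a value is not NaN
valid = ~np.isnan(arr)

# Mean over the valid values only: (8 + 16 + 6 + 12 + 25) / 5
manual_mean = arr[valid].sum() / valid.sum()

print(manual_mean)        # manual calculation
print(np.nanmean(arr))    # same value from nanmean()
print(np.mean(arr))       # nan, because mean() does not skip NaN
```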

4. Get the nanmean() Values of 2-D Array along Axis = 0

If you want to calculate the mean along axis 0 (column-wise) for a 2-D array while ignoring NaN values, you can use the numpy.nanmean() function. For instance, np.nanmean(arr, axis=0) calculates the mean along axis 0, which is column-wise. The resulting mean values are printed.


import numpy as np

# Create 2D array with NaN values
arr = np.array([[24, 32, 85],[57, np.nan, 16],[8, 17, np.nan],[43, 78, 39]])
print("Original array:\n",arr)

# Calculate the nanmean along axis 0 (column-wise)
# Ignoring NaN values
arr2 = np.nanmean(arr, axis=0)
print("Mean along axis 0 (column-wise, ignoring NaN values):\n", arr2)

# Output:
# Original array:
#  [[24. 32. 85.]
#  [57. nan 16.]
#  [ 8. 17. nan]
#  [43. 78. 39.]]
# Mean along axis 0 (column-wise, ignoring NaN values):
#  [33.         42.33333333 46.66666667]

# Calculate the mean along axis 0 (columns)
arr2 = np.mean(arr, axis=0)
print("Mean along axis 0 (columns):\n",arr2)

# Output:
# Mean along axis 0 (columns):
#  [33. nan nan]

5. Get the nanmean() Values of 2-D Array along Axis = 1

To calculate the mean along axis 1 (row-wise) for a 2-D array while ignoring NaN values, you can use the numpy.nanmean() function. For example, np.nanmean(arr, axis=1) calculates the mean along axis 1, which is row-wise. The resulting mean values for each row are printed.


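import numpy as np

# Create 2D array with NaN values
arr = np.array([[24, 32, 85],[57, np.nan, 16],[8, 17, np.nan],[43, 78, 39]])
print("Original array:\n",arr)

# Calculate the nanmean along axis 1 (row-wise)
# Ignoring NaN values
arr2 = np.nanmean(arr, axis=1)
print("Mean along axis 1 (row-wise, ignoring NaN values):\n", arr2)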

# Output:
# Original array:
#  [[24. 32. 85.]
#  [57. nan 16.]
#  [ 8. 17. nan]
#  [43. 78. 39.]]
# Mean along axis 1 (row-wise, ignoring NaN values):
#  [47.         36.5        12.5        53.33333333]

# Calculate the mean along axis 1 (rows)
arr2 = np.mean(arr, axis = 1)
print("Mean along axis 1 (rows):\n",arr2)

# Output:
# Mean along axis 1 (rows):
#  [47.                 nan         nan 53.33333333]

Frequently Asked Questions

What is nanmean() in NumPy?

The nanmean() function calculates the mean of an array along a specified axis, excluding NaN (Not a Number) values. This is particularly useful when dealing with datasets that may contain missing or undefined values.

How is nanmean() different from mean() in NumPy?

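The mean() function in NumPy includes every element in the computation, so a single NaN in the array (or in a row or column, when an axis is given) makes the corresponding result NaN. The nanmean() function excludes NaN values from both the sum and the count, so the result is the mean of the remaining values.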

How do I use nanmean() with multi-dimensional arrays?

You can specify the axis along which the mean is calculated by providing the axis parameter. This allows you to calculate the mean along a specific axis while ignoring NaN values.
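As a small sketch of the axis parameter beyond 2-D, the example below passes a tuple of axes for a 3-D array. The array data and its values are hypothetical and chosen so the results are easy to check by hand.

```python
import numpy as np

# Hypothetical 3-D data with shape (2, 2, 3)
data = np.array([[[1.0, np.nan, 3.0],
                  [4.0, 5.0, 6.0]],
                 [[np.nan, 8.0, 9.0],
                  [10.0, 11.0, np.nan]]])

# Reduce over axes 0 and 1 together, leaving one mean
# for each position along the last axis
means = np.nanmean(data, axis=(0, 1))

print(means.shape)   # (3,)
print(means)         # [5. 8. 6.]
```

Each output value averages the three non-NaN entries at that position, for example (1 + 4 + 10) / 3 = 5 for the first one.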

What happens if I use nanmean() on an array without NaN values?

If there are no NaN values in the array, nanmean() will behave the same as the regular mean() function, as it will calculate the mean without excluding any values.
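A quick check of this behavior, using a small made-up array named clean that contains no NaN values:

```python
import numpy as np

# Array without any NaN values (hypothetical example data)
clean = np.array([[2.0, 4.0], [6.0, 8.0]])

# Overall mean: both functions give the same value
print(np.nanmean(clean), np.mean(clean))   # 5.0 5.0

# Column-wise means also match
same = np.array_equal(np.nanmean(clean, axis=0), np.mean(clean, axis=0))
print(same)   # True
```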

Can I use nanmean() with non-numeric data types?

The nanmean() function in NumPy is designed for numeric arrays. It cannot be used with non-numeric data types. If you attempt to use nanmean() with an array containing non-numeric data, you will likely encounter an error.

Does nanmean() modify the original array?

The nanmean() function in NumPy does not modify the original array. It reads the array and returns the result separately: a scalar when no axis is given, or a new array of mean values when an axis is specified. The original array, including its NaN values, is left unchanged.
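The following sketch checks this by comparing the array before and after the calls. The names before, overall, and per_column are introduced for this illustration, and the equal_nan argument of np.array_equal() assumes a reasonably recent NumPy version.

```python
import numpy as np

arr = np.array([[24, 32, 85], [57, np.nan, 16]])
before = arr.copy()   # snapshot of the original contents

overall = np.nanmean(arr)              # scalar, since no axis is given
per_column = np.nanmean(arr, axis=0)   # new 1-D array of column means

print(overall)
print(per_column)

# The input is unchanged; equal_nan=True treats NaN as equal to NaN
print(np.array_equal(arr, before, equal_nan=True))   # True
```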

Conclusion

In this article, I have explained, with examples, how to use the NumPy nanmean() function in Python to calculate the arithmetic mean (average) of an array along a specified axis while ignoring NaN (Not a Number) values.

Happy Learning!!
