NumPy Variance Function in Python

The NumPy var() function in Python computes the variance of the array elements along a specified axis or multiple axes. Variance is the average of the squared deviations from the mean: subtract the mean from each value, square the differences, sum them, and divide by the number of values.
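
As a rough sketch of that definition, the snippet below computes the variance by hand and compares it with np.var(). The array values and the variable names (data, manual_var) are illustrative assumptions, not part of the NumPy API.

```python
import numpy as np

# Illustrative data (any numeric values would work)
data = np.array([2, 7, 5, 8, 9, 4])

# Step 1: arithmetic mean of the values
mean = data.sum() / data.size

# Step 2: average of the squared deviations from the mean
manual_var = ((data - mean) ** 2).sum() / data.size

print("Manual variance:", manual_var)
print("np.var():       ", np.var(data))
```

Both lines should print the same value, up to floating-point rounding.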

1. Quick Examples of Variance Function

If you are in a hurry, below are some quick examples of the NumPy variance function.


# Quick examples of variance function
import numpy as np

# Example 1: Use numpy.var() function
arr = np.arange(15)
arr2 = np.var(arr)

# Example 2: Calculate the variance
# Using numpy.var() function
arr = np.array([2, 7, 5, 8, 9, 4])
arr1 = np.var(arr)

# Example 3: Get the var() value of an array 
# With specified datatype
arr1 = np.var(arr, dtype = np.float32)   

# Example 4: Calculate the variance using numpy.var() 
# Specify the data type as float64
arr2 = np.var(arr, dtype=np.float64)

# Example 5: Get the var() with 2-D array
arr = np.array([[3, 5, 7, 9], [2, 4, 6, 8]])
arr2 = np.var(arr) 

# Example 6: Get the variance of each of the 4 columns
# (reduce over rows with axis=0)
arr2 = np.var(arr, axis = 0)

# Example 7: Get the variance of each of the 2 rows
# (reduce over columns with axis=1)
arr2 = np.var(arr, axis = 1)

2. Syntax of NumPy var() Function

Following is the syntax of the numpy.var() function.


# Syntax of numpy.var()
numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)

2.1 Parameters of var()

Following are the parameters of the var() function.

a – array_like: An array containing the numbers whose variance is desired. If a is not an array, a conversion is attempted.
axis – [None or int or tuple of ints, optional]: Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array. For a 2-D array, axis = 0 returns one variance per column and axis = 1 returns one variance per row.
dtype – Optional. The data type used in computing the variance. The default is float64 for arrays of integer type. For arrays of float types, it is the same as the array type.
out – Optional. An alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.
ddof – Optional int, "Delta Degrees of Freedom". The divisor used in the calculation is N - ddof, where N is the number of elements. The default is 0.
keepdims – If this is set to True, the reduced axes are left in the result as dimensions with size one. With this option, the result broadcasts correctly against the input array. If the default value is used, keepdims is not passed through to the var() method of sub-classes of ndarray.
where – Optional, array_like of bool. Elements to include in the variance.
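
The keepdims and where parameters are easier to see in action. The following is a minimal sketch on an illustrative 2-D array. The names row_var_kept, scaled, and mask are made up for this example, and the where argument is assumed to need NumPy 1.20 or newer.

```python
import numpy as np

a = np.array([[3, 5, 7, 9], [2, 4, 6, 8]])

# Without keepdims the reduced axis is dropped: shape (2,)
row_var = np.var(a, axis=1)

# With keepdims=True the reduced axis stays with size one: shape (2, 1)
row_var_kept = np.var(a, axis=1, keepdims=True)
print(row_var.shape, row_var_kept.shape)

# The (2, 1) result broadcasts against the (2, 4) input
scaled = a / row_var_kept
print(scaled.shape)

# where: only elements under a True mask enter the calculation
# (assumes NumPy 1.20 or newer)
mask = a > 2
masked_var = np.var(a, where=mask)
print(masked_var, np.var(a[mask]))
```

The last line prints the masked variance next to the variance of the boolean-indexed elements, which should agree.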

2.2 Return Value of NumPy var()

It returns the variance of the array: a scalar value if axis is None, or an array with the variance values along the specified axis.

3. Use numpy.var() Function

You can use the numpy.var() function to calculate the variance of the arr array, which contains integers from 0 to 14.

With the default ddof=0, np.var() divides by the number of elements N, which gives the population variance. To get the sample variance, which divides by N - 1, pass ddof=1. The program uses the default behavior and prints the calculated variance.


# Import numpy
import numpy as np

# Create array 
arr = np.arange(15) 

# Use numpy.var() function
arr2 = np.var(arr) 
print("Variance value:\n",arr2)

# Output:
# Variance value:
#  18.666666666666668
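
To make the difference between population and sample variance concrete, here is a small sketch that runs np.var() on the same array with ddof=0 and ddof=1. The variable names population_var and sample_var are illustrative.

```python
import numpy as np

arr = np.arange(15)

# Default ddof=0: divide by N (population variance)
population_var = np.var(arr)

# ddof=1: divide by N - 1 (sample variance)
sample_var = np.var(arr, ddof=1)

print("Population variance:", population_var)
print("Sample variance:    ", sample_var)
```

For this array the sum of squared deviations is 280, so the two results are 280 / 15 and 280 / 14 respectively.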

4. Get NumPy var() of 1-D Array

You can calculate the variance of a 1-D array using the numpy.var() function. The example below applies it to the 1-D array arr and prints the result. You can replace the values in arr with any set of numbers for which you want to calculate the variance.


import numpy as np

# Create a 1-D array
arr = np.array([2, 7, 5, 8, 9, 4])
print("Original array:\n", arr)

# Calculate the variance 
# Using numpy.var() function
arr1 = np.var(arr)
print("Variance of the 1-D array:\n",arr1)

# Output:
# Original array:
#  [2 7 5 8 9 4]
# Variance of the 1-D array:
#  5.8055555555555545

5. Use Datatype Param

Let’s use the dtype parameter to specify the data type used when computing the variance, which is also the data type of the result. The result has lower precision if you use the float32 data type rather than the default float64.

In this program, the first call passes dtype=np.float32 and the second passes dtype=np.float64, so you can compare the two results. You can change the data type to any suitable NumPy data type according to your requirements.


import numpy as np

# Create a 1-D array
arr = np.array([2, 7, 5, 8, 9, 4])
print("Original array:\n", arr)

# Get the var() value of an array 
# With specified datatype
arr1 = np.var(arr, dtype = np.float32)   
print("Variance of the Data type:\n",arr1)

# Output:
# Variance of the Data type:
#  5.805556

# Calculate the variance using numpy.var() 
# Specify the data type as float64
arr2 = np.var(arr, dtype=np.float64)
print("Variance of the Data type:\n",arr2)

# Output:
# Variance of the Data type:
#  5.8055555555555545
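
As a quick check of what dtype changes, the sketch below inspects the type of each result and the gap between them. The names var32 and var64 are illustrative, and the exact size of the difference may vary slightly between NumPy versions.

```python
import numpy as np

arr = np.array([2, 7, 5, 8, 9, 4])

var32 = np.var(arr, dtype=np.float32)
var64 = np.var(arr, dtype=np.float64)

# The result carries the requested floating-point type
print(var32.dtype, var64.dtype)

# The two values differ only by float32 rounding error
print(abs(float(var32) - float(var64)))
```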

6. Get NumPy var() With 2-D Array

You can calculate the variance of a 2-D array using the numpy.var() function. For instance, the numpy.var() function is applied to the 2-D array arr. By default, it calculates the variance over the flattened array, that is, over all elements at once. The calculated variance will be printed as the output.


import numpy as np

# Create 2-D array
arr = np.array([[3, 5, 7, 9], [2, 4, 6, 8]])

# Get the var() with 2-D array
arr2 = np.var(arr)  
print("Variance of the 2-D array:\n",arr2)

# Output:
# Variance of the 2-D array:
#  5.25

7. Get the Variance With 2-D NumPy Array along Axis

You can also compute the variance of a NumPy array along a specified axis. For a 2-D array, pass axis=0 to the var() function to compute the variance of each column. Similarly, to compute the variance of each row, use axis=1.


# Get the variance of each of the 4 columns
# (reduce over rows with axis=0)
arr2 = np.var(arr, axis = 0)   
print("Variance along columns:\n",arr2) 

# Output:
# Variance along columns:
#  [0.25 0.25 0.25 0.25]
 
# Get the variance of each of the 2 rows
# (reduce over columns with axis=1)
arr2 = np.var(arr, axis = 1)
print("Variance along rows:\n",arr2) 

# Output:
# Variance along rows:
#  [5. 5.]
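
The axis parameter also accepts a tuple of axes. The following is a minimal sketch on a 3-D array. The array cube and its shape are assumptions chosen for illustration.

```python
import numpy as np

# A 3-D array with 2 blocks, each of 3 rows and 4 columns
cube = np.arange(24).reshape(2, 3, 4)

# Reduce over the last two axes: one variance per block
var_per_block = np.var(cube, axis=(1, 2))
print(var_per_block.shape)
print(var_per_block)

# keepdims=True keeps the reduced axes with size one
var_kept = np.var(cube, axis=(1, 2), keepdims=True)
print(var_kept.shape)
```

Each block holds 12 consecutive integers, so both entries of var_per_block should be equal.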

Frequently Asked Questions

What is the purpose of the NumPy variance function?

The NumPy variance function, numpy.var(), is used to calculate the variance of a dataset. Variance measures the dispersion or spread of a set of values. In statistics, it is the average of the squared differences from the Mean.

Can numpy.var() handle multi-dimensional arrays?

numpy.var() can handle multi-dimensional arrays. You can specify the axis along which the variance is calculated for multi-dimensional arrays.

How can I calculate the standard deviation using numpy.var()?

The standard deviation is the square root of the variance. You can calculate it using the numpy.sqrt() function after finding the variance using numpy.var(), or call numpy.std() directly.
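
A short sketch of both routes, using an illustrative array. The names std_from_var and std_direct are made up for this example.

```python
import numpy as np

arr = np.array([2, 7, 5, 8, 9, 4])

# Route 1: square root of the variance
std_from_var = np.sqrt(np.var(arr))

# Route 2: numpy.std() computes the standard deviation directly
std_direct = np.std(arr)

print(std_from_var, std_direct)
```

Both use ddof=0 by default, so the two values should match.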

What is variance in statistics?

Variance is a measure of how much the values in a dataset vary. It quantifies the spread or dispersion of a set of values. In the context of statistics, variance is the average of the squared differences from the Mean.

How do I calculate variance along a specific axis using numpy.var()?

To calculate the variance along a specific axis of a multi-dimensional array using numpy.var(), you can use the axis parameter. The axis parameter allows you to specify the axis or axes along which the variance will be calculated.

Conclusion

In this article, I have explained how to calculate the variance of a NumPy array over the whole array and along a specified axis or multiple axes. I have also explained how to use the optional dtype param to change the return data type.

Happy Learning!!
