How To Compute Standard Deviation in NumPy

To find the standard deviation of an array in Python, use the numpy.std() function. The standard deviation is the square root of the average of the squared deviations from the mean. By default, it is calculated over the flattened array, but you can change this by specifying the axis parameter.

To calculate the standard deviation by hand, first compute the mean of the NumPy array as arr.sum()/N, where N = len(arr). Then compute std = sqrt(mean(x)), where x = abs(arr - arr.mean())**2 is the array of squared deviations from the mean.
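The formula above can be sketched directly in NumPy. This is only an illustration of the arithmetic, not a description of how NumPy implements std() internally; the helper name manual_std and the sample data are hypothetical and used only for this example.

```python
import numpy as np

def manual_std(arr):
    """Population standard deviation, step by step (hypothetical helper)."""
    arr = np.asarray(arr, dtype=np.float64)
    mean = arr.sum() / arr.size              # average of all elements
    squared_dev = np.abs(arr - mean) ** 2    # squared deviations from the mean
    return np.sqrt(squared_dev.mean())       # square root of their average

data = np.array([1.0, 2.0, 3.0, 4.0])        # made-up sample data
print(manual_std(data))
print(np.isclose(manual_std(data), np.std(data)))  # True
```

Both approaches divide by N (the number of elements), which matches the default behaviour of np.std().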

1. Quick Examples of Python NumPy Standard Deviation Function

If you are in a hurry, below are some quick examples of calculating the standard deviation of a NumPy array.


# Below are the quick examples
import numpy as np

# Example 1: Use std() on 1-D array
arr = np.array([5, 6, 4])
arr1 = np.std(arr)

# Example 2: Use std() on 2-D array
arr = np.array([[2, 3],
                [2, 5]])
arr1 = np.std(arr)

# Example 3: Get the standard deviation along axis = 0
arr1 = np.std(arr, axis=0)

# Example 4: Get the standard deviation along axis = 1
arr1 = np.std(arr, axis=1)

# Example 5: Get the standard deviation computed in float32
arr1 = np.std(arr, dtype=np.float32)

2. Syntax of std()

Following is the syntax of std(), showing its most commonly used parameters.


#  Syntax of numpy.std() 
numpy.std(arr, axis=None, dtype=None, out=None) 

2.1 Parameters of std()

Following are the parameters of std().

  • arr – Input array whose standard deviation is calculated.
  • axis – None, an int, or a tuple of ints: the axis or axes along which the standard deviation is computed. With the default (None), it is computed over the flattened array. For a 2-D array, axis=0 gives one value per column and axis=1 gives one value per row.
  • dtype – Data type to use while computing the standard deviation.
  • out – Alternative output array in which to place the result. It must have the same shape as the expected output.
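As a rough illustration of the axis and out parameters listed above, the sketch below reduces a 3-D array over two axes at once and writes the result into a preallocated array. The array named cube and its values are made up for this example.

```python
import numpy as np

# Hypothetical 3-D data: 2 blocks, each of shape 3 x 4
cube = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# axis as a tuple: reduce over the last two axes, leaving one value per block
per_block = np.std(cube, axis=(1, 2))
print(per_block.shape)  # (2,)

# out: place the result in a preallocated array of the matching shape
result = np.empty(2, dtype=np.float64)
np.std(cube, axis=(1, 2), out=result)
print(np.allclose(result, per_block))  # True
```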

2.2 Return Value of std()

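It returns the standard deviation of the array elements. For integer input the result is float64; for floating-point input the result has the same precision as the input. You can change this by specifying the dtype parameter.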
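The return type can be checked with a short sketch. In NumPy, integer input yields a float64 result, while floating-point input keeps its own precision unless dtype says otherwise. The variable names below are chosen only for this illustration.

```python
import numpy as np

int_arr = np.array([5, 6, 4])                     # integer input
f32_arr = np.array([5, 6, 4], dtype=np.float32)   # float32 input

print(np.std(int_arr).dtype)                      # float64
print(np.std(f32_arr).dtype)                      # float32
print(np.std(int_arr, dtype=np.float32).dtype)    # float32
```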

3. Usage of NumPy std()

NumPy std() is a statistical function that computes the standard deviation of single- and multi-dimensional arrays, optionally along a specified axis and with a specified data type.

Let's create a NumPy array using the np.array() function and calculate its standard deviation using the numpy.std() function. For example,


import numpy as np
# Create NumPy array
arr = np.array([5,6,4])

# Get the Standard Deviation of 1-dimensional array
arr1 = np.std(arr)
print(arr1)

# Output :
# 0.816496580927726

Following is the mathematical calculation of the Standard Deviation of the 1-D Array.


# Mathematical calculation of standard deviation
Standard Deviation is std = sqrt(mean(x)), where x = abs(arr - arr.mean())**2
Mean = (5 + 6 + 4) / 3
     = 5

Standard Deviation = sqrt( ((5-5)**2 + (6-5)**2 + (4-5)**2)/3 )
                   = sqrt( (0 + 1 + 1)/3 )
                   = sqrt(2/3)
                   = sqrt(0.6666...)
                   = 0.816496580927726

4. Get the Standard Deviation of 2-D Array

To find the standard deviation of a 2-D array, call this function without passing an axis. It uses all the values in the array and returns a single std value.


# Create a 2-D numpy array
arr = np.array([[2, 3],
                [2, 5]])
                
# Get the standard deviation with no axis
arr1 = np.std(arr)
print(arr1)

# Output:
# 1.224744871391589

Following is the mathematical calculation of the Standard Deviation of the 2-D Array.


# Mathematical calculation of standard deviation
Mean = (2 + 3 + 2 + 5) / 4
     = 3

Standard Deviation = sqrt( ((2-3)**2 + (3-3)**2 + (2-3)**2 + (5-3)**2)/4 )
                   = sqrt( (1+ 0+ 1+ 4)/4 )
                   = sqrt(6/4)
                   = sqrt(1.5)
                   = 1.224744871391589
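One way to confirm that calling std() without an axis treats the 2-D array as a flat list of values is to flatten it yourself and compare. This is a small sanity check, with flat as an illustrative variable name.

```python
import numpy as np

arr = np.array([[2, 3],
                [2, 5]])

flat = arr.ravel()     # 1-D view of all four values: [2 3 2 5]

# Both calls operate on the same four numbers
print(np.isclose(np.std(arr), np.std(flat)))  # True
```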

5. Get the Standard Deviation using axis Param

We can also compute the standard deviation of a NumPy array along a specified axis. Pass axis=0 to calculate it column-wise. For example,


# Get the standard deviation of the array column-wise
arr1 = np.std(arr, axis=0)
print(arr1) 

# Output:
# [0. 1.]

Below is the equivalent calculation by hand.


# Mathematical calculation of standard deviation
1st column values are 2, 2
mean = (2+2)/2 = 2

Standard Deviation = sqrt( ( (2-2)**2 + (2-2)**2 )/2 )
                   = sqrt( (0 + 0)/2 )
                   = sqrt(0/2)
                   = 0.

2nd column values are 3, 5
mean = (3+5)/2 = 4

Standard Deviation = sqrt( ( (3-4)**2 + (5-4)**2 )/2 )
                   = sqrt( (1 + 1)/2 )
                   = sqrt(2/2)
                   = 1.
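The column-by-column working can also be reproduced in code. The loop below is only a sketch of the arithmetic, not how NumPy evaluates axis=0 internally; the names column, mean, and std are illustrative.

```python
import numpy as np

arr = np.array([[2, 3],
                [2, 5]])

# Walk over the columns one at a time and compute each standard deviation
for j in range(arr.shape[1]):
    column = arr[:, j]
    mean = column.mean()
    std = np.sqrt(((column - mean) ** 2).mean())
    print(f"column {j}: values={column}, mean={mean}, std={std}")

# The same result in a single call
print(np.std(arr, axis=0))  # [0. 1.]
```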

Use np.std(arr, axis=1) to calculate the standard deviation of the array row-wise. For example,


# Standard deviation of array row-wise
arr1 = np.std(arr, axis=1)
print(arr1)

# Output:
# [0.5 1.5]

The mathematical calculation is the same as above; I will leave this for you to explore.
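If you want to check your working, the row-wise calculation can be sketched with vectorized operations. The intermediate names row_means and squared_dev are illustrative only.

```python
import numpy as np

arr = np.array([[2, 3],
                [2, 5]])

row_means = arr.mean(axis=1, keepdims=True)   # shape (2, 1): [[2.5], [3.5]]
squared_dev = (arr - row_means) ** 2          # each row's mean is broadcast across that row
row_std = np.sqrt(squared_dev.mean(axis=1))   # average per row, then square root

print(row_std)                                     # [0.5 1.5]
print(np.allclose(row_std, np.std(arr, axis=1)))   # True
```

Using keepdims=True keeps the means as a column, so the subtraction lines up row by row.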

6. Using dtype Param

As the examples above show, std() returns float64 by default for these integer arrays, but you can change this by passing the dtype parameter. The result has lower precision if you set dtype to float32 rather than float64.


# Get the standard deviation value with float32 data 
arr = np.array([5,6,4])
arr1 = np.std(arr, dtype = np.float32)
print(arr1)

# Output:
# 0.8164966
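To see the precision difference, you can compare the float64 and float32 results side by side. This is a small sketch; std64 and std32 are names chosen for illustration.

```python
import numpy as np

arr = np.array([5, 6, 4])

std64 = np.std(arr)                     # default for integer input: float64
std32 = np.std(arr, dtype=np.float32)   # computed and returned in float32

print(std64.dtype, std32.dtype)         # float64 float32
print(abs(std64 - float(std32)))        # tiny rounding difference between the two
```

For most everyday data the difference is negligible, but it can matter when many values are accumulated or when results are compared for exact equality.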

7. Conclusion

In this article, I have explained how to compute the standard deviation of single-dimensional and multi-dimensional NumPy arrays using the std() function, with detailed examples.

Happy Learning!!

Vijetha

With 5 years of experience in technical writing, I have had the privilege to work with a diverse range of technologies like Python, Pandas, NumPy and R. During this time, I have consistently demonstrated my ability to grasp intricate technical details and transform them into comprehensible materials.
