In NumPy, you can compute the standard deviation of a set of values using the numpy.std() function. The standard deviation measures the amount of variation or dispersion in a set of values. By default, it is calculated over the flattened array, but you can change this by specifying the axis param.
To calculate the standard deviation, first compute the mean of the NumPy array using arr.sum()/N, where N=len(arr). The standard deviation is then std=sqrt(mean(x)), where x=abs(arr-arr.mean())**2, that is, the square root of the mean of the squared deviations from the mean.
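The formula above can be checked by hand. The following is a minimal sketch, assuming a small integer array chosen only for illustration (the names N, mean, x, and manual_std are not part of NumPy), that computes the standard deviation step by step and compares it with np.std().

```python
import numpy as np

# Sample data (chosen only for illustration)
arr = np.array([5, 6, 4])

# Step 1: mean = sum of the values divided by the number of values
N = len(arr)
mean = arr.sum() / N

# Step 2: squared absolute deviations from the mean
x = np.abs(arr - mean) ** 2

# Step 3: standard deviation = square root of the mean of x
manual_std = np.sqrt(x.mean())

print(manual_std)
print(np.std(arr))
```

Both print statements should show the same value, since np.std() uses the same population formula by default.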
1. Quick Examples of Standard Deviation Function
If you are in a hurry, below are some quick examples of calculating the standard deviation of a NumPy array.
# Quick examples of standard deviation
import numpy as np
# Example 1: Compute the standard deviation
# Using 1-dimensional array
arr = np.array([5,6,4])
arr1 = np.std(arr)
# Example 2: Get the standard deviation
# Of a 2D array with no axis
arr = np.array([[2, 3],[2, 5]])
arr1 = np.std(arr)
# Example 3: Get the standard deviation
# Of an array column-wise
arr = np.array([[2, 3],[2, 5]])
arr1 = np.std(arr, axis=0)
# Example 4: Standard deviation
# Of array row-wise
arr = np.array([[2, 3],[2, 5]])
arr1 = np.std(arr, axis=1)
# Example 5: Get the standard deviation value
# With float32 data
arr = np.array([5, 6, 4])
arr1 = np.std(arr, dtype = np.float32)
# Example 6: Calculate the standard deviation
# With a specified data type (e.g., np.float64)
arr2 = np.std(arr, dtype=np.float64)
2. Syntax of std()
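Following is the syntax of std(). Only the most commonly used parameters are shown; the function also accepts others, such as ddof, which is covered in the Frequently Asked Questions below.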
# Syntax of numpy.std()
numpy.std(arr, axis=None, dtype=None, out=None)
2.1 Parameters of std()
Following are the parameters of std().
arr – The input array for which you want to calculate the standard deviation.
axis – None, int, or tuple of ints. The axis or axes along which the standard deviation is computed. The default (None) computes the standard deviation of the flattened array. For a 2D array, axis=0 returns one value per column and axis=1 returns one value per row.
dtype – Optional. The data type used in computing the standard deviation. For integer input arrays the default is float64; for floating-point input arrays it is the same as the array's data type.
out – Optional. An alternative output array in which to place the result. It must have the same shape as the expected output.
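The parameters above can be tried together on a small array. This is a minimal sketch, assuming a 2x2 integer array; the variable names (std_all, std_cols, std_rows, result) are illustrative only.

```python
import numpy as np

arr = np.array([[2, 3], [2, 5]])

# axis=None (default): one value for the flattened array
std_all = np.std(arr)

# axis=0: one value per column; axis=1: one value per row
std_cols = np.std(arr, axis=0)
std_rows = np.std(arr, axis=1)

# out: write the result into a pre-allocated array of matching shape
result = np.empty(2, dtype=np.float64)
np.std(arr, axis=0, out=result)

print(std_all)
print(std_cols)
print(std_rows)
print(result)
```

Passing out is mainly useful when you want to reuse an existing buffer instead of allocating a new array on every call.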
2.2 Return Value of std()
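It returns the standard deviation of the array elements: a single value when no axis is given, or an array of values when axis is specified. For integer input the result is float64 by default; for floating-point input it has the same data type as the input. You can change this by specifying the dtype param.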
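To see the return type in practice, here is a minimal sketch, assuming one integer array and one float32 array built from the same values (the variable names are illustrative only).

```python
import numpy as np

int_arr = np.array([5, 6, 4])                    # integer input
f32_arr = np.array([5, 6, 4], dtype=np.float32)  # float32 input

int_std = np.std(int_arr)
f32_std = np.std(f32_arr)

# Integer input is computed and returned as float64 by default,
# while float32 input keeps its own data type
print(int_std.dtype)
print(f32_std.dtype)

# With an axis, the result is an array rather than a single value
col_std = np.std(np.array([[2, 3], [2, 5]]), axis=0)
print(type(col_std), col_std.shape)
```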
3. Usage of NumPy std()
The numpy.std()
function is indeed a statistical function in the NumPy library used for computing the standard deviation of arrays. It supports both single-dimensional and multi-dimensional arrays, and you can specify the axis along which the standard deviation is calculated, as well as the data type of the result.
You can use numpy.std() to calculate the standard deviation of a 1-dimensional NumPy array. For instance, first import the NumPy library as np. Then create a 1-dimensional NumPy array called arr with elements [5, 6, 4], and use numpy.std() to calculate its standard deviation.
# Import NumPy Module
import numpy as np
# Create NumPy array
arr = np.array([5,6,4])
print("Original array:",arr)
# Compute the standard deviation
# Using 1-dimensional array
arr1 = np.std(arr)
print("Standard Deviation:",arr1)
Yields below output.
# Output:
# Original array: [5 6 4]
# Standard Deviation: 0.816496580927726
Following is the mathematical calculation of the Standard Deviation of the 1-D Array.
# Mathematical calculation of standard deviation
Standard Deviation is std = sqrt(mean(x)), where x = abs(arr - arr.mean())**2
Mean = (5 + 6 + 4) / 3
= 5
Standard Deviation = sqrt( ((5-5)**2 + (6-5)**2 + (4-5)**2)/3 )
= sqrt( (0+ 1+ 1)/3 )
= sqrt(2/3)
= sqrt(0.6666)
= 0.816496580927726
4. Get the Standard Deviation of 2D Array
You can also use numpy.std() to calculate the standard deviation of a 2D NumPy array without specifying the axis. For instance, import the NumPy library as np and create a 2D NumPy array called arr with elements [[2, 3], [2, 5]]. Calling numpy.std() with no axis specified uses all the values in the array and returns a single std value.
# Create a 2D numpy array
arr = np.array([[2, 3],[2, 5]])
print("Original array:\n",arr)
# Get the standard deviation with no axis
arr1 = np.std(arr)
print("Standard Deviation of the entire array:\n",arr1)
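Yields below output.
# Output:
# Original array:
#  [[2 3]
#  [2 5]]
# Standard Deviation of the entire array:
#  1.224744871391589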
Following is the mathematical calculation of the Standard Deviation of the 2-D Array.
# Mathematical calculation of standard deviation
Mean = (2 + 3 + 2 + 5) / 4
= 3
Standard Deviation = sqrt( ((2-3)**2 + (3-3)**2 + (2-3)**2 + (5-3)**2)/4 )
= sqrt( (1+ 0+ 1+ 4)/4 )
= sqrt(6/4)
= sqrt(1.5)
= 1.224744871391589
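One way to confirm that the default behavior works on the flattened array is to flatten it yourself and compare. This is a minimal sketch; std_default and std_flat are illustrative names.

```python
import numpy as np

arr = np.array([[2, 3], [2, 5]])

# Default: no axis, so all four values are used together
std_default = np.std(arr)

# Explicitly flatten to 1D first, then compute the standard deviation
std_flat = np.std(arr.ravel())

print(std_default)
print(std_flat)
```

The two values should match, because leaving out axis is equivalent to treating the 2D array as a single list of values.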
5. Get the Standard Deviation using axis Param
When you pass axis=0 to numpy.std(), it calculates the standard deviation down the rows, i.e., column-wise. This means that it provides the standard deviation for each column independently.
The program below passes axis=0 and prints the result, giving the standard deviation for each column. Remember, specifying axis=0 in NumPy functions means performing the operation along the vertical axis, which corresponds to operating on each column of a 2D array.
# Create a 2D numpy array
arr = np.array([[2, 3],[2, 5]])
print("Original array:\n",arr)
# Get the standard deviation of the array column-wise
arr1 = np.std(arr, axis=0)
print("Standard Deviation of each column:\n",arr1)
# Output:
# Standard Deviation of each column:
# [0. 1.]
Below is how it calculates internally.
# Mathematical calculation of standard deviation
1st column values are 2, 2
mean = (2+2)/2 = 2
Standard Deviation = sqrt( ( (2-2)**2 + (2-2)**2 )/2 )
= sqrt( (0 + 0)/2 )
= sqrt(0/2)
= 0.
2nd column values are 3, 5
mean = (3+5)/2 = 4
Standard Deviation = sqrt( ( (3-4)**2 + (5-4)**2 )/2 )
= sqrt( (1 + 1)/2 )
= sqrt(2/2)
= 1.
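The column-wise calculation above can be reproduced with a few array operations. This is a minimal sketch of the same arithmetic, not a description of NumPy's internal implementation; col_means, squared_dev, and manual are illustrative names.

```python
import numpy as np

arr = np.array([[2, 3], [2, 5]])

# Mean of each column: [2., 4.]
col_means = arr.mean(axis=0)

# Squared deviation of every element from its column mean
# (col_means is broadcast across the rows)
squared_dev = (arr - col_means) ** 2

# Average the squared deviations down each column, then take the root
manual = np.sqrt(squared_dev.mean(axis=0))

print(manual)
print(np.std(arr, axis=0))
```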
If you want to calculate the standard deviation row-wise (along axis 1) for a 2D NumPy array, you can use np.std(arr,axis=1). This works across the columns of each row, giving the standard deviation for each row.
# Standard deviation of array row-wise
arr1 = np.std(arr, axis=1)
print("Standard Deviation of each row:\n",arr1)
# Output:
# Standard Deviation of each row:
# [0.5 1.5]
The mathematical calculation is the same as above; I will leave this for you to explore. The result shows the standard deviation calculated for each row. The axis=1 parameter specifies that the operation is performed along the horizontal axis, which corresponds to operating on each row of a 2D array.
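As a starting point for exploring the row-wise calculation, here is a minimal sketch of the same arithmetic done by hand. It assumes the same 2x2 array; row_means, squared_dev, and manual are illustrative names.

```python
import numpy as np

arr = np.array([[2, 3], [2, 5]])

# keepdims=True keeps the means as a (2, 1) column,
# so each row is compared against its own mean: [[2.5], [3.5]]
row_means = arr.mean(axis=1, keepdims=True)

# Squared deviation of every element from its row mean
squared_dev = (arr - row_means) ** 2

# Average the squared deviations across each row, then take the root
manual = np.sqrt(squared_dev.mean(axis=1))

print(manual)
print(np.std(arr, axis=1))
```

For the first row, the deviations are -0.5 and 0.5, so the result is sqrt((0.25 + 0.25)/2) = 0.5; for the second row, the deviations are -1.5 and 1.5, giving sqrt((2.25 + 2.25)/2) = 1.5.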
6. Using dtype Param
If you want to specify the data type used when computing the standard deviation, you can use the dtype parameter. For integer input, numpy.std() computes and returns float64 by default. You can change this by passing the dtype parameter to the function; the result has lower precision if you set dtype to float32 rather than float64.
# Create a 1D array
arr = np.array([5, 6, 4])
print("Original array:\n",arr)
# Get the standard deviation value with float32 data
arr1 = np.std(arr, dtype = np.float32)
print("Standard Deviation with Custom Data Type:",arr1)
# Output:
# Standard Deviation with Custom Data Type: 0.8164966
Similarly, you can calculate the standard deviation of the 1D array arr with the dtype parameter set to np.float64. Because arr holds integers, this matches the default behavior. You can replace np.float64 with the desired data type for your specific use case.
The dtype parameter is particularly useful when you want the computation and its result to use a specific data type, different from the default.
# Calculate the standard deviation
# With a specified data type (e.g., np.float64)
arr2 = np.std(arr, dtype=np.float64)
print("Standard Deviation with Custom Data Type:", arr2 )
# Output:
# Standard Deviation with Custom Data Type: 0.816496580927726
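To compare the two data types side by side, here is a minimal sketch that inspects the data type of each result and the gap between them. The names std32 and std64 are illustrative only.

```python
import numpy as np

arr = np.array([5, 6, 4])

std32 = np.std(arr, dtype=np.float32)
std64 = np.std(arr, dtype=np.float64)

# The result carries the data type requested through dtype
print(std32.dtype)
print(std64.dtype)

# The two results agree only to roughly single precision
print(abs(float(std32) - float(std64)))
```

The difference is tiny for this small array, but float32 accumulates more rounding error on large arrays, which is worth keeping in mind before lowering the precision.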
Frequently Asked Questions
To calculate the standard deviation of a 1D array in NumPy, you can use the numpy.std() function. For example, calling np.std(data) on a 1D array data returns a single value, which you can store in a variable such as std_dev. The std_dev variable then contains the standard deviation of the values in the array.
To calculate the standard deviation along a specific axis for a 2D array in NumPy, you can use the numpy.std() function with the axis parameter: axis=0 gives one value per column, and axis=1 gives one value per row.
The default behavior of numpy.std() regarding the axis parameter is to calculate the standard deviation for the flattened array. In other words, if you don't specify the axis parameter, the function treats the input array as if it were flattened into a 1D array and computes the standard deviation for the entire flattened array.
You can specify the data type for the result of numpy.std() using the dtype parameter. The dtype parameter sets the data type used in the computation, which is also the data type of the output.
To calculate the standard deviation for a sample rather than the entire population, you need to use the ddof (degrees of freedom) parameter in the numpy.std() function. The default value of ddof is 0, which corresponds to calculating the standard deviation for the entire population. To calculate the sample standard deviation, set ddof to 1.
The ddof parameter in the numpy.std() function stands for "delta degrees of freedom." It adjusts the divisor in the calculation of the standard deviation: the divisor is N - ddof, where N is the number of elements. The purpose of the ddof parameter is to provide flexibility in calculating the standard deviation for different scenarios, particularly when dealing with samples rather than the entire population.
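The effect of ddof is easiest to see by comparing both settings on the same data. This is a minimal sketch, assuming a small array named data chosen only for illustration.

```python
import numpy as np

data = np.array([5, 6, 4])

# Population standard deviation: divides by N (ddof=0 is the default)
population_std = np.std(data)

# Sample standard deviation: divides by N - 1
sample_std = np.std(data, ddof=1)

print(population_std)
print(sample_std)
```

Here the squared deviations sum to 2, so the population value is sqrt(2/3) and the sample value is sqrt(2/2) = 1.0. The sample value is always the larger of the two for the same data.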
Conclusion
In this article, I have explained how to calculate the standard deviation of single-dimensional and multi-dimensional NumPy arrays using the std() function, with detailed examples.
Happy Learning!!