In Pandas, the min() method is used to find the minimum value over the specified axis. By default, it operates column-wise, meaning it returns the minimum value for each column. You can also use it to find the minimum value for each row by changing the axis.
In this article, I will explain the Pandas DataFrame min() method, covering its syntax, parameters, and usage to return the smallest value in the given data along the specified axis.
Key Points –
- The min() method returns the minimum value(s) from the DataFrame, either across a specified axis or for each column by default.
- The method allows for specifying the axis over which to find the minimum values (axis=0 for the minimum of each column and axis=1 for the minimum of each row).
- By default, the method ignores NaN values when computing the minimum. This behavior can be adjusted using the skipna parameter.
- The return type of the min() method is a Series object containing the minimum values along the specified axis.
- In pandas versions before 2.0, the level parameter allows the minimum to be computed on a particular level of a MultiIndex (hierarchical index), returning the minimum values for each group. The parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
Pandas DataFrame min() Introduction
Let’s look at the syntax of the min() method.
# Syntax of DataFrame min()
DataFrame.min(axis=0, skipna=True, level=None, numeric_only=None, **kwargs)
Parameters of the Pandas DataFrame min()
Following are the parameters of the DataFrame min() method.
- axis – {0 or ‘index’, 1 or ‘columns’}, default 0.
  - 0 or index: Compute the minimum for each column.
  - 1 or columns: Compute the minimum for each row.
- skipna – bool, default True. Exclude NA/null values when computing the result.
- level – int or level name, default None. If the axis is a MultiIndex (hierarchical), compute the minimum along a particular level. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0.
- numeric_only – bool, default None. Include only float, int, and boolean data. If None, it will attempt to use everything and then fall back to only numeric data. The None default was deprecated in pandas 1.5, and from pandas 2.0 the default is False.
- **kwargs – Additional keyword arguments accepted for compatibility; not used in the current context.
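Because the level parameter is not available in pandas 2.0 and later, a per-level minimum is usually computed with groupby() instead. The following is a minimal sketch of that approach; the index names group and item and the data values are hypothetical and chosen only for illustration. Note that this returns a DataFrame with one row per group.

```python
import pandas as pd

# Hypothetical data with a two-level index (group, item); names are illustrative only
index = pd.MultiIndex.from_tuples(
    [("x", 1), ("x", 2), ("y", 1), ("y", 2)],
    names=["group", "item"],
)
df_multi = pd.DataFrame({"A": [4, 38, 8, 24], "B": [52, 5, 49, 18]}, index=index)

# Group by the outer index level, then take the minimum of each column per group
group_min = df_multi.groupby(level="group").min()
print(group_min)
```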
Return Value
It returns a Series containing the minimum of the values over the requested axis.
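To make the return value concrete, here is a small sketch that inspects what min() gives back for each axis. The variable names (sample, col_min, row_min, single) are only illustrative. The result is indexed by column labels for the default axis and by row labels for axis=1, while calling min() on a single column returns one value.

```python
import pandas as pd

sample = pd.DataFrame({"A": [4, 38, 8], "B": [52, 5, 49]})

col_min = sample.min()        # one entry per column
row_min = sample.min(axis=1)  # one entry per row
single = sample["A"].min()    # Series.min() gives a single value

print(type(col_min).__name__, list(col_min.index))
print(type(row_min).__name__, list(row_min.index))
print(single)
```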
Usage of Pandas DataFrame min() Method
The Pandas DataFrame min() method is used to compute the minimum value across a specified axis (either rows or columns) of a DataFrame. This method is particularly useful for quickly finding the smallest values in each column or row, especially when working with large datasets.
Now, let’s create a Pandas DataFrame using data from a dictionary.
import pandas as pd
import numpy as np
# Creating a sample DataFrame
data = {
'A': [4, 38, 8, 24, 15],
'B': [52, 5, 49, 18, 31],
'C': [9, 13, 84, 53, 22]
}
df = pd.DataFrame(data)
print("Original DataFrame:\n",df)
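This yields the following output.
# Output:
# Original DataFrame:
#      A   B   C
# 0   4  52   9
# 1  38   5  13
# 2   8  49  84
# 3  24  18  53
# 4  15  31  22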
Finding the Minimum Value in Each Column
To find the minimum value in each column of a DataFrame, you can use the min() method without specifying an axis. By default, this method computes the minimum value of each column.
# Finding the minimum value in each column
df2 = df.min()
print("Minimum value in each column:\n", df2)
In this example, min() returns the minimum value found in each column of the DataFrame: 4 for A, 5 for B, and 9 for C.
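The same call can be narrowed or widened. As a minimal sketch using the sample data from this article, selecting columns before calling min() restricts the result to those columns, and taking the minimum of the column minimums gives the smallest value in the whole DataFrame. The names subset_min and overall_min are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [4, 38, 8, 24, 15],
    'B': [52, 5, 49, 18, 31],
    'C': [9, 13, 84, 53, 22]
})

# Minimum of selected columns only
subset_min = df[['A', 'C']].min()

# Smallest value in the whole DataFrame: minimum of the column minimums
overall_min = df.min().min()

print(subset_min)
print("Overall minimum:", overall_min)
```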
Finding the Minimum Value in Each Row
To find the minimum value in each row of a DataFrame, you can use the min() method with the axis=1 argument. This will compute the minimum value across all columns for each row.
# Finding the minimum value in each row
df2 = df.min(axis=1)
print("Minimum value in each row:\n", df2)
# Output:
# Minimum value in each row:
# 0 4
# 1 5
# 2 8
# 3 18
# 4 15
# dtype: int64
This output shows the minimum value for each row in the DataFrame; each value is the smallest number found across all columns for that row.
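As a short sketch building on the row-wise example, the axis can also be given by name (axis='columns' is the alias for axis=1), and the resulting Series can be attached to the data with assign(). The column name row_min is hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [4, 38, 8, 24, 15],
    'B': [52, 5, 49, 18, 31],
    'C': [9, 13, 84, 53, 22]
})

# axis=1 and axis='columns' request the same row-wise computation
by_number = df.min(axis=1)
by_name = df.min(axis='columns')
print(by_number.equals(by_name))

# Keep the row minimum next to the original data (row_min is an illustrative name)
df_with_min = df.assign(row_min=by_number)
print(df_with_min)
```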
Handling Missing Values (NaN)
When working with DataFrames that contain missing values (NaN), the min() method by default ignores these missing values when calculating the minimum. However, you can control this behavior using the skipna parameter.
Using skipna=True (which is the default behavior), the min() method will ignore any NaN values in the DataFrame when calculating the minimum value.
import pandas as pd
import numpy as np
# Creating a DataFrame with missing values
data = {
'A': [4, np.nan, 8, 24, 15],
'B': [52, 5, np.nan, 18, 31],
'C': [9, 13, 84, np.nan, 22]
}
df = pd.DataFrame(data)
# Finding the minimum value in each column,
# Ignoring NaN values by default
df2 = df.min(skipna=True)
print("Minimum value in each column (ignoring NaN):\n", df2)
# Output:
# Minimum value in each column (ignoring NaN):
# A 4.0
# B 5.0
# C 9.0
# dtype: float64
Similarly, if you want to calculate the minimum value in each column or row of a DataFrame and include NaN values in the computation (meaning that if any NaN is present, the result will be NaN), you can set the skipna parameter to False.
# Finding the minimum value in each column,
# Including NaN values
df2 = df.min(skipna=False)
print("Minimum value in each column (including NaN):\n", df2)
# Output:
# Minimum value in each column (including NaN):
# A NaN
# B NaN
# C NaN
# dtype: float64
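The skipna parameter works the same way along rows. The sketch below reuses the DataFrame with missing values and combines skipna with axis=1; the variable names are only illustrative. Rows 1, 2, and 3 each contain one NaN, so they become NaN when skipna=False, while rows 0 and 4 are unaffected.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [4, np.nan, 8, 24, 15],
    'B': [52, 5, np.nan, 18, 31],
    'C': [9, 13, 84, np.nan, 22]
})

row_min_skip = df.min(axis=1)                # NaN ignored within each row
row_min_keep = df.min(axis=1, skipna=False)  # NaN propagates to the row result

print(row_min_skip.tolist())
print(row_min_keep.tolist())
```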
Finding Minimum Values in Numeric Columns Only
Finally, to find the minimum values in numeric columns only, you can use the min() method with the numeric_only=True argument. This ensures that only numeric data types (integers, floats, and booleans) are considered when computing the minimum values, excluding any non-numeric columns.
import pandas as pd
import numpy as np
# Creating a DataFrame with mixed data types
data_mixed = {
'A': [4, 38, 8, 24, 15],
'B': ['Spark', 'C++', 'Pandas', 'Java', 'Python'],
'C': [9.5, 13.1, 84.2, 53.4, 22.8]
}
df = pd.DataFrame(data_mixed)
print("DataFrame with mixed data types:\n", df)
# Finding the minimum values considering only numeric data types
df2 = df.min(numeric_only=True)
print("Minimum value in numeric columns only:\n", df2)
# Output:
# DataFrame with mixed data types:
# A B C
# 0 4 Spark 9.5
# 1 38 C++ 13.1
# 2 8 Pandas 84.2
# 3 24 Java 53.4
# 4 15 Python 22.8
# Minimum value in numeric columns only:
# A 4.0
# C 9.5
# dtype: float64
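For comparison, here is a sketch of what happens when numeric_only is left out on the same mixed-type data. This assumes the string column holds only strings, so its values can be compared with each other: the column is then kept, and its minimum is the lexicographically smallest string. Exact result dtypes may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [4, 38, 8, 24, 15],
    'B': ['Spark', 'C++', 'Pandas', 'Java', 'Python'],
    'C': [9.5, 13.1, 84.2, 53.4, 22.8]
})

all_min = df.min()                   # string column B is included
num_min = df.min(numeric_only=True)  # string column B is dropped

print(all_min)
print(list(num_min.index))
```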
FAQ on Pandas DataFrame min() Method
The min() method returns the minimum value along a specified axis of a DataFrame. By default, it computes the minimum value for each column (i.e., the minimum of all rows for each column).
To find the minimum value in each column, use the min() method without specifying the axis parameter (default behavior).
By default, min() ignores NaN values when computing the minimum. If you want to include NaN values in the calculation, which will result in NaN if any are present, use skipna=False.
The skipna parameter controls whether NaN values are ignored (skipna=True, default) or included (skipna=False).
Setting numeric_only=True ensures that only numeric columns are considered for the minimum computation. Non-numeric columns are excluded from the results.
Conclusion
In conclusion, the Pandas DataFrame min() method returns the smallest value in each column or each row of a DataFrame. The skipna parameter controls how NaN values are handled, and numeric_only restricts the computation to numeric columns.
Happy Learning!!