Pandas DataFrame min() Method

In Pandas, the min() method is used to find the minimum value over the specified axis. By default, it operates column-wise, meaning it returns the minimum value for each column. You can also use it to find the minimum value across rows by changing the axis.

Pandas DataFrame min() Introduction

Following is the syntax of the min() method.


# Syntax of DataFrame min() as documented for pandas 1.x
# (pandas 2.0 and later drop level and default numeric_only to False)
DataFrame.min(axis=0, skipna=True, level=None, numeric_only=None, **kwargs)

Parameters of the Pandas DataFrame min()

Following are the parameters of the DataFrame min() method:

axis – {0 or ‘index’, 1 or ‘columns’}, default 0
- 0 or index: Compute the minimum for each column.
- 1 or columns: Compute the minimum for each row.
skipna – bool, default True.
- Exclude NA/null values when computing the result.
level – int or level name, default None.
- If the axis is a MultiIndex (hierarchical), compute the minimum along a particular level, collapsing into a Series. This parameter was deprecated in version 1.3.0 and removed in pandas 2.0; group by the index level instead.
numeric_only – bool, default None (False in pandas 2.0 and later).
- Include only float, int, and boolean data. If None, it will attempt to use everything and then only numeric data. Passing None was deprecated in version 1.5.0.
**kwargs – Additional arguments for compatibility; not used in the current context.
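
Since the level argument is not accepted by min() in recent pandas releases, the per-level minimum described above can be sketched with groupby() instead. The MultiIndex, its level names ("group" and "item"), and the variable names below are made up for illustration.

```python
import pandas as pd

# Hypothetical two-level index: "group" is the outer level, "item" the inner one
index = pd.MultiIndex.from_tuples(
    [("x", 1), ("x", 2), ("y", 1), ("y", 2)],
    names=["group", "item"],
)
df_multi = pd.DataFrame({"A": [4, 38, 8, 24], "B": [52, 5, 49, 18]}, index=index)

# Minimum of each column within every value of the "group" level
per_group_min = df_multi.groupby(level="group").min()
print(per_group_min)
```

Each row of the result corresponds to one value of the grouped level, and each column holds the minimum found within that group.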

Return Value

It returns a Series holding the minimum of the values over the requested axis, with one entry per column (axis=0) or per row (axis=1).
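
As a small sketch of the return type, the snippet below checks that min() on a DataFrame gives back a Series and then reduces that Series once more to get a single overall minimum. The names df_demo, col_min, and overall_min are illustrative only.

```python
import pandas as pd

df_demo = pd.DataFrame({"A": [4, 38, 8], "B": [52, 5, 49]})

# min() on a DataFrame returns a Series indexed by the column labels
col_min = df_demo.min()
print(type(col_min).__name__)

# Calling min() on that Series yields one scalar for the whole DataFrame
overall_min = col_min.min()
print(overall_min)
```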

Usage of Pandas DataFrame min() Method

The Pandas DataFrame min() method is used to compute the minimum value across a specified axis (either rows or columns) of a DataFrame. This method is particularly useful for quickly finding the smallest values in each column or row, especially when working with large datasets.

Now, let’s create a Pandas DataFrame using data from a dictionary.


import pandas as pd
import numpy as np

# Creating a sample DataFrame
data = {
    'A': [4, 38, 8, 24, 15],
    'B': [52, 5, 49, 18, 31],
    'C': [9, 13, 84, 53, 22]
}

df = pd.DataFrame(data)
print("Original DataFrame:\n",df)

This creates and prints a DataFrame with three integer columns (A, B, and C) and five rows, indexed 0 through 4.

Finding the Minimum Value in Each Column

To find the minimum value in each column of a DataFrame, you can use the min() method without specifying an axis. By default (axis=0), this method computes the minimum value of each column.


# Finding the minimum value in each column
df2 = df.min()
print("Minimum value in each column:\n", df2)

In this example, min() returns a Series with the minimum value of each column: 4 for A, 5 for B, and 9 for C.
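
If you also need to know which row holds each column's minimum, one option is the related idxmin() method, shown here as a brief sketch alongside min(). The variable names df_demo, col_min, and col_min_row are illustrative.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'A': [4, 38, 8, 24, 15],
    'B': [52, 5, 49, 18, 31],
    'C': [9, 13, 84, 53, 22]
})

# Minimum value of each column
col_min = df_demo.min()

# Index label of the row where each column reaches its minimum
col_min_row = df_demo.idxmin()

print(col_min)
print(col_min_row)
```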

Finding the Minimum Value in Each Row

To find the minimum value in each row of a DataFrame, you can use the min() method with the axis=1 parameter. This will compute the minimum value across all columns for each row.


# Finding the minimum value in each row
df2 = df.min(axis=1)
print("Minimum value in each row:\n", df2)

# Output:
# Minimum value in each row:
# 0     4
# 1     5
# 2     8
# 3    18
# 4    15
# dtype: int64

This output shows the minimum value for each row in the DataFrame, with each value representing the smallest number found across all columns for that particular row.
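
A common follow-up is to keep the row-wise minimum next to the original data. The sketch below does this with assign(); the column name row_min is an arbitrary choice for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'A': [4, 38, 8, 24, 15],
    'B': [52, 5, 49, 18, 31],
    'C': [9, 13, 84, 53, 22]
})

# Compute the row-wise minimum and attach it as a new column.
# assign() returns a new DataFrame and leaves df_demo unchanged.
df_with_min = df_demo.assign(row_min=df_demo.min(axis=1))
print(df_with_min)
```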

Handling Missing Values (NaN)

When working with DataFrames that contain missing values (NaN), the min() method by default ignores these missing values when calculating the minimum. However, you can control this behavior using the skipna parameter.

Using skipna=True (which is the default behavior), the min() method will ignore any NaN values in the DataFrame when calculating the minimum value.


import pandas as pd
import numpy as np

# Creating a DataFrame with missing values
data = {
    'A': [4, np.nan, 8, 24, 15],
    'B': [52, 5, np.nan, 18, 31],
    'C': [9, 13, 84, np.nan, 22]
}

df = pd.DataFrame(data)

# Finding the minimum value in each column,
# ignoring NaN values (the default behavior)
df2 = df.min(skipna=True)
print("Minimum value in each column (ignoring NaN):\n", df2)

# Output:
# Minimum value in each column (ignoring NaN):
# A    4.0
# B    5.0
# C    9.0
# dtype: float64

Similarly, if you want to calculate the minimum value in each column or row of a DataFrame and include NaN values in the computation (meaning that if any NaN is present, the result will be NaN), you can set the skipna parameter to False.


# Finding the minimum value in each column, 
# Including NaN values
df2 = df.min(skipna=False)
print("Minimum value in each column (including NaN):\n", df2)

# Output:
# Minimum value in each column (including NaN):
# A   NaN
# B   NaN
# C   NaN
# dtype: float64
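
The same skipna behavior applies row-wise. As a sketch, the snippet below combines axis=1 with both skipna settings on a DataFrame that has one NaN in each of rows 1, 2, and 3; the variable names are illustrative.

```python
import pandas as pd
import numpy as np

df_nan = pd.DataFrame({
    'A': [4, np.nan, 8, 24, 15],
    'B': [52, 5, np.nan, 18, 31],
    'C': [9, 13, 84, np.nan, 22]
})

# Row-wise minimum, ignoring NaN values
row_min_skip = df_nan.min(axis=1, skipna=True)

# Row-wise minimum, where any NaN in a row makes that row's result NaN
row_min_strict = df_nan.min(axis=1, skipna=False)

print(row_min_skip)
print(row_min_strict)
```

Rows 0 and 4 contain no missing values, so they give the same result under both settings.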

Finding Minimum Values in Numeric Columns Only

Finally, to find the minimum values in numeric columns only, you can use the min() method in conjunction with the numeric_only=True parameter. This ensures that only numeric data types (integers, floats, and booleans) are considered when computing the minimum values, excluding any non-numeric columns.


import pandas as pd
import numpy as np

# Creating a DataFrame with mixed data types
data_mixed = {
    'A': [4, 38, 8, 24, 15],
    'B': ['Spark', 'C++', 'Pandas', 'Java', 'Python'],
    'C': [9.5, 13.1, 84.2, 53.4, 22.8]
}

df = pd.DataFrame(data_mixed)
print("DataFrame with mixed data types:\n", df)

# Finding the minimum values considering only numeric data types
df2 = df.min(numeric_only=True)
print("Minimum value in numeric columns only:\n", df2)

# Output:
# DataFrame with mixed data types:
#     A       B     C
# 0   4   Spark   9.5
# 1  38     C++  13.1
# 2   8  Pandas  84.2
# 3  24    Java  53.4
# 4  15  Python  22.8
# Minimum value in numeric columns only:
# A    4.0
# C    9.5
# dtype: float64
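
Another way to restrict the computation to numeric columns is to filter them first with select_dtypes() and then call min(). This is a sketch of an equivalent approach for this particular DataFrame, not a replacement for numeric_only; the variable names are illustrative.

```python
import pandas as pd

df_mixed = pd.DataFrame({
    'A': [4, 38, 8, 24, 15],
    'B': ['Spark', 'C++', 'Pandas', 'Java', 'Python'],
    'C': [9.5, 13.1, 84.2, 53.4, 22.8]
})

# Keep only the numeric columns (A and C), then take the column-wise minimum
numeric_min = df_mixed.select_dtypes(include="number").min()
print(numeric_min)

# For this DataFrame the result matches min(numeric_only=True)
same_result = numeric_min.equals(df_mixed.min(numeric_only=True))
print(same_result)
```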

FAQ on Pandas DataFrame min() Method

What does the min() method do?

The min() method returns the minimum value along a specified axis of a DataFrame. By default, it computes the minimum value for each column (i.e., the minimum of all rows for each column).

How do I find the minimum value in each column?

To find the minimum value in each column, use the min() method without specifying the axis parameter (default behavior).

What happens if there are missing values (NaN) in the DataFrame?

By default, min() ignores NaN values when computing the minimum. If you want to include NaN values in the calculation, which will result in NaN if any are present, use skipna=False.

How can I handle NaN values differently?

The skipna parameter controls whether NaN values are ignored (skipna=True, default) or included (skipna=False).

How does the numeric_only parameter affect the result?

The numeric_only parameter ensures that only numeric columns are considered for the minimum computation. Non-numeric columns are excluded from the results.

Conclusion

In conclusion, the Pandas DataFrame min() method is a powerful and flexible tool for identifying the minimum values in your dataset. It plays a crucial role in data analysis by providing insights into the smallest values within specified axes of a DataFrame.

Happy Learning!!

Reference

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.min.html