  • Post last modified:August 28, 2024
Pandas DataFrame min() Method

In Pandas, the min() method is used to find the minimum value over the specified axis. By default, it operates column-wise, meaning it returns the minimum value for each column. You can also use it to find the minimum value across rows by changing the axis.


In this article, I will explain the Pandas DataFrame min() method, covering its syntax, parameters, and usage to return the smallest value in the given data along the specified axis.

Key Points –

  • The min() method returns the minimum value(s) from the DataFrame, either across a specified axis or for each column by default.
  • The axis parameter selects the direction of the computation: axis=0 returns the minimum of each column, and axis=1 returns the minimum of each row.
  • By default, the method ignores NaN values when computing the minimum. This behavior can be adjusted using the skipna parameter.
  • The return type of the min() method is a Series object containing the minimum values along the specified axis.
  • The level parameter, which computed the minimum on a particular level of a MultiIndex (hierarchical index), was deprecated in pandas 1.3 and removed in pandas 2.0. In current versions, df.groupby(level=...).min() returns the minimum values for each group instead.

Pandas DataFrame min() Introduction

Let’s look at the syntax of the min() method.


# Syntax of DataFrame min() (pandas 2.x)
DataFrame.min(axis=0, skipna=True, numeric_only=False, **kwargs)

Parameters of the Pandas DataFrame min()

Following are the parameters of the DataFrame min() method.

  • axis – {0 or ‘index’, 1 or ‘columns’}, default 0
    • 0 or index: Compute the minimum for each column.
    • 1 or columns: Compute the minimum for each row.
  • skipna – bool, default True.
    • Exclude NA/null values when computing the result.
  • level – int or level name, default None (pandas versions before 2.0 only).
    • If the axis is a MultiIndex (hierarchical), compute the minimum along a particular level, collapsing into a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; use groupby(level=...).min() instead.
  • numeric_only – bool, default False (the default was None before pandas 2.0).
    • Include only float, int, and boolean data. Before pandas 2.0, the default of None attempted to use every column and then fell back to numeric data only.
  • **kwargs – Additional keyword arguments accepted for compatibility; they have no effect on the result.
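Because the level argument is no longer available in pandas 2.0 and later, grouping on an index level is the usual substitute. The following is a minimal sketch of that approach; the index names (team, player), the column names, and the values are made up for illustration and do not come from the examples in this article.

```python
import pandas as pd

# Hypothetical two-level index (team, player); all values are illustrative
index = pd.MultiIndex.from_tuples(
    [("x", "p1"), ("x", "p2"), ("y", "p1"), ("y", "p2")],
    names=["team", "player"],
)
df_multi = pd.DataFrame(
    {"score": [7, 3, 9, 5], "fouls": [2, 4, 1, 0]},
    index=index,
)

# Minimum of every column within each value of the "team" index level
team_min = df_multi.groupby(level="team").min()
print(team_min)
```

The result has one row per team and one column per original column, which matches what the old level argument produced for a DataFrame.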

Return Value

It returns a Series containing the minimum of the values over the requested axis: one entry per column for axis=0, or one entry per row for axis=1.
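To make the return value concrete, here is a small sketch that inspects what min() hands back. The DataFrame df_demo and the variable names are illustrative only.

```python
import pandas as pd

# Small illustrative DataFrame
df_demo = pd.DataFrame({"A": [4, 38, 8], "B": [52, 5, 49]})

col_min = df_demo.min()        # one entry per column
row_min = df_demo.min(axis=1)  # one entry per row

print(type(col_min).__name__)  # Series
print(list(col_min.index))     # ['A', 'B']
print(list(row_min.index))     # [0, 1, 2]

# Reducing the column minimums once more gives a single scalar
overall_min = col_min.min()
print(overall_min)             # 4
```

Chaining min() twice, as in the last step, is one simple way to get the smallest value in the whole DataFrame.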

Usage of Pandas DataFrame min() Method

The Pandas DataFrame min() method is used to compute the minimum value across a specified axis (either rows or columns) of a DataFrame. This method is particularly useful for quickly finding the smallest values in each column or row, especially when working with large datasets.

Now, let’s create a Pandas DataFrame using data from a dictionary.


import pandas as pd
import numpy as np

# Creating a sample DataFrame
data = {
    'A': [4, 38, 8, 24, 15],
    'B': [52, 5, 49, 18, 31],
    'C': [9, 13, 84, 53, 22]
}

df = pd.DataFrame(data)
print("Original DataFrame:\n",df)

# Output:
# Original DataFrame:
#     A   B   C
# 0   4  52   9
# 1  38   5  13
# 2   8  49  84
# 3  24  18  53
# 4  15  31  22

Finding the Minimum Value in Each Column

To find the minimum value in each column of a DataFrame, call the min() method without specifying an axis. By default (axis=0), it computes the minimum value of each column.


# Finding the minimum value in each column
df2 = df.min()
print("Minimum value in each column:\n", df2)

# Output:
# Minimum value in each column:
# A    4
# B    5
# C    9
# dtype: int64

In this example, min() returns the smallest value found in each column (A, B, and C) of the DataFrame.
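The same method also works on a single column or on a subset of columns. The sketch below rebuilds the sample DataFrame from this article so that it runs on its own; the variable names min_a and min_ab are illustrative.

```python
import pandas as pd

# Same sample data as the DataFrame used in this article
df = pd.DataFrame({
    "A": [4, 38, 8, 24, 15],
    "B": [52, 5, 49, 18, 31],
    "C": [9, 13, 84, 53, 22],
})

# A single column is a Series, so min() returns a scalar
min_a = df["A"].min()
print(min_a)  # 4

# A list of columns keeps a DataFrame, so min() returns a Series
min_ab = df[["A", "B"]].min()
print(min_ab)
```

Selecting columns first is often clearer than computing every minimum and then discarding the ones you do not need.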

Finding the Minimum Value in Each Row

To find the minimum value in each row of a DataFrame, you can use the min() method with the axis=1 parameter. This will compute the minimum value across all columns for each row.


# Finding the minimum value in each row
df2 = df.min(axis=1)
print("Minimum value in each row:\n", df2)

# Output:
# Minimum value in each row:
# 0     4
# 1     5
# 2     8
# 3    18
# 4    15
# dtype: int64

This output shows the minimum value for each row in the DataFrame. Each value is the smallest number found across all columns for that particular row, and the index of the result matches the row index of the original DataFrame.
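A common follow-up is to keep the row-wise minimum next to the original data. The following is a minimal sketch, assuming the same sample values as above; the column name row_min is an arbitrary choice for illustration.

```python
import pandas as pd

# Same sample data as the DataFrame used in this article
df = pd.DataFrame({
    "A": [4, 38, 8, 24, 15],
    "B": [52, 5, 49, 18, 31],
    "C": [9, 13, 84, 53, 22],
})

# assign() returns a new DataFrame; the original df is left unchanged
df_with_min = df.assign(row_min=df.min(axis=1))
print(df_with_min)
```

Because the Series returned by min(axis=1) shares the row index of df, pandas aligns the new column with the correct rows automatically.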

Handling Missing Values (NaN)

When working with DataFrames that contain missing values (NaN), the min() method by default ignores these missing values when calculating the minimum. However, you can control this behavior using the skipna parameter.

The example below passes skipna=True explicitly to make the default behavior visible. Because NaN forces the columns to a floating-point type, the result has dtype float64.


import pandas as pd
import numpy as np

# Creating a DataFrame with missing values
data = {
    'A': [4, np.nan, 8, 24, 15],
    'B': [52, 5, np.nan, 18, 31],
    'C': [9, 13, 84, np.nan, 22]
}

df = pd.DataFrame(data)

# Finding the minimum value in each column, 
# Ignoring NaN values by default
df2 = df.min(skipna=True)
print("Minimum value in each column (ignoring NaN):\n", df2)

# Output:
# Minimum value in each column (ignoring NaN):
# A    4.0
# B    5.0
# C    9.0
# dtype: float64

Conversely, to propagate missing values instead of ignoring them, set the skipna parameter to False. Any column (or any row, when axis=1) that contains at least one NaN then yields NaN as its minimum.


# Finding the minimum value in each column, 
# Including NaN values
df2 = df.min(skipna=False)
print("Minimum value in each column (including NaN):\n", df2)

# Output:
# Minimum value in each column (including NaN):
# A   NaN
# B   NaN
# C   NaN
# dtype: float64
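The skipna parameter behaves the same way along rows. The sketch below repeats the DataFrame with missing values under the illustrative name df_nan and compares both settings with axis=1.

```python
import numpy as np
import pandas as pd

# Same data with missing values as in the example above
df_nan = pd.DataFrame({
    "A": [4, np.nan, 8, 24, 15],
    "B": [52, 5, np.nan, 18, 31],
    "C": [9, 13, 84, np.nan, 22],
})

# Default: NaN is ignored, so every row still has a minimum
row_min_skip = df_nan.min(axis=1)

# skipna=False: rows 1, 2, and 3 each contain a NaN, so their result is NaN
row_min_strict = df_nan.min(axis=1, skipna=False)

print(row_min_skip)
print(row_min_strict)
```

Comparing the two results is a quick way to see which rows are affected by missing data before deciding how to handle them.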

Finding Minimum Values in Numeric Columns Only

Finally, to find the minimum values in numeric columns only, you can use the min() method in conjunction with the numeric_only=True parameter. This ensures that only numeric data types (integers, floats, and booleans) are considered when computing the minimum values, excluding any non-numeric columns.


import pandas as pd
import numpy as np

# Creating a DataFrame with mixed data types
data_mixed = {
    'A': [4, 38, 8, 24, 15],
    'B': ['Spark', 'C++', 'Pandas', 'Java', 'Python'],
    'C': [9.5, 13.1, 84.2, 53.4, 22.8]
}

df = pd.DataFrame(data_mixed)
print("DataFrame with mixed data types:\n", df)

# Finding the minimum values considering only numeric data types
df2 = df.min(numeric_only=True)
print("Minimum value in numeric columns only:\n", df2)

# Output:
# DataFrame with mixed data types:
#     A       B     C
# 0   4   Spark   9.5
# 1  38     C++  13.1
# 2   8  Pandas  84.2
# 3  24    Java  53.4
# 4  15  Python  22.8
# Minimum value in numeric columns only:
# A    4.0
# C    9.5
# dtype: float64
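An alternative way to restrict the computation to numeric columns is to filter the DataFrame with select_dtypes() before calling min(). This is a sketch of that alternative, not a replacement for numeric_only; note that include="number" does not select boolean columns, whereas numeric_only=True does. The name df_mixed is illustrative.

```python
import pandas as pd

# Same mixed-type data as in the example above
df_mixed = pd.DataFrame({
    "A": [4, 38, 8, 24, 15],
    "B": ["Spark", "C++", "Pandas", "Java", "Python"],
    "C": [9.5, 13.1, 84.2, 53.4, 22.8],
})

# Keep only integer and float columns, then take the column-wise minimum
numeric_min = df_mixed.select_dtypes(include="number").min()
print(numeric_min)

# The numeric_only flag covers the same columns for this data
flag_min = df_mixed.min(numeric_only=True)
print(list(flag_min.index))  # ['A', 'C']
```

Filtering first can be handy when the same numeric subset is reused for several aggregations.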

FAQ on Pandas DataFrame min() Method

What does the min() method do?

The min() method returns the minimum value along a specified axis of a DataFrame. By default, it computes the minimum value for each column (i.e., the minimum of all rows for each column).

How do I find the minimum value in each column?

To find the minimum value in each column, use the min() method without specifying the axis parameter (default behavior).

What happens if there are missing values (NaN) in the DataFrame?

By default, min() ignores NaN values when computing the minimum. If you want to include NaN values in the calculation, which will result in NaN if any are present, use skipna=False.

How can I handle NaN values differently?

The skipna parameter controls whether NaN values are ignored (skipna=True, default) or included (skipna=False).

How does the numeric_only parameter affect the result?

When numeric_only=True, only numeric columns (float, int, and boolean) are considered for the minimum computation, and non-numeric columns are excluded from the result.

Conclusion

In conclusion, the Pandas DataFrame min() method returns the smallest value in each column or each row of a DataFrame. You have learned how to choose the direction with axis, how skipna controls the handling of NaN values, and how numeric_only restricts the computation to numeric columns.

Happy Learning!!
