  • Post category:Polars
  • Post last modified:February 14, 2025
Polars DataFrame max() Method

In Polars, the max() method computes the maximum value of a column or expression in a DataFrame or Series. Called on a DataFrame, it returns a new single-row DataFrame where each column holds the highest value found in the corresponding column of the original DataFrame. By default, DataFrame.max() skips null values (and NaN values in float columns), so the maximum is determined from the available data.

In this article, I will explain the Polars DataFrame max() function, including its syntax, parameters, and usage, along with how to generate a new DataFrame containing the maximum values for each column.

Key Points –

  • The max() method computes the maximum value for each column in a Polars DataFrame.
  • Unlike some libraries that return a Series or scalar, Polars returns a new DataFrame with the max values.
  • Works with numeric, string, and datetime columns, returning the highest value in each column.
  • For string columns, max() returns the highest value based on lexicographical (dictionary) order.
  • Supports floating-point numbers, preserving their precision while computing the maximum.
  • Allows finding the maximum value within groups when used with group_by().agg(pl.col().max()).
  • Can be used inside select() for column-wise max calculations or with_columns() to add new computed columns.
  • The function pl.max("col1", "col2") computes the column-wise maximum of each named column; for row-wise maximum values across multiple columns, use pl.max_horizontal("col1", "col2").

Polars DataFrame.max() Introduction

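The syntax of DataFrame.max() is shown below, together with the related polars.max() function, which takes *names parameters.


# Syntax of DataFrame.max()
DataFrame.max() → DataFrame

# Syntax of polars.max()
polars.max(*names: str) → Expr

Parameters of the Polars DataFrame.max()

In recent Polars releases, DataFrame.max() takes no parameters. The polars.max() function allows only one parameter.

  • *names – One or more column names as strings.

Return Value

DataFrame.max() returns a new single-row DataFrame with the maximum value for each column. polars.max() returns an expression (Expr) that yields those maxima when evaluated, for example inside select().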

Usage of Polars DataFrame max() Method

The max() method in Polars is used to compute the maximum value for each column in a DataFrame. It efficiently finds the highest value in numeric, string, and date or datetime columns.

To run some examples of the Polars DataFrame max() method, let’s create a Polars DataFrame.


import polars as pl

# Creating a sample DataFrame
data = {
    'A': [5, 14, 9, 18, 25],
    'B': [52, 8, 36, 15, 42],
    'C': [7, 11, 23, 59, 84]
}

df = pl.DataFrame(data)
print("Original DataFrame:\n",df)

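This yields the output below.

# Output:
# Original DataFrame:
# shape: (5, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 5   ┆ 52  ┆ 7   │
│ 14  ┆ 8   ┆ 11  │
│ 9   ┆ 36  ┆ 23  │
│ 18  ┆ 15  ┆ 59  │
│ 25  ┆ 42  ┆ 84  │
└─────┴─────┴─────┘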

To get the maximum value for each column in a Polars DataFrame, you can use the .max() method. When applied to a DataFrame without any parameters, this function returns the maximum values for each column.


# Finding the maximum value for each column
df_max = df.max()
print("Maximum values for each column:\n", df_max)

Here,

  • .max() computes the maximum value for each column in the DataFrame.
  • It returns a new DataFrame with a single row containing the max values.
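
# Output:
# Maximum values for each column:
# shape: (1, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 25  ┆ 52  ┆ 84  │
└─────┴─────┴─────┘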

Get Maximum Value for Specific Columns

To get the maximum value for specific columns in a Polars DataFrame, you can use the .select() method with the .max() function.


# Finding the maximum value for specific columns (A and C)
df_max = df.select([
    pl.col("A").max().alias("Max_A"),
    pl.col("C").max().alias("Max_C")])
print("Maximum values for selected columns:\n", df_max)

# Get the maximum value for specific columns ('A' and 'C'); the result keeps the original column names
max_values = df.select(pl.col(["A", "C"]).max())
print("Maximum values for selected columns:\n", max_values)

# Output:
# Maximum values for selected columns:
# shape: (1, 2)
┌───────┬───────┐
│ Max_A ┆ Max_C │
│ ---   ┆ ---   │
│ i64   ┆ i64   │
╞═══════╪═══════╡
│ 25    ┆ 84    │
└───────┴───────┘

Here,

  • .select() allows selecting specific columns for computation.
  • pl.col("A").max() calculates the maximum value for column "A".
  • alias("Max_A") renames the result for better readability.

Maximum Value of a Single Column

To get the maximum value of a single column in a Polars DataFrame, you can use the .select() method with .max(), or use the square bracket notation to access a single column.


# Finding the maximum value of column "A"
df_max = df.select(pl.col("A").max()).item()
print("Maximum value in column A:", df_max)

# Using square bracket notation
df_max = df["A"].max()
print("Maximum value in column A:", df_max)

# Output:
# Maximum value in column A: 25

Here,

  • pl.col("A").max() computes the max value of column "A".
  • .select() returns a DataFrame, so .item() extracts the scalar value.
  • Using df["A"].max() is a more direct way to get the max value of a single column.

Get Maximum for Each Row

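Use pl.max_horizontal() to compute the row-wise maximum, that is, the highest value in each row of a Polars DataFrame. Building a list per row with pl.concat_list() and calling .list.max() is an alternative that gives the same result.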


# Compute the maximum value for each row
max_df = df.with_columns(pl.max_horizontal("A", "B", "C").alias("Row_Max"))
print("DataFrame with row-wise maximum values:\n", max_df)

# Alternative: collect each row into a list, then take the list maximum
max_df = df.with_columns(pl.concat_list(pl.all()).list.max().alias("Row_Max"))
print("DataFrame with row-wise maximum values:\n", max_df)

# Output:
# DataFrame with row-wise maximum values:
# shape: (5, 4)
┌─────┬─────┬─────┬─────────┐
│ A   ┆ B   ┆ C   ┆ Row_Max │
│ --- ┆ --- ┆ --- ┆ ---     │
│ i64 ┆ i64 ┆ i64 ┆ i64     │
╞═════╪═════╪═════╪═════════╡
│ 5   ┆ 52  ┆ 7   ┆ 52      │
│ 14  ┆ 8   ┆ 11  ┆ 14      │
│ 9   ┆ 36  ┆ 23  ┆ 36      │
│ 18  ┆ 15  ┆ 59  ┆ 59      │
│ 25  ┆ 42  ┆ 84  ┆ 84      │
└─────┴─────┴─────┴─────────┘

Here,

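  • pl.max_horizontal("A", "B", "C") compares the named columns row by row and returns the largest value.
  • In the alternative, pl.concat_list(pl.all()) creates a list of all column values for each row.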
  • .list.max() computes the maximum value for each row.
  • .with_columns(… .alias("Row_Max")) adds the result as a new column.

Using max() with Filtering

To use .max() with filtering in Polars, you can first apply a filter using .filter() and then compute the maximum value for the desired columns.


# Filter rows where 'B' > 10, then get the maximum value of 'A'
max_df = df.filter(pl.col("B") > 10).select(pl.col("A").max())
print("Maximum value of 'A' after filtering where B > 10:\n", max_df)

# Output:
# Maximum value of 'A' after filtering where B > 10:
# shape: (1, 1)
┌─────┐
│ A   │
│ --- │
│ i64 │
╞═════╡
│ 25  │
└─────┘

Here,

  • df.filter(pl.col("B") > 10) – Filters rows where B > 10.
  • .select(pl.col("A").max()) – Computes the maximum value for A after filtering.

To get the maximum value for multiple columns after applying a filter, you can use .filter() followed by select(pl.col(["col1", "col2"]).max()).


# Filter rows where 'B' > 30, then get the maximum value for 'A' and 'C'
max_df = df.filter(pl.col("B") > 30).select([pl.col("A").max(),pl.col("C").max()])
print("Maximum values for 'A' and 'C' after filtering where B > 30:\n", max_df)

# Output:
# Maximum values for 'A' and 'C' after filtering where B > 30:
# shape: (1, 2)
┌─────┬─────┐
│ A   ┆ C   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 25  ┆ 84  │
└─────┴─────┘

Here,

  • df.filter(pl.col("B") > 30) selects rows where B>30.
  • select([pl.col("A").max(), pl.col("C").max()]) calculates the maximum values of A and C from the filtered data.

Handling Missing Values with max()

When dealing with missing (null) values in Polars, the .max() function ignores them by default. However, you can explicitly handle them using .fill_null() if needed.


import polars as pl

# Creating a DataFrame with missing values (nulls)
data = {
    'A': [5, None, 9, 18, 25],
    'B': [52, 8, None, 15, 42],
    'C': [None, 11, 23, 59, 84]
}

df = pl.DataFrame(data)

# Get max values for each column (Polars automatically ignores nulls)
max_df = df.select(pl.all().max())
print("Maximum values for each column (ignoring nulls):\n", max_df)

# Output:
# Maximum values for each column (ignoring nulls):
# shape: (1, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 25  ┆ 52  ┆ 84  │
└─────┴─────┴─────┘

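If you want to replace nulls before computing the max, you can use .fill_null(). In this example every column's maximum is greater than 0, so filling with 0 leaves the result unchanged.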


# Replace nulls with a default value (e.g., 0) before finding max
max_df = df.fill_null(0).select(pl.all().max())
print("Maximum values after filling nulls with 0:\n", max_df)

# Output:
# Maximum values after filling nulls with 0:
# shape: (1, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 25  ┆ 52  ┆ 84  │
└─────┴─────┴─────┘

Find Maximum Value of Float Columns

To find the maximum value specifically for float columns in a Polars DataFrame, you can use column selection with .select() followed by .max() method.


import polars as pl

# Creating a sample DataFrame with integer and float columns
data = {
    'A': [5.2, 14.7, 9.3, 18.1, 25.5],
    'B': [52, 8, 36, 15, 42],
    'C': [7.8, 11.2, 23.5, 59.6, 84.3]
}

df = pl.DataFrame(data)

# Select only float columns and compute max values
max_df = df.select(pl.col(pl.Float64).max())
print("Maximum values for float columns:\n", max_df)

# Output:
# Maximum values for float columns:
# shape: (1, 2)
┌──────┬──────┐
│ A    ┆ C    │
│ ---  ┆ ---  │
│ f64  ┆ f64  │
╞══════╪══════╡
│ 25.5 ┆ 84.3 │
└──────┴──────┘

Here,

  • pl.col(pl.Float64) selects only float columns.
  • max() computes the maximum value for each selected float column.

Conclusion

In this article, I have explained the Polars DataFrame.max() method, including its syntax, parameters, and usage, and how to return maximum values in different contexts, such as for each column, each row, or a specific group.

Happy Learning!!
