In Polars, the max() method is used to compute the maximum value of a column or expression in a DataFrame or Series. When called on a DataFrame, it returns a new DataFrame where each column holds the highest value found in the corresponding column of the original DataFrame. By default, DataFrame.max() ignores missing values (null or NaN) during computation, ensuring that the maximum value is determined from the available data.
In this article, I will explain the Polars DataFrame max() function, including its syntax, parameters, and usage, along with how to generate a new DataFrame containing the maximum values for each column.
Key Points –
- The max() method computes the maximum value for each column in a Polars DataFrame.
- Unlike some libraries that return a Series or scalar, Polars returns a new DataFrame with the max values.
- Works with numeric, string, and datetime columns, returning the highest value in each column.
- For string columns, max() returns the highest value based on lexicographical (dictionary) order.
- Supports floating-point numbers, preserving their precision while computing the maximum.
- Allows finding the maximum value within groups when used with group_by().agg(pl.col("col").max()).
- Can be used inside select() for column-wise max calculations or with_columns() to add new computed columns.
- The function pl.max_horizontal("col1", "col2") can be used to compute row-wise maximum values across multiple columns. In recent Polars versions, pl.max("col1", "col2") instead returns the maximum of each named column.
Polars DataFrame.max() Introduction
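Following is the syntax of DataFrame.max(), along with the related top-level polars.max() function. In recent Polars versions, DataFrame.max() takes no arguments, while polars.max() takes *names params.
# Syntax of DataFrame.max()
DataFrame.max() → DataFrame
# Syntax of polars.max()
polars.max(*names: str) → Expr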
Parameters of the Polars DataFrame.max()
DataFrame.max() itself takes no parameters in recent Polars versions. The related polars.max() function allows only one parameter.
*names – One or more column names, passed as strings.
Return Value
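DataFrame.max() returns a new DataFrame with a single row holding the maximum value of each column. polars.max() returns an expression (Expr) that produces those values when evaluated in a context such as select().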
Usage of Polars DataFrame max() Method
The max() method in Polars is used to compute the maximum value for each column in a DataFrame. It efficiently finds the highest value in numeric, string, and datetime columns.
To run some examples of the Polars DataFrame max() method, let’s create a Polars DataFrame.
import polars as pl
# Creating a sample DataFrame
data = {
    'A': [5, 14, 9, 18, 25],
    'B': [52, 8, 36, 15, 42],
    'C': [7, 11, 23, 59, 84]
}
df = pl.DataFrame(data)
print("Original DataFrame:\n", df)
This yields a DataFrame with five rows and three integer (i64) columns: A, B, and C.
To get the maximum value for each column in a Polars DataFrame, you can use the .max() method. When applied to a DataFrame without any parameters, this function returns the maximum values for each column.
# Finding the maximum value for each column
df_max = df.max()
print("Maximum values for each column:\n", df_max)
Here,
- .max() computes the maximum value for each column in the DataFrame.
- It returns a new DataFrame with a single row containing the max values.
Get Maximum Value for Specific Columns
To get the maximum value for specific columns in a Polars DataFrame, you can use the .select() method with the .max() function.
# Finding the maximum value for specific columns (A and C)
df_max = df.select([
    pl.col("A").max().alias("Max_A"),
    pl.col("C").max().alias("Max_C")
])
print("Maximum values for selected columns:\n", df_max)
# Get the maximum value for specific columns ('A' and 'C') without renaming
max_values = df.select(pl.col(["A", "C"]).max())
print("Maximum values for selected columns:\n", max_values)
# Output (first example):
# Maximum values for selected columns:
# shape: (1, 2)
# ┌───────┬───────┐
# │ Max_A ┆ Max_C │
# │ ---   ┆ ---   │
# │ i64   ┆ i64   │
# ╞═══════╪═══════╡
# │ 25    ┆ 84    │
# └───────┴───────┘
Here,
- .select() allows selecting specific columns for computation.
- pl.col("A").max() calculates the maximum value for column "A".
- .alias("Max_A") renames the result for better readability.
Maximum Value of a Single Column
To get the maximum value of a single column in a Polars DataFrame, you can use the .select() method with .max(), or use the square bracket notation to access a single column.
# Finding the maximum value of column "A"
df_max = df.select(pl.col("A").max()).item()
print("Maximum value in column A:", df_max)
# Using square bracket notation
df_max = df["A"].max()
print("Maximum value in column A:", df_max)
# Output:
# Maximum value in column A: 25
Here,
- pl.col("A").max() computes the max value of column "A".
- .select() returns a DataFrame, so .item() extracts the scalar value.
- Using df["A"].max() is a more direct way to get the max value of a single column.
Get Maximum for Each Row
Use pl.max_horizontal() to compute the row-wise maximum and find the highest value in each row of a Polars DataFrame.
# Compute the maximum value for each row
max_df = df.with_columns(pl.max_horizontal("A", "B", "C").alias("Row_Max"))
print("DataFrame with row-wise maximum values:\n", max_df)
# Alternative: collect each row into a list, then take the maximum of that list
max_df = df.with_columns(pl.concat_list(pl.all()).list.max().alias("Row_Max"))
print("DataFrame with row-wise maximum values:\n", max_df)
# Output:
# DataFrame with row-wise maximum values:
# shape: (5, 4)
# ┌─────┬─────┬─────┬─────────┐
# │ A   ┆ B   ┆ C   ┆ Row_Max │
# │ --- ┆ --- ┆ --- ┆ ---     │
# │ i64 ┆ i64 ┆ i64 ┆ i64     │
# ╞═════╪═════╪═════╪═════════╡
# │ 5   ┆ 52  ┆ 7   ┆ 52      │
# │ 14  ┆ 8   ┆ 11  ┆ 14      │
# │ 9   ┆ 36  ┆ 23  ┆ 36      │
# │ 18  ┆ 15  ┆ 59  ┆ 59      │
# │ 25  ┆ 42  ┆ 84  ┆ 84      │
# └─────┴─────┴─────┴─────────┘
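Here,
- pl.max_horizontal("A", "B", "C") compares the named columns row by row and returns the largest value in each row.
- pl.concat_list(pl.all()) creates a list of all column values for each row.
- .list.max() computes the maximum value for each row.
- .with_columns(… .alias("Row_Max")) adds the result as a new column.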
Using max() with Filtering
To use .max() with filtering in Polars, you can first apply a filter using .filter() and then compute the maximum value for the desired columns.
# Filter rows where 'B' > 10, then get the maximum value of 'A'
max_df = df.filter(pl.col("B") > 10).select(pl.col("A").max())
print("Maximum value of 'A' after filtering where B > 10:\n", max_df)
# Output:
# Maximum value of 'A' after filtering where B > 10:
# shape: (1, 1)
# ┌─────┐
# │ A   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 25  │
# └─────┘
Here,
- df.filter(pl.col("B") > 10) – Filters rows where B > 10.
- .select(pl.col("A").max()) – Computes the maximum value for A after filtering.
To get the maximum value for multiple columns after applying a filter, you can use .filter() followed by select(pl.col(["col1", "col2"]).max()).
# Filter rows where 'B' > 30, then get the maximum value for 'A' and 'C'
max_df = df.filter(pl.col("B") > 30).select([pl.col("A").max(), pl.col("C").max()])
print("Maximum values for 'A' and 'C' after filtering where B > 30:\n", max_df)
# Output:
# Maximum values for 'A' and 'C' after filtering where B > 30:
# shape: (1, 2)
# ┌─────┬─────┐
# │ A   ┆ C   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 25  ┆ 84  │
# └─────┴─────┘
Here,
- df.filter(pl.col("B") > 30) selects rows where B > 30.
- .select([pl.col("A").max(), pl.col("C").max()]) calculates the maximum values of A and C from the filtered data.
Handling Missing Values with max()
When dealing with missing (null) values in Polars, the .max() function automatically ignores them by default. However, you can explicitly handle them using .fill_null() if needed.
import polars as pl
# Creating a DataFrame with missing values (nulls)
data = {
    'A': [5, None, 9, 18, 25],
    'B': [52, 8, None, 15, 42],
    'C': [None, 11, 23, 59, 84]
}
df = pl.DataFrame(data)
# Get max values for each column (Polars automatically ignores nulls)
max_df = df.select(pl.all().max())
print("Maximum values for each column (ignoring nulls):\n", max_df)
# Output:
# Maximum values for each column (ignoring nulls):
# shape: (1, 3)
# ┌─────┬─────┬─────┐
# │ A   ┆ B   ┆ C   │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ i64 ┆ i64 │
# ╞═════╪═════╪═════╡
# │ 25  ┆ 52  ┆ 84  │
# └─────┴─────┴─────┘
If you want to replace nulls before computing the max, you can use .fill_null().
# Replace nulls with a default value (e.g., 0) before finding max
max_df = df.fill_null(0).select(pl.all().max())
print("Maximum values after filling nulls with 0:\n", max_df)
# Output:
# Maximum values after filling nulls with 0:
# shape: (1, 3)
# ┌─────┬─────┬─────┐
# │ A   ┆ B   ┆ C   │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ i64 ┆ i64 │
# ╞═════╪═════╪═════╡
# │ 25  ┆ 52  ┆ 84  │
# └─────┴─────┴─────┘
Find Maximum Value of Float Columns
To find the maximum value specifically for float columns in a Polars DataFrame, you can use column selection with .select() followed by the .max() method.
import polars as pl
# Creating a sample DataFrame with integer and float columns
data = {
    'A': [5.2, 14.7, 9.3, 18.1, 25.5],
    'B': [52, 8, 36, 15, 42],
    'C': [7.8, 11.2, 23.5, 59.6, 84.3]
}
df = pl.DataFrame(data)
# Select only float columns and compute max values
max_df = df.select(pl.col(pl.Float64).max())
print("Maximum values for float columns:\n", max_df)
# Output:
# Maximum values for float columns:
# shape: (1, 2)
# ┌──────┬──────┐
# │ A    ┆ C    │
# │ ---  ┆ ---  │
# │ f64  ┆ f64  │
# ╞══════╪══════╡
# │ 25.5 ┆ 84.3 │
# └──────┴──────┘
Here,
- pl.col(pl.Float64) selects only the Float64 columns.
- .max() computes the maximum value for each selected float column.
Conclusion
In this article, I have explained the Polars DataFrame.max() method, including its syntax, parameters, and usage, and how to return maximum values in different contexts, such as for each column, each row, or a specific group.
Happy Learning!!