In Polars, you can sum multiple columns either row-wise (one total per row) or column-wise (one total per column) by combining expressions such as sum() and sum_horizontal() with the select() or with_columns() method, depending on your requirements. In this article, I will explain how to sum multiple columns in Polars.
Key Points –
- You can sum multiple columns row-wise using the + operator, for example df.with_columns((pl.col("A") + pl.col("B")).alias("Total")).
- The sum_horizontal() function sums the columns you pass to it row-wise; combined with pl.all() or pl.exclude(), it avoids specifying each column manually.
- To sum specific columns downward, use df.select(pl.col(["A", "B"]).sum()).
- df.select(pl.all().sum()) applies sum() to every column, including non-numeric ones unless you exclude them.
- Use alias("New Column Name") when adding sum results to a DataFrame to keep things organized.
- pl.sum_horizontal() is generally more efficient for row-wise operations than using pl.concat_list().
Usage of Polars Sum Multiple Columns
The sum operation in Polars is used to compute the total of multiple columns either row-wise (horizontally) or column-wise (vertically).
To run some examples of summing multiple columns in Polars, let’s create a Polars DataFrame.
import polars as pl

# Marks for five students in three subjects
studentdetails = {
    "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
    "Mathematics": [80, 90, 85, 72, 95],
    "Science": [85, 95, 80, 90, 92],
    "English": [90, 85, 80, 75, 95],
}
df = pl.DataFrame(studentdetails)
print("Original DataFrame:\n", df)
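This creates a DataFrame with five rows: a string column, Studentname, and three integer columns holding the marks for Mathematics, Science, and English.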
You can sum multiple columns row-wise (for each row) using the + operator inside with_columns(). This method is explicit and useful when you want to sum specific columns.
# Sum the subject marks row-wise using the + operator
df2 = df.with_columns((pl.col("Mathematics") + pl.col("Science") + pl.col("English")).alias("Total Marks"))
print(df2)
Here,
- The + operator provides an easy-to-read method for summing columns row-wise.
- We manually add each column (Mathematics, Science, and English) using +.
- The result is stored in a new column, "Total Marks".
- The sum is calculated row-wise for each student.
Sum Columns Row-wise Using pl.concat_list() Function
The pl.concat_list() function in Polars concatenates the values of multiple columns into a list for each row. You can then apply list functions such as list.sum() to compute the row-wise sum.
# Sum multiple columns row-wise using pl.concat_list()
df2 = df.with_columns(pl.concat_list(["Mathematics", "Science", "English"]).list.sum().alias("Total Marks"))
print(df2)
Here,
- pl.concat_list(["Mathematics", "Science", "English"]) combines the selected columns into a list for each row.
- .list.sum() computes the sum of each row's list.
- .alias("Total Marks") names the result "Total Marks".
Yields the same output as above.
Sum All Numeric Columns Row-wise
To sum all numeric columns row-wise in Polars, pass a column selection such as pl.all().exclude("Studentname") to the sum_horizontal() function. It sums every column in that selection for each row, so you do not need to specify the numeric columns one by one.
# Sum all numeric columns row-wise
df2 = df.with_columns(pl.sum_horizontal(pl.all().exclude("Studentname")).alias("Total Marks"))
print(df2)
# Equivalent: pl.exclude("Studentname") is shorthand for pl.all().exclude("Studentname")
df2 = df.with_columns(pl.sum_horizontal(pl.exclude("Studentname")).alias("Total Marks"))
print(df2)
Here,
- pl.exclude("Studentname") selects every column except Studentname, the only non-numeric column.
- pl.sum_horizontal() sums the remaining numeric columns row-wise.
- .alias("Total Marks") names the result "Total Marks".
Yields the same output as above.
Sum Specific Columns Column-Wise
To sum specific columns column-wise (i.e., computing the sum for each selected column) in Polars, you can use the select() method along with the pl.col().sum() expression.
# Sum specific columns column-wise
df2 = df.select(pl.col(["Mathematics", "Science"]).sum())
print(df2)
# Output:
# shape: (1, 2)
# ┌─────────────┬─────────┐
# │ Mathematics ┆ Science │
# │ ---         ┆ ---     │
# │ i64         ┆ i64     │
# ╞═════════════╪═════════╡
# │ 422         ┆ 442     │
# └─────────────┴─────────┘
Here,
- pl.col(["Mathematics", "Science"]).sum() selects only Mathematics and Science, then computes their column-wise sums.
- select() returns only the computed sum values.
- The output is a new single-row DataFrame with the summed values.
Sum Multiple Columns Column-Wise
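To sum multiple columns column-wise (i.e., computing the sum for each column individually) in Polars, you can use select(pl.all().sum()) or call sum() on selected columns. Because Studentname is a string column, the examples below either name the numeric columns explicitly or exclude Studentname.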
# Sum multiple columns downward (column-wise)
df2 = df.select(pl.col(["Mathematics", "Science", "English"]).sum())
print(df2)
# Alternative: sum every column except Studentname
df2 = df.select(pl.all().exclude("Studentname").sum())
print(df2)
# Output:
# shape: (1, 3)
# ┌─────────────┬─────────┬─────────┐
# │ Mathematics ┆ Science ┆ English │
# │ ---         ┆ ---     ┆ ---     │
# │ i64         ┆ i64     ┆ i64     │
# ╞═════════════╪═════════╪═════════╡
# │ 422         ┆ 442     ┆ 425     │
# └─────────────┴─────────┴─────────┘
Here,
- pl.col(["Mathematics", "Science", "English"]).sum() selects the specified columns and calculates the column-wise sum.
- select() returns a single-row DataFrame where each column holds its sum.
Conclusion
In conclusion, Polars provides efficient and flexible methods to sum multiple columns, both row-wise and column-wise. With the + operator, sum_horizontal(), and pl.concat_list() for row-wise sums, pl.col().sum() for column-wise sums, and pl.exclude() to leave columns out, you can compute sums over explicitly named columns or over a dynamic selection without specifying every column name manually.
Happy Learning!!