  • Post category:Polars
  • Post last modified:May 19, 2025
Add Row of Column Totals in Polars

In Polars, you can add a row of column totals: a summary row at the bottom of a DataFrame in which each cell holds the sum of the values in the column above it. In this article, I will explain how to add a row of column totals in Polars.

Key Points –

  • Adds a single summary row containing the sum of each numeric column.
  • Keeps original rows intact; the totals row is appended to the bottom via vstack or concat.
  • Use df.select(cs.numeric().sum()), with polars.selectors imported as cs, to calculate totals for the numeric columns only.
  • Append the totals row with df.vstack(totals_row) or pl.concat([df, totals_row]); both require the two DataFrames to have the same columns and data types.
  • Non-numeric columns are left out of the totals only if you select the numeric columns explicitly. To keep them in the result, concatenate with pl.concat([df, totals_row], how="diagonal"), which fills the missing cells with null.
  • Chain with_columns(pl.lit("Total").alias("label")) (or similar) onto the sums to tag the new row for easy identification.

Usage of Add Row of Column Totals in Polars

A totals row gives a quick summary of the data by showing the aggregate (usually the sum) for each column, which helps when you need the overall metrics of a dataset at a glance. The original rows are left unchanged; the totals row is simply appended after them.

First, let’s create a Polars DataFrame.


import polars as pl

# Creating a sample DataFrame
data = {
    'A': [2, 4, 6, 8],
    'B': [3, 5, 7, 9],
    'C': [5, 10, 15, 20]
}

df = pl.DataFrame(data)
print("Original DataFrame:\n", df)

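Yields the output below.

# Output:
# shape: (4, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 2   ┆ 3   ┆ 5   │
│ 4   ┆ 5   ┆ 10  │
│ 6   ┆ 7   ┆ 15  │
│ 8   ┆ 9   ┆ 20  │
└─────┴─────┴─────┘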

To append a row containing the column sums to your original Polars DataFrame df, concatenate the totals row with df.


# Calculate sum of each column as a single-row DataFrame
total_row = df.select(pl.all().sum())

# Concatenate the original DataFrame with the sum row
result = pl.concat([df, total_row])
print("DataFrame with sum row concatenated:\n", result)

Here,

  • df.select(pl.all().sum()) creates a DataFrame with one row containing column sums.
  • pl.concat([df, total_row]) stacks them vertically, adding the sum row at the bottom.

Add a Label Column for the Total Row

To make the totals row easy to identify, you can add a label column. Because vstack requires both DataFrames to have the same columns, the original rows get an empty string in the Label column, while the totals row gets the value "Total".


# Add empty label column to original DataFrame rows
df_labeled = df.with_columns(pl.lit("").alias("Label"))

# Create totals row with the label "Total"
total_row = df.select(pl.all().sum()).with_columns(pl.lit("Total").alias("Label"))

# Append the totals row to the original DataFrame
result = df_labeled.vstack(total_row)
print(result)

# Output:
# shape: (5, 4)
┌─────┬─────┬─────┬───────┐
│ A   ┆ B   ┆ C   ┆ Label │
│ --- ┆ --- ┆ --- ┆ ---   │
│ i64 ┆ i64 ┆ i64 ┆ str   │
╞═════╪═════╪═════╪═══════╡
│ 2   ┆ 3   ┆ 5   ┆       │
│ 4   ┆ 5   ┆ 10  ┆       │
│ 6   ┆ 7   ┆ 15  ┆       │
│ 8   ┆ 9   ┆ 20  ┆       │
│ 20  ┆ 24  ┆ 50  ┆ Total │
└─────┴─────┴─────┴───────┘

Adding Totals Row After Selecting Specific Columns

To total only some of the columns, select them first, for instance columns A and C, then compute the sums on that selection and append the totals row with a label.


# Select only columns A and C
selected_df = df.select(["A", "C"])

# Compute totals for selected columns
totals = selected_df.select(pl.all().sum()).with_columns(pl.lit("Total").alias("label"))
result = pl.concat([selected_df.with_columns(pl.lit(None).cast(pl.Utf8).alias("label")), totals])
print(result)

# Output:
# shape: (5, 3)
┌─────┬─────┬───────┐
│ A   ┆ C   ┆ label │
│ --- ┆ --- ┆ ---   │
│ i64 ┆ i64 ┆ str   │
╞═════╪═════╪═══════╡
│ 2   ┆ 5   ┆ null  │
│ 4   ┆ 10  ┆ null  │
│ 6   ┆ 15  ┆ null  │
│ 8   ┆ 20  ┆ null  │
│ 20  ┆ 50  ┆ Total │
└─────┴─────┴───────┘

Here,

  • First, you select columns A and C.
  • Then compute sums only for these selected columns.
  • Add a "label" column with "Total" for the totals row, and null for data rows. Concatenate vertically.

Add Totals Row and Add a Row Number Index

To add a totals row to your Polars DataFrame and include a row number index column that counts all rows (including the totals row), you can first append the totals row and then add the row number index column.


# Compute totals row with label column
totals = (df.select(pl.all().sum()).with_columns(pl.lit("Total").alias("label")))

# Add label column with null for original rows
df_labeled = df.with_columns(pl.lit(None).cast(pl.Utf8).alias("label"))

# Concatenate original + totals row
df_with_totals = pl.concat([df_labeled, totals])

# Add row number index starting at 1
# (with_row_index replaces with_row_count, which is deprecated in recent Polars releases)
result = df_with_totals.with_row_index("row_num", offset=1)
print(result)

# Output:
# shape: (5, 5)
┌─────────┬─────┬─────┬─────┬───────┐
│ row_num ┆ A   ┆ B   ┆ C   ┆ label │
│ ---     ┆ --- ┆ --- ┆ --- ┆ ---   │
│ u32     ┆ i64 ┆ i64 ┆ i64 ┆ str   │
╞═════════╪═════╪═════╪═════╪═══════╡
│ 1       ┆ 2   ┆ 3   ┆ 5   ┆ null  │
│ 2       ┆ 4   ┆ 5   ┆ 10  ┆ null  │
│ 3       ┆ 6   ┆ 7   ┆ 15  ┆ null  │
│ 4       ┆ 8   ┆ 9   ┆ 20  ┆ null  │
│ 5       ┆ 20  ┆ 24  ┆ 50  ┆ Total │
└─────────┴─────┴─────┴─────┴───────┘

Here,

  • with_row_index("row_num", offset=1) adds a 1-based row index (with_row_count is the older, now deprecated, name for the same operation).
  • Original rows have null in label; the totals row is labeled "Total".
  • The totals row is appended at the end.

Sum only Numeric Columns, Ignore Non-Numeric

To sum only the numeric columns while excluding non-numeric ones, you can filter columns by their data type before performing the sum. Using select with a type filter allows you to sum just the numeric columns and ignore others, like strings.


import polars as pl

df = pl.DataFrame({
    "A": [2, 4, 6, 8],
    "B": [3, 5, 7, 9],
    "C": [5, 10, 15, 20],
    "D": ["x", "y", "z", "w"]  # non-numeric column
})

# Select only numeric columns and sum them
totals = df.select(
    [pl.col(col).sum() for col, dtype in zip(df.columns, df.dtypes) if dtype in [pl.Int64, pl.Float64]]
)
print(totals)

# Output:
# shape: (1, 3)
┌─────┬─────┬─────┐
│ A   ┆ B   ┆ C   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 20  ┆ 24  ┆ 50  │
└─────┴─────┴─────┘

Here,

  • df.dtypes lists the data types of all columns.
  • We keep only the columns whose dtype is Int64 or Float64; other numeric types, such as Int32, would have to be added to the list.
  • Only those columns are summed.

Conclusion

In conclusion, Polars provides flexible and efficient ways to work with DataFrames, whether you want to add a totals row, handle numeric and non-numeric columns separately, or enhance your data with custom labels and indexing. By filtering columns by type and using methods like select, sum, and vstack, you can easily create summarized views that fit your analysis needs.

Happy Learning!!
