  • Post category:Polars
  • Post last modified:May 20, 2025
Polars DataFrame with_columns() Function with Examples

In Polars, the with_columns() function is used to add new columns, modify existing ones, or transform columns within a DataFrame. It provides a fast, vectorized way to apply multiple column operations simultaneously. You can pass either a list of expressions or keyword arguments, where each expression defines how to compute the new or updated columns based on existing data or constant values.

In this article, I will explain the Polars DataFrame with_columns() function, covering its syntax, parameters, and usage to create a new DataFrame with added columns while keeping the original DataFrame unchanged.

Key Points –

  • with_columns() is used to add new columns or modify existing columns in a Polars DataFrame.
  • Expressions typically use Polars syntax such as pl.col(), pl.lit(), and arithmetic operations.
  • You can pass expressions either as positional arguments (individually or as a list of expressions) or as keyword arguments for named columns.
  • Using keyword arguments lets you name new or replaced columns directly without needing alias().
  • You can add constant columns by using pl.lit() within with_columns().
  • Each expression can reference existing columns using pl.col() and perform computations or transformations.

Polars DataFrame with_columns() Introduction

Let’s look at the syntax of the DataFrame with_columns() function.


# Syntax of DataFrame with_columns()
DataFrame.with_columns(
    *exprs: IntoExpr | Iterable[IntoExpr],
    **named_exprs: IntoExpr,
) -> DataFrame

Parameters of the DataFrame with_columns()

Following are the parameters of the DataFrame with_columns() function.

  • *exprs – One or more expressions (or a list of expressions) that create or modify columns.
  • **named_exprs – Optional keyword arguments. Each keyword becomes the name of the column produced by its expression.

Return Value

This function returns a new Polars DataFrame with the specified added or modified columns.

Usage of Polars DataFrame with_columns() Function

The with_columns() method in Polars allows you to add, update, or replace multiple columns in a DataFrame in a single call. You pass one or more expressions (positionally, as a list, or as keyword arguments) that specify the columns to be created or changed, and it returns a new DataFrame reflecting those updates, leaving the original DataFrame unchanged.

Now, let’s create a Polars DataFrame.


import polars as pl

# Creating a new Polars DataFrame
technologies = {
    'Courses': ["Spark", "Hadoop", "Hyperion", "Pandas"],
    'Fees': [20000, 25000, 30000, 40000],
    'Duration': ['30days', '50days', '40days', '60days'],
    'Discount': [1000, 1500, 1200, 2500]
}

df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

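# Output:
# Original DataFrame:
# shape: (4, 4)
┌──────────┬───────┬──────────┬──────────┐
│ Courses  ┆ Fees  ┆ Duration ┆ Discount │
│ ---      ┆ ---   ┆ ---      ┆ ---      │
│ str      ┆ i64   ┆ str      ┆ i64      │
╞══════════╪═══════╪══════════╪══════════╡
│ Spark    ┆ 20000 ┆ 30days   ┆ 1000     │
│ Hadoop   ┆ 25000 ┆ 50days   ┆ 1500     │
│ Hyperion ┆ 30000 ┆ 40days   ┆ 1200     │
│ Pandas   ┆ 40000 ┆ 60days   ┆ 2500     │
└──────────┴───────┴──────────┴──────────┘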

You can use the with_columns() method in Polars to add a new column to a DataFrame. For instance, here’s how to add a new column called "Trainer" to an existing DataFrame.


# Add new column
df2 = df.with_columns([
    pl.Series("Trainer", ["John", "Steve", "Jeff", "Ravi"])
])
print("DataFrame after adding Trainer column:\n", df2)

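# Output:
# DataFrame after adding Trainer column:
# shape: (4, 5)
┌──────────┬───────┬──────────┬──────────┬─────────┐
│ Courses  ┆ Fees  ┆ Duration ┆ Discount ┆ Trainer │
│ ---      ┆ ---   ┆ ---      ┆ ---      ┆ ---     │
│ str      ┆ i64   ┆ str      ┆ i64      ┆ str     │
╞══════════╪═══════╪══════════╪══════════╪═════════╡
│ Spark    ┆ 20000 ┆ 30days   ┆ 1000     ┆ John    │
│ Hadoop   ┆ 25000 ┆ 50days   ┆ 1500     ┆ Steve   │
│ Hyperion ┆ 30000 ┆ 40days   ┆ 1200     ┆ Jeff    │
│ Pandas   ┆ 40000 ┆ 60days   ┆ 2500     ┆ Ravi    │
└──────────┴───────┴──────────┴──────────┴─────────┘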

Modify an Existing Column

You can also modify an existing column in Polars using the with_columns() method by providing a new expression for that column’s name. If you add a column with the same name as an existing one, Polars will overwrite the original column with the new values from your expression.


# Apply a 10% discount on Fees
df2 = df.with_columns(
    (pl.col("Fees") * 0.9).alias("Fees")  # Multiply Fees by 0.9 and overwrite 'Fees'
)
print("DataFrame after modifying Fees column:\n", df2)

# Output:
# DataFrame after modifying Fees column:
# shape: (4, 4)
┌──────────┬─────────┬──────────┬──────────┐
│ Courses  ┆ Fees    ┆ Duration ┆ Discount │
│ ---      ┆ ---     ┆ ---      ┆ ---      │
│ str      ┆ f64     ┆ str      ┆ i64      │
╞══════════╪═════════╪══════════╪══════════╡
│ Spark    ┆ 18000.0 ┆ 30days   ┆ 1000     │
│ Hadoop   ┆ 22500.0 ┆ 50days   ┆ 1500     │
│ Hyperion ┆ 27000.0 ┆ 40days   ┆ 1200     │
│ Pandas   ┆ 36000.0 ┆ 60days   ┆ 2500     │
└──────────┴─────────┴──────────┴──────────┘

Here,

  • Using with_columns() with an existing column name updates (modifies) that column.
  • This returns a new DataFrame with the modification.
  • The original DataFrame object is never modified. To keep the change under the same name, assign the result back to that variable (for example, df = df.with_columns(...)).

Use Keyword Arguments for New Columns

Using keyword arguments in with_columns() is a neat and readable way to add or modify columns by specifying the new column names directly as parameter names.


df2 = df.with_columns(
    Final_Fee = pl.col("Fees") - pl.col("Discount"),
    Fees_Doubled = pl.col("Fees") * 2
)
print("DataFrame after modifications:\n", df2)

# Output:
# DataFrame after modifications:
# shape: (4, 6)
┌──────────┬───────┬──────────┬──────────┬───────────┬──────────────┐
│ Courses  ┆ Fees  ┆ Duration ┆ Discount ┆ Final_Fee ┆ Fees_Doubled │
│ ---      ┆ ---   ┆ ---      ┆ ---      ┆ ---       ┆ ---          │
│ str      ┆ i64   ┆ str      ┆ i64      ┆ i64       ┆ i64          │
╞══════════╪═══════╪══════════╪══════════╪═══════════╪══════════════╡
│ Spark    ┆ 20000 ┆ 30days   ┆ 1000     ┆ 19000     ┆ 40000        │
│ Hadoop   ┆ 25000 ┆ 50days   ┆ 1500     ┆ 23500     ┆ 50000        │
│ Hyperion ┆ 30000 ┆ 40days   ┆ 1200     ┆ 28800     ┆ 60000        │
│ Pandas   ┆ 40000 ┆ 60days   ┆ 2500     ┆ 37500     ┆ 80000        │
└──────────┴───────┴──────────┴──────────┴───────────┴──────────────┘

Here,

  • Use keyword arguments to directly name new or modified columns in with_columns().
  • No need for alias() because the name comes from the argument name.

Keyword arguments also accept a Series as the value, and a single call can modify an existing column and add a new one at the same time. In each case the key is the column name and the value is the expression or Series that supplies its data.


# Use keyword arguments to modify one column and add another
df2 = df.with_columns(
    Fees = pl.col("Fees") * 0.9,  # Modify Fees applying 10% discount
    Trainer = pl.Series(["John", "Steve", "Jeff", "Ravi"])  # Add Trainer column
)
print("DataFrame after modifications:\n", df2)

# Output:
# DataFrame after modifications:
# shape: (4, 5)
┌──────────┬─────────┬──────────┬──────────┬─────────┐
│ Courses  ┆ Fees    ┆ Duration ┆ Discount ┆ Trainer │
│ ---      ┆ ---     ┆ ---      ┆ ---      ┆ ---     │
│ str      ┆ f64     ┆ str      ┆ i64      ┆ str     │
╞══════════╪═════════╪══════════╪══════════╪═════════╡
│ Spark    ┆ 18000.0 ┆ 30days   ┆ 1000     ┆ John    │
│ Hadoop   ┆ 22500.0 ┆ 50days   ┆ 1500     ┆ Steve   │
│ Hyperion ┆ 27000.0 ┆ 40days   ┆ 1200     ┆ Jeff    │
│ Pandas   ┆ 36000.0 ┆ 60days   ┆ 2500     ┆ Ravi    │
└──────────┴─────────┴──────────┴──────────┴─────────┘

Add a Constant Column

To add a constant column in Polars, use the with_columns() method along with pl.lit() to assign the constant value. This allows you to easily insert a column with the same value across all rows in the DataFrame.


# Add a constant column 'Status' using a keyword argument
df2 = df.with_columns(
    Status = pl.lit("Active")
)
print("DataFrame after adding constant column:\n", df2)

# Equivalent: name the column with alias() instead
df2 = df.with_columns(
    pl.lit("Active").alias("Status")
)
print("DataFrame with constant column:\n", df2)

# Output:
# DataFrame with constant column:
# shape: (4, 5)
┌──────────┬───────┬──────────┬──────────┬────────┐
│ Courses  ┆ Fees  ┆ Duration ┆ Discount ┆ Status │
│ ---      ┆ ---   ┆ ---      ┆ ---      ┆ ---    │
│ str      ┆ i64   ┆ str      ┆ i64      ┆ str    │
╞══════════╪═══════╪══════════╪══════════╪════════╡
│ Spark    ┆ 20000 ┆ 30days   ┆ 1000     ┆ Active │
│ Hadoop   ┆ 25000 ┆ 50days   ┆ 1500     ┆ Active │
│ Hyperion ┆ 30000 ┆ 40days   ┆ 1200     ┆ Active │
│ Pandas   ┆ 40000 ┆ 60days   ┆ 2500     ┆ Active │
└──────────┴───────┴──────────┴──────────┴────────┘

Here,

  • pl.lit("Active") creates a literal (constant) value.
  • alias("Status") names the new column.

Add Multiple New Columns

You can add multiple new columns to a Polars DataFrame using the with_columns() method by either passing a list of expressions or using keyword arguments. Both approaches allow you to define several columns at once in a clear and efficient way.


# Add multiple new columns
df2 = df.with_columns([
    pl.lit("Online").alias("Platform"),
    pl.Series(["USA", "India", "Canada", "UK"]).alias("Country"),
    (pl.col("Fees") - pl.col("Discount")).alias("Net_Fees")
])
print("DataFrame with multiple new columns:\n", df2)

# Output:
# DataFrame with multiple new columns:
# shape: (4, 7)
┌──────────┬───────┬──────────┬──────────┬──────────┬─────────┬──────────┐
│ Courses  ┆ Fees  ┆ Duration ┆ Discount ┆ Platform ┆ Country ┆ Net_Fees │
│ ---      ┆ ---   ┆ ---      ┆ ---      ┆ ---      ┆ ---     ┆ ---      │
│ str      ┆ i64   ┆ str      ┆ i64      ┆ str      ┆ str     ┆ i64      │
╞══════════╪═══════╪══════════╪══════════╪══════════╪═════════╪══════════╡
│ Spark    ┆ 20000 ┆ 30days   ┆ 1000     ┆ Online   ┆ USA     ┆ 19000    │
│ Hadoop   ┆ 25000 ┆ 50days   ┆ 1500     ┆ Online   ┆ India   ┆ 23500    │
│ Hyperion ┆ 30000 ┆ 40days   ┆ 1200     ┆ Online   ┆ Canada  ┆ 28800    │
│ Pandas   ┆ 40000 ┆ 60days   ┆ 2500     ┆ Online   ┆ UK      ┆ 37500    │
└──────────┴───────┴──────────┴──────────┴──────────┴─────────┴──────────┘

Conclusion

In conclusion, the with_columns() method in Polars is a powerful and flexible way to add, modify, or replace multiple columns in a DataFrame efficiently. Whether you use expressions in a list or keyword arguments, it enables clear and concise transformations while keeping your code readable and performant.

Happy Learning!!
