Polars DataFrame replace_column() – by Examples
Post last modified: March 24, 2025

In Polars, the replace_column() method is used to replace an existing column in a DataFrame with a new column. This method is helpful when you want to update or modify a specific column without altering the rest of the DataFrame.

In this article, I will explain the Polars DataFrame replace_column() function, covering its syntax, parameters, and usage. Note that, according to the Polars API reference, replace_column() is an in-place operation: it modifies the DataFrame it is called on and returns that same DataFrame. To keep the original DataFrame unchanged, call the method on a copy, for example df.clone().

Key Points 

  • Replaces an existing column in a Polars DataFrame with a new column.
  • Requires the zero-based index of the column and the new column as a Polars Series.
  • The new Series must have the same length as the DataFrame (one value per row).
  • Accepts only a Series. To replace a column with an expression, use with_columns() instead.
  • Works in place: it modifies the DataFrame it is called on and returns it, without creating a new DataFrame. Call it on df.clone() to preserve the original.

Polars DataFrame replace_column() Introduction

Let’s look at the syntax of the replace_column() method.


# Syntax of replace_column() method
DataFrame.replace_column(index: int, column: Series) → DataFrame

Parameters of the Polars replace_column()

Following are the parameters of the replace_column() method.

  • index (int) – The position of the column (zero-based index) that should be replaced.
  • column (Series) – A new Polars Series that replaces the existing column at the specified index. The new series must have the same length as the existing column.

Return Value

This function returns the DataFrame with the specified column replaced. The replacement happens in place, so the returned object is the same DataFrame the method was called on. Call replace_column() on df.clone() if you need to keep the original DataFrame unchanged.

Usage of Polars DataFrame replace_column()

The replace_column() method in Polars allows you to replace an existing column in a DataFrame with a new column while keeping the rest of the DataFrame intact. It replaces the column at the specified index with a new Polars Series.

To run some examples of the Polars DataFrame replace_column() function, let’s create a Polars DataFrame.


import polars as pl

technologies = {
    'Courses': ["spark", "python", "spark", "python", "pandas"],
    'Fees': [22000, 25000, 22000, 25000, 24000],
    'Duration': ['30days', '40days', '60days', '45days', '50days'],
}
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

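Yields below output.

# Output:
# Original DataFrame:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ spark   ┆ 22000 ┆ 30days   │
│ python  ┆ 25000 ┆ 40days   │
│ spark   ┆ 22000 ┆ 60days   │
│ python  ┆ 25000 ┆ 45days   │
│ pandas  ┆ 24000 ┆ 50days   │
└─────────┴───────┴──────────┘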

To replace a numeric column (e.g., Fees) in the Polars DataFrame, you can use the replace_column() method.


# New Fees column to replace the existing one
new_fees = pl.Series("Fees", [23000, 26000, 23000, 30000, 25000])

# Replace column at index 1 (Fees column)
# replace_column() works in place, so clone df first to keep it unchanged
df2 = df.clone().replace_column(1, new_fees)
print("Updated DataFrame:\n", df2)

Here,

  • The replace_column() method replaces the Fees column with a new set of values (new_fees).
  • The new column (new_fees) must have the same length as the original column.
  • replace_column() modifies the DataFrame it is called on and returns it. Calling it on df.clone() leaves the original DataFrame unchanged.

Replacing a String Column

You can replace an existing string column with a new Polars Series using the replace_column() method. This is useful when you need to modify text data, standardize formatting, or update categorical values while keeping the rest of the DataFrame unchanged. This example demonstrates replacing the Courses column with a new set of string values.


# New column to replace "Courses"
new_courses = pl.Series("Courses", ["Java", "C++", "spark", "polars", "Hadoop"])

# Replace the "Courses" column (index 0)
# Clone first so the in-place replacement does not touch df
df2 = df.clone().replace_column(0, new_courses)
print("Updated DataFrame:\n", df2)

# Output:
# Updated DataFrame:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ Java    ┆ 22000 ┆ 30days   │
│ C++     ┆ 25000 ┆ 40days   │
│ spark   ┆ 22000 ┆ 60days   │
│ polars  ┆ 25000 ┆ 45days   │
│ Hadoop  ┆ 24000 ┆ 50days   │
└─────────┴───────┴──────────┘

Here,

  • The original Courses column contains the values ["spark", "python", "spark", "python", "pandas"].
  • We replace the Courses column with a new Series of string values: ["Java", "C++", "spark", "polars", "Hadoop"].
  • The replace_column() method replaces the column in place and returns the DataFrame with the updated column.
  • To keep the original DataFrame unchanged, call the method on a copy such as df.clone().

Replace a Column by Index

The replace_column() method in Polars allows you to replace a column at a specific index with a new Series.


# New column to replace "Fees" (index 1)
new_fees = pl.Series("Fees", [30000, 28000, 27000, 29000, 31000])

# Replace the column at index 1
# Clone first so the in-place replacement does not touch df
df2 = df.clone().replace_column(1, new_fees)
print("Updated DataFrame:\n", df2)

# Output:
# Updated DataFrame:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ spark   ┆ 30000 ┆ 30days   │
│ python  ┆ 28000 ┆ 40days   │
│ spark   ┆ 27000 ┆ 60days   │
│ python  ┆ 29000 ┆ 45days   │
│ pandas  ┆ 31000 ┆ 50days   │
└─────────┴───────┴──────────┘

You can also replace a column by index with a computed series. For example, replace the Duration column (index 2) with a computed series that converts the duration to integers.


# Compute a new series for the "Duration" column (convert to integers)
new_duration = df["Duration"].str.replace("days", "").cast(pl.Int64)

# Replace the third column (index 2) with the computed series
# Clone first so the in-place replacement does not touch df
df2 = df.clone().replace_column(2, new_duration)
print("DataFrame after replacing column at index 2 with computed series:\n", df2)

# Output:
# DataFrame after replacing column at index 2 with computed series:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ i64      │
╞═════════╪═══════╪══════════╡
│ spark   ┆ 22000 ┆ 30       │
│ python  ┆ 25000 ┆ 40       │
│ spark   ┆ 22000 ┆ 60       │
│ python  ┆ 25000 ┆ 45       │
│ pandas  ┆ 24000 ┆ 50       │
└─────────┴───────┴──────────┘

Here,

  • Use replace_column() to replace a column by its index.
  • Column indices start at 0.
  • The new column must be a Polars Series. Wrap a list or other iterable in pl.Series() before passing it.
  • Ensure the new column has the same length as the existing column.

Replacing a Column with a Computed Series

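The replace_column() method accepts only a Series, not an expression. To replace a column with the result of an expression, use the with_columns() method and give the expression the name of the existing column with alias(). Unlike replace_column(), with_columns() returns a new DataFrame.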


# Replace 'Fees' with Fees + 1000
df2 = df.with_columns((pl.col("Fees") + 1000).alias("Fees"))
print("Updated DataFrame:\n",df2)

# Output:
# Updated DataFrame:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ spark   ┆ 23000 ┆ 30days   │
│ python  ┆ 26000 ┆ 40days   │
│ spark   ┆ 23000 ┆ 60days   │
│ python  ┆ 26000 ┆ 45days   │
│ pandas  ┆ 25000 ┆ 50days   │
└─────────┴───────┴──────────┘

Replace a Column with a New Computed Series

Because with_columns() evaluates expressions, the replacement can also change the data type of the column. In the example below, multiplying the integer Fees column by 1.1 produces a floating-point (f64) column.


# Replace 'Fees' with Fees * 1.1 (10% increase)
df2 = df.with_columns((pl.col("Fees") * 1.1).alias("Fees"))
print("Updated DataFrame:\n",df2)

# Output:
# Updated DataFrame:
# shape: (5, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees    ┆ Duration │
│ ---     ┆ ---     ┆ ---      │
│ str     ┆ f64     ┆ str      │
╞═════════╪═════════╪══════════╡
│ spark   ┆ 24200.0 ┆ 30days   │
│ python  ┆ 27500.0 ┆ 40days   │
│ spark   ┆ 24200.0 ┆ 60days   │
│ python  ┆ 27500.0 ┆ 45days   │
│ pandas  ┆ 26400.0 ┆ 50days   │
└─────────┴─────────┴──────────┘

Replace a Column with a Condition

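To replace values in a column based on a condition in Polars, use the when().then().otherwise() expression inside the with_columns() method. Rows that match the condition get the value from then(), and all other rows get the value from otherwise().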


# Replace a column with a condition
df2 = df.with_columns(
    pl.when(pl.col("Courses") == "spark")
      .then(0) 
      .otherwise(pl.col("Fees")) 
      .alias("Fees")  
)
print("Updated DataFrame:\n",df2)

# Output:
# Updated DataFrame:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ spark   ┆ 0     ┆ 30days   │
│ python  ┆ 25000 ┆ 40days   │
│ spark   ┆ 0     ┆ 60days   │
│ python  ┆ 25000 ┆ 45days   │
│ pandas  ┆ 24000 ┆ 50days   │
└─────────┴───────┴──────────┘

Replace with Multiple Values

You can use replace_column() to update a column with a new Series containing multiple values. This is useful when modifying categorical data, applying transformations, or replacing values dynamically.


# Define a new Series of course names
new_courses = pl.Series("Courses", ["Java", "C++", "pandas", "Spark", "Pega"])

# Replace the "Courses" column (index 0) with the new values
# Clone first so the in-place replacement does not touch df
df2 = df.clone().replace_column(0, new_courses)
print("Updated DataFrame:\n", df2)

# Output:
# Updated DataFrame:
# shape: (5, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Duration │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ i64   ┆ str      │
╞═════════╪═══════╪══════════╡
│ Java    ┆ 22000 ┆ 30days   │
│ C++     ┆ 25000 ┆ 40days   │
│ pandas  ┆ 22000 ┆ 60days   │
│ Spark   ┆ 25000 ┆ 45days   │
│ Pega    ┆ 24000 ┆ 50days   │
└─────────┴───────┴──────────┘

Here,

  • replace_column() swaps the entire column for the new values in one call.
  • The new Series must have the same length as the original column.
  • The method works in place and returns the DataFrame. Call it on df.clone() to leave the original unchanged.

Conclusion

In conclusion, the replace_column() method in Polars replaces the column at a given index with a new Polars Series while leaving the other columns untouched. It works in place, so call it on a clone if you still need the original DataFrame. For replacements computed from expressions or conditions, use with_columns() instead.
