  • Post last modified:February 17, 2025
Convert Polars Cast Int to String

In Polars, you can use the cast() method to convert an integer column to a string (Utf8). This is helpful when you need numeric data in string form for tasks like text manipulation, concatenation, or export. To cast an integer column, apply cast() to a column expression inside either with_columns() or select(). In this article, I will explain how to convert an integer column to a string (Utf8) using cast() in Polars.


Key Points –

  • The cast(pl.Utf8) method converts an integer column to a string (Utf8) type in Polars.
  • You can cast a single column or multiple columns simultaneously using with_columns() or select().
  • The alias("New_Column_Name") method helps rename the column after casting.
  • Using select([pl.col("column_name").cast(pl.Utf8)]) creates a transformed DataFrame with only selected columns.
  • The with_columns() method returns a DataFrame with the transformed columns added, or replaced when the output name matches an existing column.
  • Negative integer values are converted to their string representations without issues (e.g., -1000 becomes "-1000").
  • Casting can be used inside expressions for operations like concatenation, filtering, or formatting.
  • These methods do not modify a DataFrame in place: with_columns() and select() return a new DataFrame and leave the original unchanged.

Usage of Polars Cast Int to String

The cast() method in Polars converts a column from one data type to another. When converting an integer column to a string, it changes the column’s data type to Utf8, which is Polars’ representation of a string. Recent Polars releases name this type pl.String and keep pl.Utf8 as an alias, so the examples in this article work with either.

First, let’s create a Polars DataFrame.


import polars as pl

# Sample data: one string column and two integer columns
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fees': [22000, 25000, 24000, 26000],
    'Discount': [1000, 2300, 2500, 1400]
}
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

This yields a DataFrame with four rows and three columns: Courses (str), Fees (i64), and Discount (i64).

You can use the with_columns() method along with the pl.col().cast() function to convert a single column from an integer to a string in Polars.


# Casting the 'Fees' column to a string
df_casted = df.with_columns(
    pl.col("Fees").cast(pl.Utf8).alias("Fees"))
print("\nDataFrame after casting 'Fees' to string:\n", df_casted)

# Output:
# DataFrame after casting 'Fees' to string:
# shape: (4, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Discount │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ str   ┆ i64      │
╞═════════╪═══════╪══════════╡
│ Spark   ┆ 22000 ┆ 1000     │
│ PySpark ┆ 25000 ┆ 2300     │
│ Hadoop  ┆ 24000 ┆ 2500     │
│ Pandas  ┆ 26000 ┆ 1400     │
└─────────┴───────┴──────────┘

Here,

  • pl.col("Fees") – Selects the column "Fees".
  • cast(pl.Utf8) – Casts the column to a UTF-8 string data type.
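  • alias("Fees") – Sets the output name to "Fees". cast() already keeps the source column's name, so this call is optional here.
  • with_columns() – Replaces the original column with the newly cast column, because the output name matches an existing column.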

Cast Multiple Columns from Int to String

To cast multiple columns from integers to strings in Polars, you can use the with_columns() method along with pl.col().cast(). The process involves selecting multiple columns and applying the casting operation to all of them.


# Casting 'Fees' and 'Discount' columns to strings
df_casted = df.with_columns(
    [pl.col(col).cast(pl.Utf8).alias(col) for col in ["Fees", "Discount"]])
print("\nDataFrame after casting 'Fees' and 'Discount' to strings:\n", df_casted)

# Output:
# DataFrame after casting 'Fees' and 'Discount' to strings:
# shape: (4, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Discount │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ str   ┆ str      │
╞═════════╪═══════╪══════════╡
│ Spark   ┆ 22000 ┆ 1000     │
│ PySpark ┆ 25000 ┆ 2300     │
│ Hadoop  ┆ 24000 ┆ 2500     │
│ Pandas  ┆ 26000 ┆ 1400     │
└─────────┴───────┴──────────┘

Here,

  • Selecting Multiple Columns – The list comprehension [pl.col(col).cast(pl.Utf8).alias(col) for col in ["Fees", "Discount"]] generates the necessary transformations for each specified column.
  • pl.col(col).cast(pl.Utf8) – Casts each column to a UTF-8 string data type.
  • .alias(col) – Sets the output name to the original column name. cast() already preserves the name, so this is optional.
  • with_columns() – Applies the transformations to the DataFrame.

Cast int64 to String

To cast a column of type int64 to a string in Polars, you can use the cast() function, specifying the Utf8 data type for the string conversion.


# Cast 'Fees' and 'Discount' columns from int64 to string
df_casted = df.with_columns([
    pl.col("Fees").cast(pl.Utf8).alias("Fees"),
    pl.col("Discount").cast(pl.Utf8).alias("Discount")])
print("DataFrame after casting int64 to string:\n", df_casted)

# Output:
# DataFrame after casting int64 to string:
# shape: (4, 3)
┌─────────┬───────┬──────────┐
│ Courses ┆ Fees  ┆ Discount │
│ ---     ┆ ---   ┆ ---      │
│ str     ┆ str   ┆ str      │
╞═════════╪═══════╪══════════╡
│ Spark   ┆ 22000 ┆ 1000     │
│ PySpark ┆ 25000 ┆ 2300     │
│ Hadoop  ┆ 24000 ┆ 2500     │
│ Pandas  ┆ 26000 ┆ 1400     │
└─────────┴───────┴──────────┘

Here,

  • pl.col("Fees") – Selects the Fees column.
  • cast(pl.Utf8) – Casts the Fees column from int64 to string (Utf8).
  • alias("Fees") – Keeps the column name as Fees after casting (optional, since cast() preserves the name).
  • with_columns() – Applies the cast transformation to both Fees and Discount columns at the same time.

Cast Int Column and Rename

To cast an integer column to a string and rename it in Polars, you can use the cast() method and the alias() method together inside the with_columns() function. Here’s how you can cast an integer column (like Fees) to a string and rename it (for example, to Fees_Str).


# Cast 'Fees' column from int to string and rename it to 'Fees_Str'
df_casted = df.with_columns(
    pl.col("Fees").cast(pl.Utf8).alias("Fees_Str"))
print("DataFrame after casting 'Fees' to string and renaming it to 'Fees_Str':\n", df_casted)

# Output:
# DataFrame after casting 'Fees' to string and renaming it to 'Fees_Str':
# shape: (4, 4)
┌─────────┬───────┬──────────┬──────────┐
│ Courses ┆ Fees  ┆ Discount ┆ Fees_Str │
│ ---     ┆ ---   ┆ ---      ┆ ---      │
│ str     ┆ i64   ┆ i64      ┆ str      │
╞═════════╪═══════╪══════════╪══════════╡
│ Spark   ┆ 22000 ┆ 1000     ┆ 22000    │
│ PySpark ┆ 25000 ┆ 2300     ┆ 25000    │
│ Hadoop  ┆ 24000 ┆ 2500     ┆ 24000    │
│ Pandas  ┆ 26000 ┆ 1400     ┆ 26000    │
└─────────┴───────┴──────────┴──────────┘

Here,

  • pl.col("Fees") – Selects the Fees column.
  • cast(pl.Utf8) – Casts the Fees column from integer (int64) to string (Utf8).
  • alias("Fees_Str") – Names the cast column Fees_Str.
  • with_columns() – Adds Fees_Str as a new column; the original integer Fees column is kept because the names differ.

Cast Int to String Using select() and alias()

The select() method, combined with alias(), lets you cast an integer column to a string and return only the columns you list. Like with_columns(), it returns a new DataFrame and leaves the original unchanged.


# Cast 'Fees' column from int to string 
# Using select() and alias()
df_casted = df.select([
    pl.col("Courses"),
    pl.col("Fees").cast(pl.Utf8).alias("Fees_Str"),
    pl.col("Discount")])
print("DataFrame after casting 'Fees' to string using select() and alias():\n", df_casted)

# Output:
# DataFrame after casting 'Fees' to string using select() and alias():
# shape: (4, 3)
┌─────────┬──────────┬──────────┐
│ Courses ┆ Fees_Str ┆ Discount │
│ ---     ┆ ---      ┆ ---      │
│ str     ┆ str      ┆ i64      │
╞═════════╪══════════╪══════════╡
│ Spark   ┆ 22000    ┆ 1000     │
│ PySpark ┆ 25000    ┆ 2300     │
│ Hadoop  ┆ 24000    ┆ 2500     │
│ Pandas  ┆ 26000    ┆ 1400     │
└─────────┴──────────┴──────────┘

Here,

  • select() – Extracts only the specified columns.
  • pl.col("Fees").cast(pl.Utf8).alias("Fees_Str") – Converts Fees from int64 to Utf8 (string). Renames it to "Fees_Str".
  • Preserving Other Columns – The "Courses" and "Discount" columns are passed through unchanged because they are listed explicitly in select().

Cast Negative Int Values to String

To cast negative integer values to strings in Polars, you can use the cast(pl.Utf8) function. Below is an example where we modify the dataset to include negative values and then cast them to strings.


import polars as pl

# Sample data with negative integer values
technologies = {
   'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
   'Fees': [-22000, -25000, -24000, -26000],  # Negative values
   'Discount': [-1000, -2300, -2500, -1400]  # Negative values
}
df = pl.DataFrame(technologies)

# Cast 'Fees' and 'Discount' columns from int to string
df_casted = df.with_columns([
    pl.col("Fees").cast(pl.Utf8).alias("Fees"),
    pl.col("Discount").cast(pl.Utf8).alias("Discount")])
print("DataFrame after casting negative int64 values to string:\n", df_casted)

# Output:
# DataFrame after casting negative int64 values to string:
# shape: (4, 3)
┌─────────┬────────┬──────────┐
│ Courses ┆ Fees   ┆ Discount │
│ ---     ┆ ---    ┆ ---      │
│ str     ┆ str    ┆ str      │
╞═════════╪════════╪══════════╡
│ Spark   ┆ -22000 ┆ -1000    │
│ PySpark ┆ -25000 ┆ -2300    │
│ Hadoop  ┆ -24000 ┆ -2500    │
│ Pandas  ┆ -26000 ┆ -1400    │
└─────────┴────────┴──────────┘

Here,

  • Negative Values – The Fees and Discount columns contain negative integer values.
  • pl.col("Fees").cast(pl.Utf8).alias("Fees") – Converts the Fees column from int64 to string (Utf8).
  • pl.col("Discount").cast(pl.Utf8).alias("Discount") – Converts the Discount column from int64 to string (Utf8).
  • with_columns() – Applies these transformations to modify the DataFrame.

Conclusion

In conclusion, casting an integer column to a string with the cast() function in Polars is a simple and efficient way to prepare your data for operations like text manipulation, concatenation, or export. You can perform this conversion with with_columns(), which returns every column of the DataFrame with the cast applied, or with select(), which returns only the columns you list.

Happy Learning!!
