  • Post category:Polars
  • Post last modified:March 7, 2025
Convert Polars Cast Integer to Float

In Polars, you can convert an integer column to a float type by calling the cast() method on the column and applying it with with_columns(). Casting gives the column a floating-point dtype, which is useful for mathematical computations, machine learning tasks, and any calculation that needs fractional values. In this article, I will explain how to convert an integer to float (Float64) using the cast() method in Polars.


Key Points –

  • Use cast(pl.Float64) to convert an integer column (i64) to a float (f64).
  • pl.Float32 stores values in 32 bits, which halves memory use compared with Float64 but keeps only about 7 significant decimal digits.
  • The with_columns() method applies the cast and returns a new DataFrame with the converted column; the original DataFrame is not modified in place.
  • Polars supports multiple float types, including Float32 and Float64.
  • The select() method can also cast a column; it returns a new DataFrame containing only the selected columns.
  • Casting is useful when you want an explicit float dtype ahead of floating-point operations such as division, statistical calculations, or scientific computations.
  • Always choose the appropriate float type based on precision and memory constraints for efficient data processing.

Usage of Polars Cast Integer to Float

You can convert an integer column to a float using the cast() method in Polars. This is useful for making the float dtype explicit before calculations and for keeping dtypes consistent across datasets. With cast(pl.Float64), the values are stored as 64-bit floating-point numbers, so results with a fractional part (e.g., from division operations) fit in the same column.

First, let’s create a Polars DataFrame.


import polars as pl

# Sample data: one string column and two integer columns
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fees': [22000, 25000, 24000, 26000],
    'Discount': [1000, 2300, 2500, 1400]
}
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

Yields below output.

[Image: output of the original DataFrame]

You can use the with_columns() method along with the cast(pl.Float64) method to convert a single integer column to a float type in Polars.


# Convert 'Fees' column to Float
df2 = df.with_columns(df["Fees"].cast(pl.Float64))
print("After converting 'Fees' to float:\n", df2)

Here,

  • The "Fees" column was initially i64 (integer).
  • Using cast(pl.Float64), we convert it from i64 (integer) to f64 (float).
  • The "Discount" column remains unchanged (i64).
  • with_columns() returns a new DataFrame (df2) containing the modified column.

[Image: output DataFrame with "Fees" as f64 and "Discount" as i64]

Convert Multiple Integer Columns to Float

Alternatively, to convert multiple integer columns to float in Polars, pass several column names to pl.col() and apply cast(pl.Float64) inside with_columns(). Instead of listing columns by name, you can also pass a dtype: pl.col(pl.Int64) selects every Int64 column, which is especially useful for wide datasets.


# Convert multiple integer columns to float using pl.col()
# (assigned to df2 so that df keeps its integer columns for the later examples)
df2 = df.with_columns(pl.col(["Fees", "Discount"]).cast(pl.Float64))
print("Updated DataFrame:\n", df2)

# Convert multiple columns to float using DataFrame.cast() with a column-to-dtype mapping
df2 = df.cast({"Fees": pl.Float64, "Discount": pl.Float64})
print("Updated DataFrame:\n", df2)

# Output:
# Updated DataFrame:
# shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees    ┆ Discount │
│ ---     ┆ ---     ┆ ---      │
│ str     ┆ f64     ┆ f64      │
╞═════════╪═════════╪══════════╡
│ Spark   ┆ 22000.0 ┆ 1000.0   │
│ PySpark ┆ 25000.0 ┆ 2300.0   │
│ Hadoop  ┆ 24000.0 ┆ 2500.0   │
│ Pandas  ┆ 26000.0 ┆ 1400.0   │
└─────────┴─────────┴──────────┘

Here,

  • pl.col(["col1", "col2"]) selects specific columns by name.
  • pl.col(pl.Int64) selects all Int64 columns for transformation.
  • cast(pl.Float64) converts columns to 64-bit float (use Float32 for lower precision).
  • with_columns() returns a new DataFrame with the updated column types.

Using pl.Float32 for Lower Precision

You can convert a single integer column to a lower-precision float (pl.Float32) instead of the 64-bit float (pl.Float64). pl.Float32 is beneficial when you need to reduce memory usage and roughly 7 significant digits are enough; integers larger than 2^24 (16,777,216) may not be represented exactly.


# Convert the "Fees" column to Float32
df2 = df.with_columns(df["Fees"].cast(pl.Float32))
print("Updated DataFrame:\n", df2)

# Using pl.Float32 for lower precision
df2 = df.with_columns(pl.col(["Fees"]).cast(pl.Float32))
print("Updated DataFrame:\n", df2)

# Output:
# Updated DataFrame:
# shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees    ┆ Discount │
│ ---     ┆ ---     ┆ ---      │
│ str     ┆ f32     ┆ i64      │
╞═════════╪═════════╪══════════╡
│ Spark   ┆ 22000.0 ┆ 1000     │
│ PySpark ┆ 25000.0 ┆ 2300     │
│ Hadoop  ┆ 24000.0 ┆ 2500     │
│ Pandas  ┆ 26000.0 ┆ 1400     │
└─────────┴─────────┴──────────┘

Here,

  • The "Fees" column was initially i64 (integer).
  • We used cast(pl.Float32) to convert it to f32 (lower precision float).
  • The "Discount" column remains unchanged as i64.

Casting Integer to Float Using select()

You can use the select() method to cast an integer column to float in Polars. select() returns a new DataFrame containing only the columns you list, so include every column you want to keep.


# Convert "Fees" column to Float using select()
df2 = df.select([df["Courses"], df["Fees"].cast(pl.Float64), df["Discount"]])
print(df2)

# Output:
# shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees    ┆ Discount │
│ ---     ┆ ---     ┆ ---      │
│ str     ┆ f64     ┆ i64      │
╞═════════╪═════════╪══════════╡
│ Spark   ┆ 22000.0 ┆ 1000     │
│ PySpark ┆ 25000.0 ┆ 2300     │
│ Hadoop  ┆ 24000.0 ┆ 2500     │
│ Pandas  ┆ 26000.0 ┆ 1400     │
└─────────┴─────────┴──────────┘

Here,

  • select() returns only the listed columns, so all three are listed here to keep the full table.
  • df["Fees"].cast(pl.Float64) converts the "Fees" column to Float64.
  • Other columns (Courses, Discount) remain unchanged.

Similarly, you can use select() to transform and return specific columns instead of modifying the entire Polars DataFrame. This method is useful when you only need certain columns in the output.


# Use select() to cast the 'Fees' column to float
df2 = df.select([pl.col("Fees").cast(pl.Float64)])
print(df2)

# Output:
# shape: (4, 1)
┌─────────┐
│ Fees    │
│ ---     │
│ f64     │
╞═════════╡
│ 22000.0 │
│ 25000.0 │
│ 24000.0 │
│ 26000.0 │
└─────────┘

Cast Negative Integers to Float

Finally, negative integers are cast to float the same way, using cast(pl.Float64) or cast(pl.Float32); the sign and magnitude of each value are preserved. The example below uses a DataFrame whose "Fees" and "Discount" columns hold negative integers.


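# Create a DataFrame with negative integers
df = pl.DataFrame({
    "Course": ["Spark", "Hadoop", "Pandas"],
    "Fees": [-22000, -25000, -24000],
    "Discount": [-1000, -2300, -2500],
})
# Convert negative integers to float using cast()
df2 = df.with_columns([
    df["Fees"].cast(pl.Float64),
    df["Discount"].cast(pl.Float32)
])
print("After Casting:\n", df2)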

# Output:
# After Casting:
# shape: (3, 3)
┌────────┬──────────┬──────────┐
│ Course ┆ Fees     ┆ Discount │
│ ---    ┆ ---      ┆ ---      │
│ str    ┆ f64      ┆ f32      │
╞════════╪══════════╪══════════╡
│ Spark  ┆ -22000.0 ┆ -1000.0  │
│ Hadoop ┆ -25000.0 ┆ -2300.0  │
│ Pandas ┆ -24000.0 ┆ -2500.0  │
└────────┴──────────┴──────────┘

Here,

  • The DataFrame has two integer columns ("Fees", "Discount") containing negative integers.
  • cast(pl.Float64) converts the "Fees" column from i64 (integer) to f64 (float).
  • cast(pl.Float32) converts the "Discount" column from i64 (integer) to f32 (float).
  • with_columns() returns a new DataFrame with the converted columns.

Conclusion

In conclusion, converting an integer column to a float in Polars is simple and efficient using the cast() method. This conversion is useful when working with numerical data that needs fractional values, and choosing between Float64 and Float32 lets you trade precision for memory.

Happy Learning!!
