In Polars, you can convert an integer column to a float type by calling cast() on the column, usually inside with_columns() so that the result replaces the original column. Floating-point columns are useful for mathematical computations, machine learning tasks, and any calculation that must keep fractional values. In this article, I will explain how to convert an integer column to float (Float64) using the cast() method in Polars.
Key Points –
- Use cast(pl.Float64) to convert an integer column (i64) to a float (f64). pl.Float32 uses 32-bit precision, making it more memory-efficient but less precise.
- The with_columns() method is used to apply the cast operation and return an updated DataFrame.
- Polars supports multiple float types, including Float32 and Float64.
- The select() method can also be used to cast a column without modifying the original DataFrame.
- Casting is useful before floating-point operations such as division, statistical calculations, or scientific computations.
- Always choose the appropriate float type based on precision and memory constraints for efficient data processing.
Usage of Polars Cast Integer to Float
You can convert an integer column to a float using the cast() method in Polars. This is useful when later calculations must keep fractional values (e.g., division operations) and when you want consistent numeric types across a dataset. Using cast(pl.Float64), you store the data as 64-bit floating-point numbers.
First, let’s create a Polars DataFrame.
import polars as pl

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fees': [22000, 25000, 24000, 26000],
    'Discount': [1000, 2300, 2500, 1400]
}
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)
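This creates a DataFrame with a string column (Courses) and two integer columns (Fees and Discount), both stored as i64.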
You can use the with_columns() method along with cast(pl.Float64) to convert a single integer column to a float type in Polars.
# Convert 'Fees' column to Float
df2 = df.with_columns(df["Fees"].cast(pl.Float64))
print("After converting 'Fees' to float:\n", df2)
Here,
- The "Fees" column was initially i64 (integer).
- Using cast(pl.Float64), we convert it from i64 (integer) to f64 (float).
- The "Discount" column remains unchanged (i64).
- with_columns() returns a new DataFrame with the modified column.
Convert Multiple Integer Columns to Float
Alternatively, to convert multiple integer columns to float in Polars, you can pass several column names to pl.col() and call cast(pl.Float64) inside with_columns(). Instead of listing columns by name, pl.col(pl.Int64) selects every Int64 column automatically, which is especially useful for wide datasets.
# Convert multiple integer columns to float by name using pl.col()
df2 = df.with_columns(pl.col(["Fees", "Discount"]).cast(pl.Float64))
print("Updated DataFrame:\n", df2)
# Convert all Int64 columns to float by data type
df2 = df.with_columns(pl.col(pl.Int64).cast(pl.Float64))
print("Updated DataFrame:\n", df2)
# Output:
# Updated DataFrame:
# shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees ┆ Discount │
│ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f64 │
╞═════════╪═════════╪══════════╡
│ Spark ┆ 22000.0 ┆ 1000.0 │
│ PySpark ┆ 25000.0 ┆ 2300.0 │
│ Hadoop ┆ 24000.0 ┆ 2500.0 │
│ Pandas ┆ 26000.0 ┆ 1400.0 │
└─────────┴─────────┴──────────┘
Here,
- pl.col(["col1", "col2"]) selects specific columns by name.
- pl.col(pl.Int64) selects all Int64 columns for transformation.
- cast(pl.Float64) converts columns to 64-bit float (use Float32 for lower precision).
- with_columns() returns a DataFrame with the updated column types.
Using pl.Float32 for Lower Precision
You can convert a single integer column to a lower-precision float (Float32) instead of the 64-bit float (pl.Float64). Using pl.Float32 is beneficial when you need to reduce memory usage and can accept roughly 7 significant decimal digits of precision instead of about 15 to 16.
# Convert the "Fees" column to Float32
df2 = df.with_columns(df["Fees"].cast(pl.Float32))
print("Updated DataFrame:\n", df2)
# Using pl.Float32 for lower precision
df2 = df.with_columns(pl.col(["Fees"]).cast(pl.Float32))
print("Updated DataFrame:\n", df2)
# Output:
# Updated DataFrame:
# shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees ┆ Discount │
│ --- ┆ --- ┆ --- │
│ str ┆ f32 ┆ i64 │
╞═════════╪═════════╪══════════╡
│ Spark ┆ 22000.0 ┆ 1000 │
│ PySpark ┆ 25000.0 ┆ 2300 │
│ Hadoop ┆ 24000.0 ┆ 2500 │
│ Pandas ┆ 26000.0 ┆ 1400 │
└─────────┴─────────┴──────────┘
Here,
- The "Fees" column was initially i64 (integer).
- We used cast(pl.Float32) to convert it to f32 (lower precision float).
- The "Discount" column remains unchanged as i64.
Casting Integer to Float Using select()
You can use the select() method to cast an integer column to float in Polars. select() returns a new DataFrame containing only the columns you pass to it, so listing every column keeps the full DataFrame structure intact.
# Convert "Fees" column to Float using select()
df2 = df.select([df["Courses"], df["Fees"].cast(pl.Float64), df["Discount"]])
print(df2)
# Output:
# shape: (4, 3)
┌─────────┬─────────┬──────────┐
│ Courses ┆ Fees ┆ Discount │
│ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ i64 │
╞═════════╪═════════╪══════════╡
│ Spark ┆ 22000.0 ┆ 1000 │
│ PySpark ┆ 25000.0 ┆ 2300 │
│ Hadoop ┆ 24000.0 ┆ 2500 │
│ Pandas ┆ 26000.0 ┆ 1400 │
└─────────┴─────────┴──────────┘
Here,
- select() allows column-wise transformations while keeping the DataFrame structure.
- df["Fees"].cast(pl.Float64) converts the "Fees" column to Float64.
- Other columns (Courses, Discount) remain unchanged.
Similarly, you can use select() to transform and return specific columns instead of the entire Polars DataFrame. This method is useful when you only need certain columns in the output.
# Use select() to cast the 'Fees' column to float
df2 = df.select([pl.col("Fees").cast(pl.Float64)])
print(df2)
# Output:
# shape: (4, 1)
┌─────────┐
│ Fees │
│ --- │
│ f64 │
╞═════════╡
│ 22000.0 │
│ 25000.0 │
│ 24000.0 │
│ 26000.0 │
└─────────┘
Cast Negative Integers to Float
Finally, to convert a column of negative integers to float in Polars, you can use the same cast(pl.Float64) method; the sign and magnitude of each value are preserved. The example below casts one column to Float64 and another to Float32.
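# Create a DataFrame with negative integers
df = pl.DataFrame({
    "Course": ["Spark", "Hadoop", "Pandas"],
    "Fees": [-22000, -25000, -24000],
    "Discount": [-1000, -2300, -2500]
})
# Convert negative integers to float using cast()
df2 = df.with_columns([
    df["Fees"].cast(pl.Float64),
    df["Discount"].cast(pl.Float32)
])
print("After Casting:\n", df2)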
# Output:
# After Casting:
# shape: (3, 3)
┌────────┬──────────┬──────────┐
│ Course ┆ Fees ┆ Discount │
│ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f32 │
╞════════╪══════════╪══════════╡
│ Spark ┆ -22000.0 ┆ -1000.0 │
│ Hadoop ┆ -25000.0 ┆ -2300.0 │
│ Pandas ┆ -24000.0 ┆ -2500.0 │
└────────┴──────────┴──────────┘
Here,
- The DataFrame has two columns ("Fees", "Discount") containing negative integers.
- cast(pl.Float64) converts the "Fees" column from i64 (integer) to f64 (float).
- cast(pl.Float32) converts the "Discount" column from i64 (integer) to f32 (float).
- with_columns() returns a DataFrame with the converted columns.
Conclusion
In conclusion, converting an integer column to a float in Polars is simple and efficient using the cast() method. This conversion is useful when numerical data needs to hold fractional values, and choosing between Float32 and Float64 lets you trade memory for precision.
Happy Learning!!