In Polars, you can convert a DataFrame to a Series by selecting a specific column. Each column in a Polars DataFrame is inherently a Series. A DataFrame in Polars is a two-dimensional structure with rows and columns, similar to a table, whereas a Series is a one-dimensional structure representing a single column of data. Since a DataFrame cannot be directly converted into a Series, the typical approach is to extract an individual column from the DataFrame. In this article, I will explain how to convert a DataFrame to a Series in Polars.
Key Points –
- Use square brackets (df["column_name"]) to extract a column as a Series by its name.
- Use df.get_column("column_name") to explicitly retrieve a column as a Series.
- Use slicing syntax (df[:, index]) to extract a column by its index (position).
- Use df.select("column_name").to_series() to select a column and convert it to a Series. select() returns a DataFrame, so use to_series() if you need a Series.
- The extracted Series maintains the same data type as in the original DataFrame.
- If needed, use to_frame() to convert the Series back into a DataFrame.
- Direct column access (df["column_name"]) is the most efficient and commonly used method.
Usage of Converting a Polars DataFrame to a Series
In Polars, a DataFrame is a collection of Series (columns), and converting a specific column from a DataFrame to a Series is a common operation.
What is a Series in Polars?
A Series in Polars is a single column of data with a specific data type (e.g., integers, strings, floats). It is similar to a pandas Series or a single column in a spreadsheet.
To run the examples of converting a Polars DataFrame to a Series, let's first create a Polars DataFrame.
import polars as pl
# Sample data: one list per column
technologies = {
    'Courses': ["spark", "python", "spark", "python", "pandas"],
    'Fees': [22000, 25000, 22000, 25000, 24000],
    'Duration': ['30days', '40days', '60days', '40days', '50days'],
    'Discount': [1000, 1500, 1000, 2000, 2500]
}
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)
Yields below output.
You can convert a single column of a Polars DataFrame into a Series using square bracket notation (df["column_name"]). This method allows you to access a column by its name, returning it as a Series.
# Convert 'Courses' column to a Series
df2 = df["Courses"]
print("Courses Series:\n", df2)
Here,
- Replace column_name with the actual name of the column you want to extract.
- The result is a Series object containing the data from the specified column.
- This method is very efficient and is the recommended way to extract a column as a Series.
Convert a Column by Index
You can convert a column to a Series by referencing its index position in the DataFrame. This is useful when the column name is unknown but its position is known. Instead of using the column name, you can access and convert the column using its index.
# Convert the second column (index 1) to a Series
df2 = df[:, 1]
print("Column Series:\n", df2)
# Output:
# Column Series:
shape: (5,)
Series: 'Fees' [i64]
[
22000
25000
22000
25000
24000
]
Here,
- df[:, 1] selects the second column. Column positions are 0-based, so index 1 refers to 'Fees'.
- The result is a Series object containing the data from the specified column.
- This method is efficient and works well when you know the column's position.
Convert a Column with select() Method
You can use the select() method to choose a specific column from a Polars DataFrame and then convert it to a Series using the to_series() method. This approach is useful when you are already using select() for column selection and want to explicitly convert the result to a Series.
# Convert the 'Fees' column to a Series
# using select() function
df2 = df.select("Fees").to_series()
print("Fees Series:\n", df2)
Here,
- df.select("Fees") returns a DataFrame with one column.
- .to_series() converts that single-column DataFrame into a Series.
- The result is a Series object containing the data from the specified column.
- This method is efficient and works well when you are already using select() for other operations.
Yields the same Series as the previous example.
Convert a Column with get_column() Method
The get_column() method provides a simple way to extract a specific column from a DataFrame as a Series. It explicitly retrieves a column by name and returns it as a Series.
# Convert the 'Discount' column
# to a Series using get_column()
df2 = df.get_column("Discount")
print("Discount Series:\n", df2)
# Output:
# Discount Series:
shape: (5,)
Series: 'Discount' [i64]
[
1000
1500
1000
2000
2500
]
Here,
- get_column("ColumnName") extracts a column directly as a Series.
- This is typically faster than select() + to_series(), because no intermediate single-column DataFrame is built.
- The result is a Series object containing the data from the specified column.
- This method is efficient and explicitly designed for extracting a single column as a Series.
Convert a Column with to_series() Method
You can use the to_series() method to convert a column from a Polars DataFrame into a Series. to_series() is a DataFrame method that returns the first column by default, so it is typically called after select() has narrowed the DataFrame down to the column you want.
# Convert the 'Duration' column to a Series using to_series()
df2 = df.select("Duration").to_series()
print("Duration Series:\n", df2)
# Output:
# Duration Series:
shape: (5,)
Series: 'Duration' [str]
[
"30days"
"40days"
"60days"
"40days"
"50days"
]
Here,
- select("Duration") returns a DataFrame with one column.
- to_series() converts the single-column DataFrame into a Series.
- Use this method when you want to explicitly convert a selection to a Series.
- It is particularly useful when you are working with select() results and want to ensure the result is a Series.
Extracting and Performing Element-wise Operations
You can extract a column as a Series and apply element-wise operations using vectorized functions. These operations are performed on each element of the Series individually, allowing efficient computations.
# Extract the 'Fees' column as a Series
fees_series = df.get_column("Fees")
# Perform element-wise operation: Apply a 10% discount
df2 = fees_series * 0.9
print("Discounted Fees:\n", df2)
# Output
# Discounted Fees:
shape: (5,)
Series: 'Fees' [f64]
[
19800.0
22500.0
19800.0
22500.0
21600.0
]
Here,
- df.get_column("Fees") extracts the Fees column as a Series.
- The operation fees_series * 0.9 applies a 10% discount element-wise.
- The result is a new Series with discounted values.
Conclusion
In summary, converting a Polars DataFrame column into a Series is a simple and efficient process that enables flexible data manipulation. Whether using bracket notation (df["column_name"]), get_column(), or to_series(), each method offers an optimized approach to extracting and working with individual columns.
Happy Learning!!