In Polars, the transpose() function is used to transpose rows into columns and columns into rows in a given DataFrame. It swaps the rows and columns of the DataFrame, so the columns become rows and vice versa.
In this article, I will explain the Polars transpose() function, its syntax and parameters, and show how it can be used to return a transposed DataFrame, where rows are converted into columns and columns into rows.
Key Points –
- The transpose() method switches the rows and columns of a Polars DataFrame, effectively rotating the data.
- By default, transpose() creates a new DataFrame with rows converted to columns and columns converted to rows.
- Set the include_header parameter to True to include the original column names as the first column of the transposed DataFrame.
- When include_header=True, the header_name parameter allows you to specify the name of the column containing the original headers.
- The column_names parameter allows you to define custom names for the new columns in the transposed DataFrame.
- The transpose() method provides flexibility to include headers, rename columns, and customize the transposed DataFrame layout.
- The method returns a new transposed DataFrame, leaving the original DataFrame unmodified.
- Transposing may result in type conversion since Polars aligns all values in a column to a uniform data type.
Syntax of Polars DataFrame transpose()
Let’s look at the syntax of the Polars DataFrame transpose() method.
# Syntax of transpose()
DataFrame.transpose(
*,
include_header: bool = False,
header_name: str = "column",
column_names: str | Iterable[str] | None = None
) → DataFrame
Parameters of the Polars DataFrame.transpose()
Following are the parameters of the transpose() method.
- include_header – (default: False)
  - If True, includes the original column names as the first column of the transposed DataFrame.
- header_name – (default: "column")
  - The name assigned to the new column that contains the original column names when include_header=True.
- column_names – (default: None)
  - Defines custom names for the columns of the transposed DataFrame.
  - It can be an iterable of strings (one name per row of the original DataFrame) or a single string naming an existing column whose values become the new column names. If not provided, Polars generates default column names (column_0, column_1, etc.).
Return Value
This function returns a transposed DataFrame, where rows become columns and columns become rows.
Polars transpose() Usage
The transpose() method in Polars is used to flip the rows and columns of a DataFrame. This means that the rows in the original DataFrame become the columns in the transposed DataFrame, and vice versa. You can also customize the resulting transposed DataFrame by specifying certain options.
To run some examples of transposing a DataFrame in Polars, let’s create a Polars DataFrame from a Python dictionary of lists.
import polars as pl

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '35days', '40days', '35days'],
    'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)
This yields a DataFrame with 5 rows and 4 columns: Courses (str), Fee (i64), Duration (str), and Discount (i64).
A basic transpose of a DataFrame involves swapping its rows and columns, typically without including the original column headers in the result.
# Transpose the DataFrame
transposed_df = df.transpose()
print("Transposed DataFrame:\n", transposed_df)
In the above example, df.transpose() switches the rows and columns of the DataFrame. By default, Polars assigns column_0, column_1, etc., as column names for the transposed DataFrame.
Adding Column Names as Headers
To keep the original column names in the transposed DataFrame, pass include_header=True to the transpose() method. The names are added as the first column, which is called column by default.
# Transpose the DataFrame with column names as headers
transposed_df = df.transpose(include_header=True)
print("Transposed DataFrame with Column Names as Headers:\n", transposed_df)
In the above example, setting include_header=True ensures that the original column names are preserved in the transposed DataFrame. Each name appears in the first column and labels the row that holds that column's values.
You can also set a custom name for the column that holds the headers by using the header_name parameter together with include_header=True.
# Transpose with Column Names as Headers
transposed_with_headers = df.transpose(include_header=True, header_name="Original_Columns")
print("Transposed DataFrame with Column Names as Headers:\n", transposed_with_headers)
# Output:
# Transposed DataFrame with Column Names as Headers:
# shape: (4, 6)
┌──────────────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ Original_Columns ┆ column_0 ┆ column_1 ┆ column_2 ┆ column_3 ┆ column_4 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ str ┆ str │
╞══════════════════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ Courses ┆ Spark ┆ PySpark ┆ Hadoop ┆ Python ┆ Pandas │
│ Fee ┆ 22000 ┆ 25000 ┆ 23000 ┆ 24000 ┆ 26000 │
│ Duration ┆ 30days ┆ 50days ┆ 35days ┆ 40days ┆ 35days │
│ Discount ┆ 1000 ┆ 2300 ┆ 1000 ┆ 1200 ┆ 2500 │
└──────────────────┴──────────┴──────────┴──────────┴──────────┴──────────┘
Transposing with Specific Column Names
To transpose a Polars DataFrame with specific column names in the transposed result, you can use the column_names parameter in the transpose() method. This allows you to specify a custom set of column names for the transposed DataFrame.
# Specify Custom Column Names for Transposed DataFrame
custom_column_names = ["Row1", "Row2", "Row3", "Row4", "Row5"]
# Transpose DataFrame with Custom Column Names
transposed_df = df.transpose(include_header=True, column_names=custom_column_names)
print("Transposed DataFrame with Custom Column Names:\n", transposed_df)
# Output:
# Transposed DataFrame with Custom Column Names:
# shape: (4, 6)
┌──────────┬────────┬─────────┬────────┬────────┬────────┐
│ column ┆ Row1 ┆ Row2 ┆ Row3 ┆ Row4 ┆ Row5 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ str ┆ str │
╞══════════╪════════╪═════════╪════════╪════════╪════════╡
│ Courses ┆ Spark ┆ PySpark ┆ Hadoop ┆ Python ┆ Pandas │
│ Fee ┆ 22000 ┆ 25000 ┆ 23000 ┆ 24000 ┆ 26000 │
│ Duration ┆ 30days ┆ 50days ┆ 35days ┆ 40days ┆ 35days │
│ Discount ┆ 1000 ┆ 2300 ┆ 1000 ┆ 1200 ┆ 2500 │
└──────────┴────────┴─────────┴────────┴────────┴────────┘
In the above example, the column_names parameter allows you to customize the column names for the transposed DataFrame. This is helpful when you want to assign more meaningful names to the transposed data. The number of custom column names must correspond to the number of rows in the original DataFrame.
Transpose DataFrame without index
Unlike pandas, a Polars DataFrame has no row index, so there is no index to exclude when transposing. With include_header=False (the default), the transpose() method returns only the values, without the original column names. The example below additionally slices off the first row of the transposed result, which holds the values of the original Courses column.
# Transpose the DataFrame without the original column names
transposed_df = df.transpose(include_header=False)
# Drop the first row (the values of the original Courses column)
transposed_df = transposed_df[1:, :]
print("Transposed DataFrame Without Index:\n", transposed_df)
# Output:
# Transposed DataFrame Without Index:
# shape: (3, 5)
┌──────────┬──────────┬──────────┬──────────┬──────────┐
│ column_0 ┆ column_1 ┆ column_2 ┆ column_3 ┆ column_4 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ str │
╞══════════╪══════════╪══════════╪══════════╪══════════╡
│ 22000 ┆ 25000 ┆ 23000 ┆ 24000 ┆ 26000 │
│ 30days ┆ 50days ┆ 35days ┆ 40days ┆ 35days │
│ 1000 ┆ 2300 ┆ 1000 ┆ 1200 ┆ 2500 │
└──────────┴──────────┴──────────┴──────────┴──────────┘
Here,
- transpose(include_header=False): This transposes the DataFrame without adding the original column names as a column.
- transposed_df[1:, :]: This removes the first row from the transposed DataFrame, which holds the values of the original Courses column.
Transpose with Headers and Custom Column Names
To transpose a Polars DataFrame with custom headers and column names, you can use the transpose() method and specify the desired options.
# Transpose with Headers and Custom Column Names
transposed_df = df.transpose(include_header=True, header_name="Technologies", column_names=["a", "b", "c", "d", "e"])
print("Transposed with Headers and Custom Column Names:\n", transposed_df)
# Output:
# Transposed with Headers and Custom Column Names:
# shape: (4, 6)
┌──────────────┬────────┬─────────┬────────┬────────┬────────┐
│ Technologies ┆ a ┆ b ┆ c ┆ d ┆ e │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str ┆ str ┆ str ┆ str │
╞══════════════╪════════╪═════════╪════════╪════════╪════════╡
│ Courses ┆ Spark ┆ PySpark ┆ Hadoop ┆ Python ┆ Pandas │
│ Fee ┆ 22000 ┆ 25000 ┆ 23000 ┆ 24000 ┆ 26000 │
│ Duration ┆ 30days ┆ 50days ┆ 35days ┆ 40days ┆ 35days │
│ Discount ┆ 1000 ┆ 2300 ┆ 1000 ┆ 1200 ┆ 2500 │
└──────────────┴────────┴─────────┴────────┴────────┴────────┘
Here,
- include_header=True: This includes the original column names as the first column in the transposed DataFrame.
- header_name: You can specify a custom name for the first column (header) in the transposed DataFrame.
- column_names=["a", "b", "c", "d", "e"]: This allows you to set custom names for the columns of the transposed DataFrame.
Transpose a Specific Column of a Polars DataFrame
Until now, we have explored how to transpose the entire DataFrame using the transpose() function. In this example, we will focus on how to transpose a specific column of a given DataFrame using the same function. Let’s take a look at how the transposition works.
# Transpose single column of DataFrame
technologies= {'Fee' :[22000,25000,23000,24000,26000]}
df = pl.DataFrame(technologies)
print(df)
print("# DataFrame After Transpose...")
transposed_df = df.transpose()
print(transposed_df)
# Output:
# shape: (5, 1)
┌───────┐
│ Fee │
│ --- │
│ i64 │
╞═══════╡
│ 22000 │
│ 25000 │
│ 23000 │
│ 24000 │
│ 26000 │
└───────┘
# DataFrame After Transpose...
# shape: (1, 5)
┌──────────┬──────────┬──────────┬──────────┬──────────┐
│ column_0 ┆ column_1 ┆ column_2 ┆ column_3 ┆ column_4 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞══════════╪══════════╪══════════╪══════════╪══════════╡
│ 22000 ┆ 25000 ┆ 23000 ┆ 24000 ┆ 26000 │
└──────────┴──────────┴──────────┴──────────┴──────────┘
Conclusion
In this article, I have explained the Polars transpose() function and demonstrated how its syntax and parameters can be used to transpose a given DataFrame in various ways. By utilizing parameters such as include_header, header_name, and column_names, you can control how the original data is represented in the transposed form, making it more flexible for analysis and easier to work with according to your needs.
Happy learning!!