  • Post category:Polars
  • Post last modified:March 10, 2025
Reorder Columns in a Specific Order Using Polars

In Polars, you can reorder columns with the select() method by passing the column names in the desired order. Alternatively, you can index the DataFrame with df[column_order], where column_order is a list of column names in the desired order. In this article, I will explain how to reorder columns in a specific order using Polars.


Key Points –

  • Polars allows reordering columns using the select() method, where you explicitly define the desired order.
  • Column indexing (df[:, [columns]]) can also be used to manually rearrange columns in a specific order.
  • Dynamic reordering can be achieved by extracting column names using df.columns and arranging them programmatically.
  • Sorting column names alphabetically can be done using sorted(df.columns) to maintain consistency.
  • Reordering based on data types is possible by grouping columns according to their types using df.schema.
  • with_columns() adds or replaces columns but keeps existing column positions, so pair it with select() when a specific order is needed.
  • Moving a specific column to the first position can be achieved by separating it from the rest and reconstructing the column order.
  • Reordering does not modify the original DataFrame, but rather creates a new one with the specified column order.

Usage of Reorder Columns in a Specific Order

In Polars, you can reorder columns with the select() method by passing a list of column names in the preferred order. This gives you full control over the DataFrame's column arrangement.

To run the examples in this article, let's first create a Polars DataFrame.


import polars as pl

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python"],
    'Fees': [22000, 25000, 23000, 24000],
    'Discount': [1000, 2300, 1000, 1200],
    'Duration': ['35days', '60days', '30days', '45days']
}

df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

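This yields the output below.

# Output:
# Original DataFrame:
# shape: (4, 4)
┌─────────┬───────┬──────────┬──────────┐
│ Courses ┆ Fees  ┆ Discount ┆ Duration │
│ ---     ┆ ---   ┆ ---      ┆ ---      │
│ str     ┆ i64   ┆ i64      ┆ str      │
╞═════════╪═══════╪══════════╪══════════╡
│ Spark   ┆ 22000 ┆ 1000     ┆ 35days   │
│ PySpark ┆ 25000 ┆ 2300     ┆ 60days   │
│ Hadoop  ┆ 23000 ┆ 1000     ┆ 30days   │
│ Python  ┆ 24000 ┆ 1200     ┆ 45days   │
└─────────┴───────┴──────────┴──────────┘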

To reorder the columns, pass their names to select() in the order you want them to appear. The result is a DataFrame laid out exactly as listed.


# Reorder columns using select()
df2 = df.select(["Courses", "Duration", "Fees", "Discount"])
print("Reordered DataFrame:\n", df2)

Here,

  • Define the new column order, ["Courses", "Duration", "Fees", "Discount"]
  • Use the select() method to reorder the columns and print the DataFrame to view the updated arrangement.

Reordering Columns Dynamically Using select() Method

When working with Polars, you may not always know the column names in advance. Instead of manually specifying the order, you can dynamically rearrange them based on a desired logic. This is useful when dealing with large datasets or unknown column structures.


# Define dynamic column order
first_column = "Courses"
remaining_columns = [col for col in df.columns if col != first_column]

# Reorder using select()
df2 = df.select([first_column] + remaining_columns)
print("Reordered DataFrame:\n", df2)

# Output:
# Reordered DataFrame:
# shape: (4, 4)
┌─────────┬───────┬──────────┬──────────┐
│ Courses ┆ Fees  ┆ Discount ┆ Duration │
│ ---     ┆ ---   ┆ ---      ┆ ---      │
│ str     ┆ i64   ┆ i64      ┆ str      │
╞═════════╪═══════╪══════════╪══════════╡
│ Spark   ┆ 22000 ┆ 1000     ┆ 35days   │
│ PySpark ┆ 25000 ┆ 2300     ┆ 60days   │
│ Hadoop  ┆ 23000 ┆ 1000     ┆ 30days   │
│ Python  ┆ 24000 ┆ 1200     ┆ 45days   │
└─────────┴───────┴──────────┴──────────┘

Here,

  • df.columns fetches all column names. The remaining_columns list is created by excluding the "Courses" column using list comprehension.
  • select([first_column] + remaining_columns) ensures "Courses" is first.
  • This approach works dynamically, even if the column names change.

Reordering Columns Using with_columns() Method

Unlike select(), the with_columns() method does not reorder columns. It appends new columns at the end of the DataFrame and replaces existing columns in place, so passing the existing columns in a different order leaves the layout unchanged. To end up with a specific order, follow with_columns() with select().


# with_columns() replaces existing columns in place, so the order is unchanged
df2 = df.with_columns([df["Courses"], df["Duration"], df["Fees"], df["Discount"]])
print(df2.columns)

# Output:
# ['Courses', 'Fees', 'Discount', 'Duration']

# Chain select() to apply the desired order
df2 = df2.select(["Courses", "Duration", "Fees", "Discount"])
print("Reordered DataFrame:\n", df2)

# Output:
# Reordered DataFrame:
# shape: (4, 4)
┌─────────┬──────────┬───────┬──────────┐
│ Courses ┆ Duration ┆ Fees  ┆ Discount │
│ ---     ┆ ---      ┆ ---   ┆ ---      │
│ str     ┆ str      ┆ i64   ┆ i64      │
╞═════════╪══════════╪═══════╪══════════╡
│ Spark   ┆ 35days   ┆ 22000 ┆ 1000     │
│ PySpark ┆ 60days   ┆ 25000 ┆ 2300     │
│ Hadoop  ┆ 30days   ┆ 23000 ┆ 1000     │
│ Python  ┆ 45days   ┆ 24000 ┆ 1200     │
└─────────┴──────────┴───────┴──────────┘

Here,

  • with_columns() is meant for adding or modifying columns; reassigning existing columns keeps each one in its original position.
  • select() lists every column explicitly, which sets the final order.
  • The new DataFrame preserves the same data but with reordered columns.

Reordering Columns Using Column Indexing

You can also reorder columns by position: index the DataFrame with a list of column indices in the new order. This method is useful when you don't want to refer to column names explicitly.


# Reorder columns using column indexing
df2 = df[:, [0, 3, 1, 2]]  # Selecting columns using index positions
print("Reordered DataFrame:\n", df2)

Here,

  • df[:, [0, 3, 1, 2]] selects all rows, while [0, 3, 1, 2] specifies the column order by index positions. Here, index 0 corresponds to "Courses", 3 to "Duration", 1 to "Fees", and 2 to "Discount".
  • The column order is modified without using column names.

The same indexing syntax accepts column names: place a list of names in the desired order in the column position of df[:, [...]].


# Reorder columns
df2 = df[:, ['Courses', 'Duration', 'Fees', 'Discount']]
print("Reordered DataFrame:\n", df2)

# Output:
# Reordered DataFrame:
# shape: (4, 4)
┌─────────┬──────────┬───────┬──────────┐
│ Courses ┆ Duration ┆ Fees  ┆ Discount │
│ ---     ┆ ---      ┆ ---   ┆ ---      │
│ str     ┆ str      ┆ i64   ┆ i64      │
╞═════════╪══════════╪═══════╪══════════╡
│ Spark   ┆ 35days   ┆ 22000 ┆ 1000     │
│ PySpark ┆ 60days   ┆ 25000 ┆ 2300     │
│ Hadoop  ┆ 30days   ┆ 23000 ┆ 1000     │
│ Python  ┆ 45days   ┆ 24000 ┆ 1200     │
└─────────┴──────────┴───────┴──────────┘

Sorting Columns Alphabetically

If you want to sort the column names alphabetically, you can achieve this dynamically using sorted(df.columns). This is useful when working with datasets where column names may not have a predefined order.


# Sort columns alphabetically
df2 = df.select(sorted(df.columns))
print("DataFrame with Sorted Columns:\n", df2)

# Output:
# DataFrame with Sorted Columns:
# shape: (4, 4)
┌─────────┬──────────┬──────────┬───────┐
│ Courses ┆ Discount ┆ Duration ┆ Fees  │
│ ---     ┆ ---      ┆ ---      ┆ ---   │
│ str     ┆ i64      ┆ str      ┆ i64   │
╞═════════╪══════════╪══════════╪═══════╡
│ Spark   ┆ 1000     ┆ 35days   ┆ 22000 │
│ PySpark ┆ 2300     ┆ 60days   ┆ 25000 │
│ Hadoop  ┆ 1000     ┆ 30days   ┆ 23000 │
│ Python  ┆ 1200     ┆ 45days   ┆ 24000 │
└─────────┴──────────┴──────────┴───────┘

Here,

  • sorted(df.columns) returns the column names as a new list in ascending order. Python compares strings case-sensitively, so uppercase names sort before lowercase ones.
  • df.select(sorted(df.columns)) rearranges the columns based on that alphabetical order.
  • The approach works with any number of columns, making it useful for wide datasets.

Conclusion

In conclusion, reordering columns in Polars is a straightforward yet powerful operation that helps in organizing data efficiently. Whether using select(), column indexing, or dynamic sorting, Polars provides multiple ways to achieve this. Sorting columns alphabetically, reordering based on data types, or prioritizing key columns improves readability, data processing, and compatibility with external systems.

Happy Learning!!
