You can use DataFrame.reindex() to change the order of pandas DataFrame columns. In this article, I will explain how to change the order of DataFrame columns in pandas and how to sort columns in alphabetical order. One easy way to rearrange columns is to select them from the DataFrame in the desired order and assign the result back to the same variable or to a new one.
Key Points –
- The order of DataFrame columns can significantly impact data analysis and visualization.
- Pandas offers multiple methods to change the order of DataFrame columns, such as direct indexing and the DataFrame.reindex() method.
- Reordering columns can enhance readability and facilitate downstream data processing tasks.
- Ensure consistency in column ordering across operations and analyses to maintain clarity and reproducibility.
- Choosing a logical and intuitive order for columns can streamline data exploration and manipulation workflows.
1. Create a DataFrame with a Dictionary of Lists
Now, let’s create a DataFrame with a few rows and columns to explain changing the column order with examples. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
# Create a DataFrame with a Dictionary of Lists
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30days', '40days' ,'35days', '40days', '60days', '50days', '55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print(df)
This yields the output below.
# Output:
Courses Fee Duration Discount
0 Spark 20000 30days 1000
1 PySpark 25000 40days 2300
2 Hadoop 26000 35days 1500
3 Python 22000 40days 1200
4 pandas 24000 60days 2500
5 Oracle 21000 50days 2100
6 Java 22000 55days 2000
2. Change Order of Columns in Pandas DataFrame
You can rearrange the DataFrame columns in any order you want by passing a list of column names to df[], for example df[['Discount',"Fee","Courses","Duration"]].
# Using double brackets to change column
df = pd.DataFrame(technologies)
df2 = df[['Discount',"Fee","Courses","Duration"]]
print(df2)
This yields the output below.
# Output:
Discount Fee Courses Duration
0 1000 20000 Spark 30days
1 2300 25000 PySpark 40days
2 1500 26000 Hadoop 35days
3 1200 22000 Python 40days
4 2500 24000 pandas 60days
5 2100 21000 Oracle 50days
6 2000 22000 Java 55days
This code will output the DataFrame with the columns reordered according to your specification. You can replace the column names with the order you desire.
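As a small illustration of the bracket selection described above, the following sketch checks two behaviors worth knowing: the selection returns a new DataFrame and leaves the original column order untouched, and a label that is not a column raises a KeyError instead of being ignored. The two-row data and the 'Price' label are illustrative assumptions, not part of the article's dataset.

```python
import pandas as pd

# Cut-down stand-in for the article's DataFrame (values are illustrative)
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
})

# Selecting with a list of labels returns a new, reordered DataFrame
df2 = df[['Discount', 'Fee', 'Courses', 'Duration']]
print(list(df2.columns))  # the new order
print(list(df.columns))   # the original order is unchanged

# 'Price' is a hypothetical label that does not exist in df
try:
    df[['Discount', 'Price']]
    missing_raises = False
except KeyError as err:
    missing_raises = True
    print("KeyError:", err)
```

Because a misspelled name fails loudly, this approach is a reasonable default when every column is expected to be present.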
3. Change Columns Order Using DataFrame.reindex()
Alternatively, you can change the order of columns in a pandas DataFrame using the DataFrame.reindex() method. Call df.reindex(columns=change_column), where change_column is a list of columns in the desired order.
# Using DataFrame.reindex() to change columns order
change_column = ['Courses','Duration','Fee','Discount']
df = df.reindex(columns=change_column)
print(df)
# You can also try
df = df.reindex(['Courses','Duration','Fee','Discount'], axis=1)
print(df)
This program outputs the DataFrame with the columns reordered according to your specification using the reindex() method. You can replace the change_column list with the order you desire.
# Output:
Courses Duration Fee Discount
0 Spark 30days 20000 1000
1 PySpark 40days 25000 2300
2 Hadoop 35days 26000 1500
3 Python 40days 22000 1200
4 pandas 60days 24000 2500
5 Oracle 50days 21000 2100
6 Java 55days 22000 2000
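One difference from bracket selection is worth a quick sketch: reindex() conforms the DataFrame to exactly the labels you pass. A column you leave out is dropped, and a label that does not exist is added as a column of NaN values rather than raising an error. The small DataFrame and the 'Tutor' label below are assumptions made for illustration.

```python
import pandas as pd

# Cut-down stand-in for the article's DataFrame (values are illustrative)
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Discount': [1000, 2300],
})

# 'Fee' is omitted, so it is dropped from the result.
# 'Tutor' is a hypothetical label not present in df, so it appears filled with NaN.
out = df.reindex(columns=['Discount', 'Courses', 'Tutor'])
print(out)
```

If a typo in a column name should be an error in your workflow, bracket selection may be the safer choice; reindex() is convenient when missing columns are acceptable.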
4. Reorder DataFrame Columns in Sorted Order
You can get the pandas DataFrame column names using df.columns, sort them with the built-in sorted() function, and pass the sorted names to the DataFrame.reindex() method to get a DataFrame with its columns in sorted order.
# Change sorted order columns
df = df.reindex(sorted(df.columns), axis=1)
print(df)
# Reorder DataFrame column in sorted order
df = df.reindex(columns=sorted(df.columns))
print(df)
This yields the output below.
# Output:
Courses Discount Duration Fee
0 Spark 1000 30days 20000
1 PySpark 2300 40days 25000
2 Hadoop 1500 35days 26000
3 Python 1200 40days 22000
4 pandas 2500 60days 24000
5 Oracle 2100 50days 21000
6 Java 2000 55days 22000
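The sorted() call above compares strings case-sensitively, so uppercase names sort before lowercase ones. The sketch below, which assumes a DataFrame with deliberately mixed-case column names, shows a few variations: a case-insensitive sort, a reverse sort, and DataFrame.sort_index(axis=1) as an alternative that sorts the column labels directly.

```python
import pandas as pd

# Mixed-case column names chosen to show the effect of case on sorting
df = pd.DataFrame({
    'fee': [20000],
    'Courses': ['Spark'],
    'discount': [1000],
    'Duration': ['30days'],
})

# Default sorted(): uppercase-initial names come first
default_order = sorted(df.columns)

# Case-insensitive alphabetical order
ci = df.reindex(columns=sorted(df.columns, key=str.lower))

# Reverse alphabetical order
desc = df.reindex(columns=sorted(df.columns, reverse=True))

# sort_index(axis=1) sorts the column labels without calling sorted()
by_index = df.sort_index(axis=1)

print(default_order)
print(list(ci.columns))
print(list(desc.columns))
print(list(by_index.columns))
```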
5. Using DataFrame Constructor
You can also use pd.DataFrame(df, columns=['Courses','Discount','Duration','Fee']) to rearrange the columns of an existing DataFrame. Pass the existing DataFrame df as the data and list the columns in the desired order; the constructor returns a new DataFrame with that column order.
# Using DataFrame constructor
df = pd.DataFrame(df, columns=['Courses','Discount','Duration','Fee'])
print(df)
In our case, this yields the same output as above.
6. Pandas Reorder the Columns
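Use cols = df.columns.tolist() to get the column names as a list, rearrange the list any way you want, and then select the columns in the new order with df[cols]. For instance, cols[-1:] + cols[:-1] moves the last column to the front.
# Get the column names as a list
cols = df.columns.tolist()
# Rearrange the list any way you want; here the last column moves to the front
cols = cols[-1:] + cols[:-1]
print(cols)
# Select the columns in the new order
df2 = df[cols]
With the original column order from section 1, this yields the output below.
# Output:
['Discount', 'Courses', 'Fee', 'Duration']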
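To move a single column to a specific position, two common options can be sketched as follows. The first edits the list of names with the standard list methods pop() and insert(), then selects the columns. The second uses DataFrame.pop() together with DataFrame.insert(), which changes the DataFrame in place. The two-row data is an illustrative stand-in for the article's DataFrame.

```python
import pandas as pd

# Cut-down stand-in for the article's DataFrame (values are illustrative)
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
})

# Option 1: move 'Discount' to position 1 in the list of names, then select
cols = df.columns.tolist()
cols.insert(1, cols.pop(cols.index('Discount')))
df2 = df[cols]
print(list(df2.columns))

# Option 2: remove the column and re-insert it at position 0, modifying df in place
df.insert(0, 'Discount', df.pop('Discount'))
print(list(df.columns))
```

Option 1 leaves the original DataFrame untouched, while option 2 avoids creating a second DataFrame; which is preferable depends on whether the original order is still needed.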
7. Create New List Column in the Desired Order
You need to create a new list of your columns in the desired order, then use df[['Duration']+[col for col in df.columns if col!='Duration']] to rearrange the columns in this new order.
# Using desired order to change column
df2 = df[ ['Duration'] + [ col for col in df.columns if col != 'Duration']]
print(df2)
This yields the output below.
# Output:
Duration Courses Fee Discount
0 30days Spark 20000 1000
1 40days PySpark 25000 2300
2 35days Hadoop 26000 1500
3 40days Python 22000 1200
4 60days pandas 24000 2500
5 50days Oracle 21000 2100
6 55days Java 22000 2000
You can also use [df.columns[-2]]+[col for col in df if col!=df.columns[-2]] to build a list of column names in which the second-to-last column (index -2) comes first. Pass this list to df[...] to get the reordered DataFrame.
df2 = [df.columns[-2]] + [col for col in df if col != df.columns[-2]]
print(df2)
This yields the output below.
# Output:
['Duration', 'Courses', 'Fee', 'Discount']
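The list-comprehension pattern above generalizes to several columns. The sketch below wraps it in a small helper; move_to_front is a hypothetical name introduced here for illustration and is not a pandas function.

```python
import pandas as pd

def move_to_front(df, first):
    """Return a new DataFrame with the columns in `first` placed first.

    Hypothetical helper (not part of pandas). The remaining columns
    keep their original relative order.
    """
    rest = [col for col in df.columns if col not in first]
    return df[list(first) + rest]

# Cut-down stand-in for the article's DataFrame (values are illustrative)
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
})

df2 = move_to_front(df, ['Duration', 'Discount'])
print(list(df2.columns))
```

Because the helper relies on bracket selection, a name in `first` that is not a column raises a KeyError.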
8. Complete Example
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30days', '40days' ,'35days', '40days', '60days', '50days', '55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print(df)
# Using double brackets to change columns
df = pd.DataFrame(technologies)
df2 = df[['Discount',"Fee","Courses","Duration"]]
print(df2)
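# Build the same DataFrame from column lists using list(zip())
# (c1..c4 are the four lists from the technologies dictionary above)
c1, c2, c3, c4 = technologies.values()
df = pd.DataFrame(list(zip(c1, c2, c3, c4)))
df.columns = ["Courses","Fee","Duration","Discount"]
# Select the columns in a new order
df2 = df[["Courses","Fee","Discount","Duration"]]
print(df2)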
# Using DataFrame.reindex() to change columns order
change_column = ['Courses','Duration','Fee','Discount']
df = df.reindex(columns=change_column)
print(df)
# Change order of columns
df = df.reindex(['Courses','Duration','Fee','Discount'], axis=1)
print(df)
# Change sorted order columns
df = df.reindex(sorted(df.columns), axis=1)
print(df)
# Reorder DataFrame column in sorted order
df = df.reindex(columns=sorted(df.columns))
print(df)
# Using DataFrame constructor
df = pd.DataFrame(df, columns=['Courses','Discount','Duration','Fee'])
print(df)
# Get the column names as a list (keep df itself as a DataFrame)
cols = df.columns.tolist()
# Rearrange the list any way you want
cols = cols[-1:] + cols[:-1]
print(cols)
# Using desired order to change column
df2 = df[ ['Duration'] + [ col for col in df.columns if col != 'Duration']]
print(df2)
Frequently Asked Questions on Change the Order of DataFrame Columns
You can create a new DataFrame with the desired column order without modifying the original DataFrame. This can be achieved by assigning the reordered selection to a new variable or by using the reindex() method.
The best way to reorder columns depends on personal preference and specific requirements. Both direct indexing and the reindex() method are commonly used and efficient ways to reorder columns in a DataFrame. Choose the method that best fits your coding style and workflow.
Changing the order of columns does not affect the data itself. It only changes the way the data is presented within the DataFrame. The values in each row remain associated with their respective column labels, regardless of the column order.
You can reorder columns based on specific criteria. For example, you can use sorting functions or conditional statements to reorder columns alphabetically or based on data type. However, keep in mind that this may require additional processing steps.
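As one possible illustration of reordering by data type, the sketch below places numeric columns first and keeps the remaining columns after them. It uses DataFrame.select_dtypes() to find the numeric columns; the two-row data is an illustrative stand-in for the article's DataFrame.

```python
import pandas as pd

# Cut-down stand-in for the article's DataFrame (values are illustrative)
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
})

# Names of the numeric columns, in their original relative order
numeric_cols = df.select_dtypes(include='number').columns.tolist()

# Everything else, also in original relative order
other_cols = [col for col in df.columns if col not in numeric_cols]

# Numeric columns first, then the rest
df2 = df[numeric_cols + other_cols]
print(list(df2.columns))
```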
There is no inherent limit to the number of columns you can reorder in a DataFrame. You can reorder as many columns as needed based on your data analysis requirements. However, keep in mind memory limitations and computational efficiency when working with large datasets.
Conclusion
In this article, you have learned how to change the order of DataFrame columns in pandas using DataFrame.reindex(), the DataFrame constructor, and direct column selection. You have also learned how to sort DataFrame columns, with examples.
Happy Learning !!