You can change a column name of a Pandas DataFrame by using the DataFrame.rename() method or by assigning to the DataFrame.columns attribute. In this article, I will explain how to change a given column name of a Pandas DataFrame, with examples.
- Use the pandas DataFrame.rename() function to modify specific column names.
- Set the DataFrame columns attribute to your new list of column names.
Key Points –
- Use the rename() function in pandas to change column names.
- Specify the old column name and the desired new column name within the columns parameter of the rename() function.
- To apply changes directly to the original DataFrame, set the inplace parameter to True.
- Column renaming is useful for improving clarity, consistency, and relevance of column names in your DataFrame.
- Renaming columns facilitates better data manipulation, analysis, and visualization by providing more descriptive column labels.
1. Quick Examples of Change Column Name
If you are in a hurry, below are some quick examples of changing specific column names on a DataFrame.
# Quick examples of change column name
# Syntax to change column name using rename() function.
df.rename(columns={"OldName":"NewName"})
# Using rename() function.
df.rename(columns = {'Fee': 'Fees'}, inplace = True)
# Renaming Multiple columns.
df.rename({'Courses': 'Course_ Name','Fee': 'CourseFee', 'Duration': 'CourseDuration'},
axis = "columns", inplace = True)
# Changing Column Attribute.
df.columns.values[0] = 'Course'
# Errors parameter to 'raise' when column not present.
df2 = df.rename(columns={'Courses': 'EmpCourses'},errors='raise')
Now, let’s create a Pandas DataFrame with a few rows and columns, run some examples, and validate the results. Our DataFrame contains the column names Courses, Fee, and Duration.
# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Spark","Python","PySpark"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days','35days','60days']
}
df = pd.DataFrame(technologies)
print(df)
This yields the output below.
# Output:
Courses Fee Duration
0 Spark 22000 30days
1 PySpark 25000 50days
2 Spark 23000 30days
3 Python 24000 35days
4 PySpark 26000 60days
2. Using DataFrame.rename() Method
The pandas DataFrame.rename() function is versatile: it renames not only column names but also row index labels. A useful property is that you can rename specific columns and leave the rest untouched. The syntax for changing column names with rename() is as follows.
# Syntax to change column name using rename() function.
df.rename(columns={"OldName":"NewName"})
The rename() function returns a new DataFrame with renamed axis labels (i.e. the renamed columns or rows, depending on usage). To modify the DataFrame in place, set the argument inplace to True.
# Using rename() function
df.rename(columns = {'Fee': 'Fees'}, inplace = True)
print(df)
In this example, the column ‘Fee’ is renamed to ‘Fees’ using the rename() function, with the columns parameter specifying the mapping of old column names to new column names. Setting inplace=True ensures that the changes are made to the original DataFrame rather than creating a new one. This example yields the below output.
# Output:
Courses Fees Duration
0 Spark 22000 30days
1 PySpark 25000 50days
2 Spark 23000 30days
3 Python 24000 35days
4 PySpark 26000 60days
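The difference between the returned copy and an in-place change can be seen in a short sketch. The DataFrame below is a trimmed version of the article's example, and the variable names renamed and upper are illustrative only. It also shows that the columns parameter accepts a mapping with several entries or a callable.

```python
import pandas as pd

# Small illustrative DataFrame (values borrowed from the article's example).
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [22000, 25000],
    'Duration': ['30days', '50days'],
})

# Without inplace=True, rename() returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns={'Fee': 'Fees', 'Duration': 'CourseDuration'})
print(list(df.columns))       # ['Courses', 'Fee', 'Duration']
print(list(renamed.columns))  # ['Courses', 'Fees', 'CourseDuration']

# A callable is applied to every column label.
upper = df.rename(columns=str.upper)
print(list(upper.columns))    # ['COURSES', 'FEE', 'DURATION']
```

Assigning the result to a new variable keeps the original DataFrame available, which can make it easier to compare the data before and after the rename.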
3. Changing Column by DataFrame.columns Attribute
Similarly, you can update a DataFrame column name through its columns attribute. Access the position (index) of the column whose name you want to change.
# Changing Column Attribute.
df.columns.values[0] = 'Course'
print(df)
Here, df.columns.values retrieves the array of column names from the DataFrame. df.columns.values[0] accesses the first element of the array, representing the name of the first column. 'Course' is assigned to replace the existing name of the first column. This example yields the below output.
# Output:
Course Fee Duration
0 Spark 22000 30days
1 PySpark 25000 50days
2 Spark 23000 30days
3 Python 24000 35days
4 PySpark 26000 60days
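Writing into df.columns.values changes the array behind the column Index, which pandas treats as immutable, so the result may not be consistent across pandas versions. Below is a minimal sketch of two alternatives that rename a column by position through public APIs. The names df1, df2, and new_names are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [22000, 25000],
    'Duration': ['30days', '50days'],
})

# Option 1: look up the current label by position and pass it to rename().
df1 = df.rename(columns={df.columns[0]: 'Course'})
print(list(df1.columns))  # ['Course', 'Fee', 'Duration']

# Option 2: build a new list of labels and assign it back in one step.
new_names = list(df.columns)
new_names[0] = 'Course'
df2 = df.copy()
df2.columns = new_names
print(list(df2.columns))  # ['Course', 'Fee', 'Duration']
```

Both options replace the whole label or the whole Index rather than editing it element by element.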
4. Update All Column Names
To rename every column at once, assign a list of new names to the columns attribute. The length of the list must be the same as the number of columns in the DataFrame; otherwise, an error occurs.
In this example, the names come from technologies, the dictionary used to create the DataFrame. Converting it to a list yields its keys, which you can assign to the DataFrame.columns attribute to update all column names in your DataFrame.
# Using new list of column names
df.columns = list(technologies)
print(df)
This code updates all column names in the DataFrame to the keys of the technologies dictionary, which restores the original names Courses, Fee, and Duration. This example yields the below output.
# Output:
Courses Fee Duration
0 Spark 22000 30days
1 PySpark 25000 50days
2 Spark 23000 30days
3 Python 24000 35days
4 PySpark 26000 60days
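The length requirement can be checked with a small sketch. It assumes a three-column DataFrame like the one in this article and deliberately supplies only two names; error_message is an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark"],
    'Fee': [22000],
    'Duration': ['30days'],
})

error_message = None
try:
    # Only two names are supplied for three columns, so pandas rejects the assignment.
    df.columns = ['Course', 'Fees']
except ValueError as err:
    error_message = str(err)
    print("Rename failed:", error_message)

# The failed assignment leaves the original labels in place.
print(list(df.columns))  # ['Courses', 'Fee', 'Duration']
```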
5. Using DataFrame.columns.str.replace() Method
If the number of columns in the Pandas DataFrame is huge, say nearly 100, and you want to replace the spaces in all the column names (where they exist) with underscores, it is not easy to provide a list or dictionary to rename all the columns. In that case, use the following method. Note that str.replace() returns a new Index and does not modify the DataFrame until you assign the result back to df.columns.
# Using DataFrame.columns.str.replace() Method.
df2 = df.columns.str.replace(' ', '_')
print(df2)
This yields the output below.
# Output:
Index(['Courses', 'Fee', 'Duration'], dtype='object')
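The example above has no spaces in its column names, so the effect is not visible. Here is a minimal sketch with hypothetical column names that do contain spaces, assigning the result back to df.columns so the DataFrame itself is updated.

```python
import pandas as pd

# Hypothetical column names containing spaces, used only for illustration.
df = pd.DataFrame({
    'Course Name': ["Spark", "PySpark"],
    'Course Fee': [22000, 25000],
    'Duration': ['30days', '50days'],
})

# str.replace() returns a new Index; assigning it back updates the DataFrame.
# regex=False treats the space as a literal character.
df.columns = df.columns.str.replace(' ', '_', regex=False)
print(list(df.columns))  # ['Course_Name', 'Course_Fee', 'Duration']
```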
6. Raise Error When Column Does Not Exist
When the column you want to change doesn’t exist, no error is raised by default. Set the errors parameter to 'raise' to raise an error.
# Errors parameter to 'raise'.
df2 = df.rename(columns={'Courses': 'EmpCourses'},errors='raise')
print(df2)
This example renames the ‘Courses’ column of df to ‘EmpCourses’ using the rename() function with the errors='raise' parameter, which raises a KeyError if the specified column name (‘Courses’) doesn’t exist in the DataFrame. Because ‘Courses’ exists here, the call succeeds and df2 is a new DataFrame with that column renamed.
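To see both behaviors side by side, the sketch below tries to rename a column that is not in the DataFrame. The column name Discount and the variables same and raised are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [22000, 25000]})

# Default behavior (errors='ignore'): the unknown label is skipped silently.
same = df.rename(columns={'Discount': 'Rebate'})
print(list(same.columns))  # ['Courses', 'Fee']

# With errors='raise', the same call raises a KeyError instead.
raised = False
try:
    df.rename(columns={'Discount': 'Rebate'}, errors='raise')
except KeyError as err:
    raised = True
    print("KeyError:", err)
```

Using errors='raise' can help catch typos in column names early instead of letting a rename silently do nothing.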
7. Complete Examples of Change Column Name of Pandas
# Below are complete examples.
# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Spark","Python","PySpark"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days','35days','60days']
}
df = pd.DataFrame(technologies)
print(df)
# Syntax to change column name using rename() function.
df.rename(columns={"OldName":"NewName"})
print(df)
# Using rename() function.
df.rename(columns = {'Fee': 'Fees'}, inplace = True)
print(df)
# Renaming Multiple columns.
df.rename({'Courses': 'Course_ Name','Fee': 'CourseFee', 'Duration': 'CourseDuration'},
axis = "columns", inplace = True)
print(df)
# Change column name using rename() and lambda function.
df2 = df.rename(columns = lambda x: x+':')
print(df2)
# Changing Column Attribute.
df.columns.values[0] = 'Course'
print(df)
# Using new list of column names
df.columns = technologies
print(df)
# Using DataFrame.columns.str.replace() Method.
df2 = df.columns.str.replace(' ', '_')
print(df2)
# NO Error is raised.
df2 = df.rename(columns={'Courses': 'EmpCourses'})
print(df2)
# Errors parameter to 'raise'.
df2 = df.rename(columns={'Courses': 'EmpCourses'},errors='raise')
print(df2)
Frequently Asked Questions on Change Column Name in Pandas
You can change the name of a single column in a DataFrame using the rename() function or by assigning new names through the DataFrame.columns attribute.
You can rename multiple columns at once in pandas using the rename() function. Provide a dictionary where the keys are the old column names and the values are the new column names.
If you try to rename a column that doesn’t exist in the DataFrame, pandas silently ignores it by default. To get a KeyError instead, pass errors='raise' to rename(); you can then handle the error with a try-except block.
You can change all column names in a DataFrame by assigning a new list of column names to the DataFrame.columns attribute.
In general, the performance difference between the different methods of changing column names in pandas is minimal for typical datasets, so choose the method that reads most clearly for your use case.
Changing column names in pandas does not affect the underlying data in the DataFrame. It only changes the labels used to access the columns.
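As a quick check of that last point, the following sketch compares a column's values before and after a rename. The DataFrame is a trimmed version of the article's example, and the variable renamed is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["Spark", "PySpark"], 'Fee': [22000, 25000]})
renamed = df.rename(columns={'Fee': 'Fees'})

# The values are identical; only the label used to reach them differs.
print(renamed['Fees'].tolist() == df['Fee'].tolist())  # True
print(renamed.shape == df.shape)                       # True
```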
Conclusion
In this article, you have learned how to change a specific column name of a Pandas DataFrame by using the DataFrame.rename() method and the DataFrame.columns attribute, with some examples.