  • Post category:Pandas
  • Post last modified:March 27, 2024
How to Change Column Name in Pandas

You can change the column names of a Pandas DataFrame by using the DataFrame.rename() method or by assigning to the DataFrame.columns attribute. In this article, I will explain how to change a given column name of a Pandas DataFrame with examples.

  • Use the pandas DataFrame.rename() function to modify specific column names.
  • Set the DataFrame columns attribute to your new list of column names.

Key Points –

  • Use the rename() function in pandas to change column names.
  • Specify the old column name and the desired new column name within the columns parameter of the rename() function.
  • To apply changes directly to the original DataFrame, set the inplace parameter to True.
  • Column renaming is useful for improving clarity, consistency, and relevance of column names in your DataFrame.
  • Renaming columns facilitates better data manipulation, analysis, and visualization by providing more descriptive column labels.

1. Quick Examples of Change Column Name

If you are in a hurry, below are some quick examples of changing specific column names on a DataFrame.


# Quick examples of change column name

# Syntax to change column name using rename() function.
df.rename(columns={"OldName":"NewName"})

# Using rename() function.
df.rename(columns = {'Fee': 'Fees'}, inplace = True)  

# Renaming Multiple columns.
df.rename({'Courses': 'Course_Name','Fee': 'CourseFee', 'Duration': 'CourseDuration'},
          axis = "columns", inplace = True)

# Changing Column Attribute (replace the first name, keep the rest).
df.columns = ['Course', *df.columns[1:]]

# Errors parameter to 'raise' when column not present.
df2 = df.rename(columns={'Courses': 'EmpCourses'},errors='raise') 

Now, let's create a Pandas DataFrame with a few rows and columns, run some examples, and validate the results. Our DataFrame contains the column names Courses, Fee, and Duration.


# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
    'Courses':["Spark","PySpark","Spark","Python","PySpark"],
    'Fee' :[22000,25000,23000,24000,26000],
    'Duration':['30days','50days','30days','35days','60days']
          }
df = pd.DataFrame(technologies)
print(df)

Yields below output.


# Output:
   Courses    Fee Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2    Spark  23000   30days
3   Python  24000   35days
4  PySpark  26000   60days

2. Using DataFrame.rename() Method

The pandas DataFrame.rename() function is quite versatile: it renames not only column names but also row index labels. A good thing about this function is that you can rename specific columns without touching the rest. The syntax to change column names using rename() is shown below.


# Syntax to change column name using rename() function.
df.rename(columns={"OldName":"NewName"})

The rename() function returns a new DataFrame with renamed axis labels (i.e., the renamed columns or rows, depending on usage). To modify the DataFrame in place, set the inplace argument to True.


# Using rename() function
df.rename(columns = {'Fee': 'Fees'}, inplace = True)       
print(df)

In this example, the column ‘Fee’ is renamed to ‘Fees’ using the rename() function with the columns parameter specifying the mapping of old column names to new column names. Setting inplace=True ensures that the changes are made to the original DataFrame rather than creating a new one. This example yields the below output.


# Output:
   Courses   Fees Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2    Spark  23000   30days
3   Python  24000   35days
4  PySpark  26000   60days
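
As a minimal sketch of the points above, the snippet below uses a small hypothetical DataFrame named courses (not the df from this article) to show that rename() returns a new DataFrame by default, that several columns can be renamed with one dictionary, and that a function can be passed instead of a dictionary.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
courses = pd.DataFrame({
    'Courses': ['Spark', 'PySpark'],
    'Fee': [22000, 25000],
    'Duration': ['30days', '50days'],
})

# Without inplace=True, rename() returns a new DataFrame
# and leaves the original column names untouched.
renamed = courses.rename(columns={'Fee': 'Fees'})
print(list(courses.columns))
print(list(renamed.columns))

# Rename several columns at once with a single dictionary.
multi = courses.rename(columns={'Courses': 'Course_Name', 'Fee': 'CourseFee'})
print(list(multi.columns))

# Pass a function to transform every column name.
upper = courses.rename(columns=str.upper)
print(list(upper.columns))
```

Assigning the result to a new variable, as done here, is an alternative to inplace=True that keeps the original DataFrame available.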

3. Changing Column by DataFrame.columns Attribute

Similarly, you can change a column name through the DataFrame's columns attribute. Copy the current names into a list, change the entry at the position of the column you want to rename, and assign the list back. Avoid writing to df.columns.values[0] directly: that modifies the Index's underlying array behind pandas' back, so later lookups by the new name can fail.


# Changing Column Attribute.
cols = df.columns.tolist()
cols[0] = 'Course'
df.columns = cols
print(df)

Here,

  • df.columns.tolist() copies the column names of the DataFrame into a Python list.
  • cols[0] = 'Course' replaces the first element of the list, which is the name of the first column.
  • df.columns = cols assigns the updated list back as the column names. This example yields the below output.

# Output:
    Course   Fees Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2    Spark  23000   30days
3   Python  24000   35days
4  PySpark  26000   60days
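
If you know a column's position but want to keep using rename(), one option is to look up the current label through the columns attribute and pass it as the dictionary key. This is a minimal sketch, assuming a small hypothetical DataFrame named courses:

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
courses = pd.DataFrame({
    'Courses': ['Spark'],
    'Fee': [22000],
    'Duration': ['30days'],
})

# Look up the current label by position, then rename it by label.
first_label = courses.columns[0]
by_position = courses.rename(columns={first_label: 'Course'})
print(list(by_position.columns))
```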

4. Update All Column Names

To rename every column at once, assign a list of new column names to the DataFrame.columns attribute. The length of the list must be the same as the number of columns in the DataFrame; otherwise, pandas raises a ValueError.

Here, technologies is the dictionary used to create the DataFrame earlier. Converting it to a list gives its keys (Courses, Fee, and Duration), so assigning that list to DataFrame.columns restores the original column names.


# Using new list of column names
df.columns = list(technologies)
print(df)

This code updates all column names in the DataFrame to the keys of the technologies dictionary. Make sure that the number of names matches the number of columns in the DataFrame to avoid errors. This example yields the below output.


# Output:
   Courses    Fee Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2    Spark  23000   30days
3   Python  24000   35days
4  PySpark  26000   60days
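
The length requirement can be sketched as follows, using a hypothetical DataFrame named courses and a hypothetical list new_names. The last lines use set_axis(), which is an alternative not covered elsewhere in this article: it returns a relabeled copy instead of modifying the DataFrame.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
courses = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
new_names = ['Courses', 'Fee', 'Duration']

# One name per column: the assignment succeeds.
courses.columns = new_names
print(list(courses.columns))

# Too few names: pandas raises a ValueError.
mismatch_error = None
try:
    courses.columns = ['Courses', 'Fee']
except ValueError as exc:
    mismatch_error = exc
    print('ValueError:', exc)

# set_axis() returns a new DataFrame with the given column labels.
relabeled = courses.set_axis(['c1', 'c2', 'c3'], axis='columns')
print(list(relabeled.columns))
```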

5. Using DataFrame.columns.str.replace() Method

If a DataFrame has a large number of columns, say close to 100, and you want to replace the spaces in all column names with underscores, writing out a list or dictionary of every name is impractical. In that case, apply str.replace() to the column Index as shown below.


# Using DataFrame.columns.str.replace() Method.
df2 = df.columns.str.replace(' ', '_')
print(df2)

Note that str.replace() returns a new Index and does not modify df; assign the result back to df.columns to apply it. Since the column names in this DataFrame contain no spaces, the names are unchanged. Yields below output.


# Output:
Index(['Courses', 'Fee', 'Duration'], dtype='object')
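
To see the replacement take effect, here is a minimal sketch that assumes a hypothetical DataFrame named messy whose column names actually contain spaces, and assigns the result back to the columns attribute:

```python
import pandas as pd

# Hypothetical DataFrame whose column names contain spaces.
messy = pd.DataFrame({
    'Course Name': ['Spark'],
    'Course Fee': [22000],
    'Duration': ['30days'],
})

# str.replace() returns a new Index; assign it back to rename the columns.
messy.columns = messy.columns.str.replace(' ', '_')
print(list(messy.columns))
```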

6. Raise Error When Column Does Not Exist

When the column you want to rename does not exist, no error is raised by default, because the errors parameter defaults to 'ignore'. Set errors='raise' to raise an error instead.


# Errors parameter to 'raise'.
df2 = df.rename(columns={'Courses': 'EmpCourses'},errors='raise') 
print(df2)

This example renames the 'Courses' column to 'EmpCourses' using the rename() function with errors='raise', which raises a KeyError if a specified column name ('Courses' here) does not exist in the DataFrame. Since 'Courses' exists, no error is raised, and df2 is the same DataFrame with its first column named EmpCourses.
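
The difference between the two settings can be sketched with a column name that is missing. The DataFrame courses and the names Discount and Rebate are hypothetical and used only for this illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
courses = pd.DataFrame({'Courses': ['Spark'], 'Fee': [22000]})

# Default (errors='ignore'): the missing key is silently skipped.
unchanged = courses.rename(columns={'Discount': 'Rebate'})
print(list(unchanged.columns))

# errors='raise': the missing key triggers a KeyError.
raised = False
try:
    courses.rename(columns={'Discount': 'Rebate'}, errors='raise')
except KeyError as exc:
    raised = True
    print('KeyError:', exc)
```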

7. Complete Examples of Change Column Name of Pandas


# Below are complete examples.
# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
    'Courses':["Spark","PySpark","Spark","Python","PySpark"],
    'Fee' :[22000,25000,23000,24000,26000],
    'Duration':['30days','50days','30days','35days','60days']
          }
df = pd.DataFrame(technologies)
print(df)

# Syntax to change column name using rename() function.
df.rename(columns={"OldName":"NewName"})
print(df)

# Using rename() function.
df.rename(columns = {'Fee': 'Fees'}, inplace = True)  
print(df)

# Renaming Multiple columns ('Fee' was already renamed to 'Fees' above).
df.rename({'Courses': 'Course_Name','Fees': 'CourseFee', 'Duration': 'CourseDuration'},
          axis = "columns", inplace = True)
print(df)

# Change column name using rename() and lambda function.
df2 = df.rename(columns = lambda x: x+':')
print(df2)

# Changing Column Attribute.
cols = df.columns.tolist()
cols[0] = 'Course'
df.columns = cols
print(df)

# Using new list of column names
df.columns = list(technologies)
print(df)

# Using DataFrame.columns.str.replace() Method.
df2 = df.columns.str.replace(' ', '_')
print(df2)

# NO Error is raised.
df2 = df.rename(columns={'Courses': 'EmpCourses'})
print(df2)

# Errors parameter to 'raise'.
df2 = df.rename(columns={'Courses': 'EmpCourses'},errors='raise') 
print(df2)

Frequently Asked Questions on Change Column Name in Pandas

How do I change the name of a single column in a DataFrame?

You can change the name of a single column in a DataFrame using the rename() function or by directly assigning the new name to the column using the DataFrame.columns attribute.

Can I rename multiple columns at once?

You can rename multiple columns at once in pandas using the rename() function. You can provide a dictionary where the keys are the old column names and the values are the new column names.

What happens if I try to rename a column that doesn’t exist in the DataFrame?

If you try to rename a column that does not exist in the DataFrame, rename() silently ignores it by default (errors='ignore') and returns the DataFrame with its columns unchanged. Pass errors='raise' to get a KeyError instead, which you can handle with a try-except block.

Is there a way to change all column names in a DataFrame?

You can change all column names in a DataFrame by assigning a new list of column names to the DataFrame.columns attribute.

Is there a performance difference between different methods of changing column names?

In general, the performance difference between the different methods of changing column names in pandas is minimal, since renaming only changes the column labels. Note that rename() without inplace=True returns a new DataFrame, whereas assigning to DataFrame.columns relabels the existing one.

Does changing column names affect the underlying data in the DataFrame?

Changing column names in pandas does not affect the underlying data in the DataFrame. It only changes the labels used to access the columns. The actual data in the DataFrame remains unchanged.
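
A quick way to check this is to compare the values before and after a rename. This is a minimal sketch on a hypothetical DataFrame named courses:

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
courses = pd.DataFrame({'Courses': ['Spark', 'PySpark'], 'Fee': [22000, 25000]})
renamed = courses.rename(columns={'Fee': 'Fees'})

# The values and shape are identical; only the label differs.
same_values = renamed['Fees'].tolist() == courses['Fee'].tolist()
same_shape = renamed.shape == courses.shape
print(same_values, same_shape)
```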

Conclusion

In this article, you have learned how to change specific column names of a Pandas DataFrame by using the DataFrame.rename() method and the DataFrame.columns attribute, with some examples.
