Pandas Rename Column with Examples

The pandas DataFrame.rename() function is used to rename a single column, multiple columns, or all columns, using a dict or a function, and optionally in place. Related approaches covered below rename columns by index position or with a list. We are often required to change the column names of a DataFrame before we perform any operations; in fact, rename() is one of the most searched and used functions in pandas.

A useful property of this function is that it can rename one specific column and leave the others unchanged.

In this pandas article, you will learn several ways to rename the columns of a DataFrame, with examples, by using functions like DataFrame.rename(), DataFrame.set_axis(), DataFrame.add_prefix(), DataFrame.add_suffix() and more.

Related: 10 Ways to Select DataFrame Rows Based on Column Values

1. Quick Examples of Renaming DataFrame Columns

If you are in a hurry, below are some of the quick examples of how to rename a column name.


## Quick Examples of Renaming DataFrame Columns ##

# Rename columns with list.
df.columns=['A','B','C']

# Rename column by position. This changes the 3rd column
df = df.rename(columns={df.columns[2]: "C"})

# Rename Column Names using rename() method
df2 = df.rename({'a': 'A', 'b': 'B'}, axis=1)
df2 = df.rename({'a': 'A', 'b': 'B'}, axis='columns')
df2 = df.rename(columns={'a': 'A', 'b': 'B'})

# Rename columns inplace (self DataFrame)
df.rename(columns={'a': 'A', 'b': 'B'}, inplace = True)

# Rename using lambda function
df.rename(columns=lambda x: x[1:], inplace=True)

# Rename with error. When column x is not present, it throws an error.
df.rename(columns = {'x':'X'}, errors = "raise")

#Rename all columns using set_axis()
df2=df.set_axis(['A','B','C'], axis=1)

Now, let’s create a DataFrame with a few Rows and Columns and execute these examples and validate the results.

2. pandas DataFrame.rename() Syntax

Following is the syntax of the pandas.DataFrame.rename() method. It returns either a DataFrame or None: by default it returns a new DataFrame with the renamed labels, and when you use inplace=True it updates the existing DataFrame in place (self) and returns None.


#DataFrame.rename() Syntax
DataFrame.rename(mapper=None, index=None, columns=None, axis=None, 
       copy=True, inplace=False, level=None, errors='ignore')

Following are the parameters.

  • mapper – dictionary or function to rename column and index.
  • index – dictionary or function to rename the index. When using the axis param, pass (mapper, axis=0), which is equivalent to index=mapper.
  • columns – dictionary or function to rename columns. When using the axis param, pass (mapper, axis=1), which is equivalent to columns=mapper.
  • axis – Either 0 or 'index', or 1 or 'columns'; selects which axis the mapper applies to. Defaults to 0.
  • copy – Also copies the underlying data. Defaults to True.
  • inplace – Whether to modify the existing DataFrame instead of returning a new one. Defaults to False. When True, the copy parameter is ignored.
  • level – Used with a MultiIndex. Takes an integer or a level name and renames labels only in that level. Defaults to None.
  • errors – Takes the values 'raise' or 'ignore'. With 'raise', a KeyError is raised when a dict-like mapper, index, or columns contains labels that are not present in the Index being transformed. With 'ignore', existing keys are renamed and extra keys are ignored. Defaults to 'ignore'.
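As a quick check of how these parameters relate, here is a minimal sketch. It assumes a small hypothetical DataFrame named sample (not the one created later in this article) and shows that passing a mapper with axis=1 gives the same result as columns=, while index= renames row labels.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sample = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# These two calls rename the same column label
by_axis = sample.rename({"a": "A"}, axis=1)
by_keyword = sample.rename(columns={"a": "A"})

# index= renames row labels instead of column labels
by_index = sample.rename(index={"x": "X"})

print(list(by_axis.columns))     # ['A', 'b']
print(list(by_keyword.columns))  # ['A', 'b']
print(list(by_index.index))      # ['X', 'y']
```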

Let’s create a pandas DataFrame from a dictionary of lists, with the column names Courses, Fee, and Duration.


import pandas as pd
technologies = ({
  'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
  'Fee' :[20000,25000,26000,22000,24000,21000,22000],
  'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df.columns)

Yields below output.


Index(['Courses', 'Fee', 'Duration'], dtype='object')

3. Rename Column Name

To rename a single column of a pandas DataFrame, use the columns={} parameter with a dictionary that maps the old name to the new name. Note that when you use the columns param, you cannot also pass the axis param.

pandas DataFrame.rename() accepts a dict (dictionary) for the columns you want to rename, so you just pass a dict of key-value pairs; the key is the existing column name and the value is your preferred new column name.


# Rename a Single Column 
df2=df.rename(columns = {'Courses':'Courses_List'})
print(df2.columns)

Yields below output. As you can see, it renames the column from Courses to Courses_List.


Index(['Courses_List', 'Fee', 'Duration'], dtype='object')

Alternatively, you can also write the above statement by using axis=1 or axis='columns'.


# Alternatively you can write above using axis
df2=df.rename({'Courses':'Courses_List'}, axis=1)
df2=df.rename({'Courses':'Courses_List'}, axis='columns')

In order to change columns on the existing DataFrame without copying to the new DataFrame, you have to use inplace=True.


# Replace existing DataFrame (inplace). This returns None.
df.rename({'Courses':'Courses_List'}, axis='columns', inplace=True)
print(df.columns)

4. Rename Multiple Columns

You can use the same approach to rename multiple columns of a pandas DataFrame. All you need to do is list every column you want to rename in the dictionary mapping.


# Rename multiple columns
df.rename(columns = {'Courses':'Courses_List','Fee':'Courses_Fee', 
   'Duration':'Courses_Duration'}, inplace = True)
print(df.columns)

Yields below output. As you see it renames multiple columns.


Index(['Courses_List', 'Courses_Fee', 'Courses_Duration'], dtype='object')
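When the old and new names already live in two lists, one option is to build the mapping with zip(). This is only a sketch; df_demo, old_names and new_names are hypothetical names introduced for the illustration.

```python
import pandas as pd

# Hypothetical one-row DataFrame with the same column names as this article
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]})

# Parallel lists of old and new names (assumed for this example)
old_names = ["Courses", "Fee"]
new_names = ["Courses_List", "Courses_Fee"]

# Pair each old name with its new name and pass the dict to rename()
mapping = dict(zip(old_names, new_names))
renamed = df_demo.rename(columns=mapping)
print(list(renamed.columns))  # ['Courses_List', 'Courses_Fee', 'Duration']
```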

5. Rename Column by index or position

You can rename a column by its index/position by looking up the current name with df.columns[index] and passing it to rename(). Index and position are used interchangeably here to mean the location of the column, so this lets you rename the first column, the last column, and so on.

As you have seen above, df.columns returns the column names as a pandas Index, and df.columns.values exposes them as a NumPy array. Assigning directly into that array with df.columns.values[index]='value' is often shown for this task, but it bypasses the Index and may leave the DataFrame's label lookups out of sync, so the example below uses rename() instead. It sets the name of the column at index 2 (the third column) to Courses_Duration. Note that index starts from zero.


# pandas rename column by index (position 2 is the third column)
df.rename(columns={df.columns[2]: "Courses_Duration"}, inplace=True)
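When several positions need new names, or when column labels are duplicated, one alternative is to copy the names into a list, change the entries by position, and assign the list back. A minimal sketch, assuming a hypothetical df_demo and a hypothetical position-to-name dict:

```python
import pandas as pd

# Hypothetical one-row DataFrame with the same column names as this article
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]})

# Hypothetical mapping of column position -> new name
new_by_position = {0: "Courses_List", 2: "Courses_Duration"}

# Copy the current names, overwrite the chosen positions, assign back
names = list(df_demo.columns)
for position, new_name in new_by_position.items():
    names[position] = new_name
df_demo.columns = names

print(list(df_demo.columns))  # ['Courses_List', 'Fee', 'Courses_Duration']
```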

6. Rename Columns with List

A Python list can be used to rename all columns in a pandas DataFrame. When you don’t want to rename a specific column, repeat its existing name in the list. The length of the list must be the same as the number of columns in the DataFrame; otherwise, an error occurs.


# Rename columns with list
column_names = ['Courses','Fee','Duration']
df.columns = column_names
print(df.columns)

Yields below output.


Index(['Courses', 'Fee', 'Duration'], dtype='object')
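To see what happens when the list length does not match, the sketch below assigns two names to a hypothetical three-column df_demo and catches the resulting ValueError.

```python
import pandas as pd

# Hypothetical one-row DataFrame with three columns
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]})

mismatch_error = None
try:
    # Only two names for three columns, so pandas rejects the assignment
    df_demo.columns = ["Courses", "Fee"]
except ValueError as exc:
    mismatch_error = exc
    print("Could not rename:", exc)

# The original column names are left untouched
print(list(df_demo.columns))  # ['Courses', 'Fee', 'Duration']
```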

7. Rename Columns in place

By default, the rename() function returns a new DataFrame after updating the column names. You can change this behaviour and rename in place by using the inplace=True param.


# Rename multiple columns in place
df.rename(columns = {'Courses':'Courses_List','Fee':'Courses_Fee', 
   'Duration':'Courses_Duration'}, inplace = True)
print(df.columns)

This renames column names on DataFrame in place and returns None type.
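Because the in-place call returns None, assigning its result back to the same variable would discard the DataFrame. The following sketch, using a hypothetical df_demo, illustrates the return value.

```python
import pandas as pd

# Hypothetical one-row DataFrame
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000]})

# With inplace=True the DataFrame is modified and the call returns None
result = df_demo.rename(columns={"Fee": "Courses_Fee"}, inplace=True)

print(result)                 # None
print(list(df_demo.columns))  # ['Courses', 'Courses_Fee']
```

For this reason, avoid writing df = df.rename(..., inplace=True), which would leave df set to None.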

8. Rename All Columns by adding Suffix or Prefix

Sometimes you may need to add a string as a suffix or prefix to all column names. You can do this by looping over the column names in a list comprehension and adding the suffix or prefix string to each one.


# Rename All Column Names by adding Suffix or Prefix
df.columns = column_names
df.columns = ['col_'+str(col) for col in df.columns]

You can also use pandas.DataFrame.add_prefix() and pandas.DataFrame.add_suffix() to add prefix and suffix respectively to the pandas DataFrame column names.


# Start again from the original column names
df = pd.DataFrame(technologies)

# Add prefix to the column names
df2=df.add_prefix('col_')
print(df2.columns)

# Add suffix to the column names
df2=df.add_suffix('_col')
print(df2.columns)

Yields below output.


Index(['col_Courses', 'col_Fee', 'col_Duration'], dtype='object')
Index(['Courses_col', 'Fee_col', 'Duration_col'], dtype='object')
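add_prefix() and add_suffix() apply to every column. If only some columns need the prefix, one possible approach is a conditional function passed to rename(). This is a sketch; df_demo and the selected set are hypothetical.

```python
import pandas as pd

# Hypothetical one-row DataFrame with the same column names as this article
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]})

# Columns that should receive the prefix (assumed for this example)
selected = {"Fee", "Duration"}

# Prefix only the selected columns and keep the others unchanged
df2 = df_demo.rename(columns=lambda name: "col_" + name if name in selected else name)
print(list(df2.columns))  # ['Courses', 'col_Fee', 'col_Duration']
```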

9. Rename Column using Lambda Function

You can also change pandas column names using a lambda function. This gives you more control because you can apply any custom logic. The example below adds the ‘col_’ string to all column names. You can also use it to remove spaces from column names, and so on.


# Rename using Lambda function
df.rename(columns=lambda x: 'col_'+x, inplace=True)
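As an illustration of removing spaces, the sketch below assumes a hypothetical df_demo whose column names contain leading, trailing, and inner spaces.

```python
import pandas as pd

# Hypothetical DataFrame with untidy column names
df_demo = pd.DataFrame({" Courses ": ["Spark"], "Course Fee": [20000]})

# Strip surrounding whitespace, then replace inner spaces with underscores
cleaned = df_demo.rename(columns=lambda x: x.strip().replace(" ", "_"))
print(list(cleaned.columns))  # ['Courses', 'Course_Fee']
```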

10. Rename or Convert All Columns to Lower or Upper Case

When column names are mixed with lower and upper case, it would be best practice to convert/update all columns names to either lower or upper case.


# Change to all lower case
df = pd.DataFrame(technologies)
df2=df.rename(str.lower, axis='columns')
print(df2.columns)

# Change to all upper case
df = pd.DataFrame(technologies)
df2=df.rename(str.upper, axis='columns')
print(df2.columns)

Yields below output.


Index(['courses', 'fee', 'duration'], dtype='object')
Index(['COURSES', 'FEE', 'DURATION'], dtype='object')
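The string accessor on df.columns offers a similar route, and any other str method can be passed to rename() in the same way. A minimal sketch with a hypothetical df_demo:

```python
import pandas as pd

# Hypothetical DataFrame with mixed-case column names
df_demo = pd.DataFrame({"courses_list": ["Spark"], "FEE": [20000]})

# Index.str.lower() returns a new Index of lower-cased names
lowered = df_demo.columns.str.lower()
print(list(lowered))  # ['courses_list', 'fee']

# str.title capitalizes the first letter of each word-like part
titled = df_demo.rename(columns=str.title)
print(list(titled.columns))  # ['Courses_List', 'Fee']
```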

11. Change Column Names Using DataFrame.set_axis()

You can also change the column names by using DataFrame.set_axis(). Note that with set_axis() you need to supply all column names. It returns a new DataFrame with the new set of column names, so assign the result back to keep the change; the inplace parameter of set_axis() was removed in pandas 2.0. set_axis() can also be used to rename the pandas DataFrame Index.


# Change column name using set_axis()
df = df.set_axis(['Courses_List', 'Course_Fee', 'Course_Duration'], axis=1)
print(df.columns)
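For the row index, the same method can be called with axis=0. The sketch below assumes a hypothetical two-row df_demo and hypothetical labels r1 and r2.

```python
import pandas as pd

# Hypothetical two-row DataFrame with the default integer index
df_demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# axis=0 targets the row index; one label is needed per row
relabeled = df_demo.set_axis(["r1", "r2"], axis=0)
print(list(relabeled.index))  # ['r1', 'r2']
```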

12. Using String replace()

The str.replace() method available on df.columns (a pandas Index) replaces a substring or a regex match in every column name. For example, df.columns.str.replace("Fee","Fee_Cost") replaces the text 'Fee' with 'Fee_Cost' wherever it occurs in a column name, so the Course_Fee column becomes Course_Fee_Cost.


# Change column name using str.replace()
df.columns = df.columns.str.replace("Fee","Fee_Cost")
print(df.columns)

Yields below output.


Index(['Courses_List', 'Course_Fee_Cost', 'Course_Duration'], dtype='object')

The same call can change every column name at once, for example by replacing each underscore with a space.


# Rename all column names
df.columns = df.columns.str.replace("_"," ")
print(df.columns)

Yields below output.


Index(['Courses List', 'Course Fee Cost', 'Course Duration'], dtype='object')
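str.replace() also accepts a regular expression when regex=True is passed. As a sketch, the example below collapses runs of whitespace into a single underscore on a hypothetical df_demo.

```python
import pandas as pd

# Hypothetical DataFrame whose column names contain one or more spaces
df_demo = pd.DataFrame({"Course  Fee": [20000], "Course Duration": ["30days"]})

# regex=True makes the pattern a regular expression; \s+ matches a run of whitespace
df_demo.columns = df_demo.columns.str.replace(r"\s+", "_", regex=True)
print(list(df_demo.columns))  # ['Course_Fee', 'Course_Duration']
```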

13. Raise Error when Column Not Found

By default, when a column label passed to rename() is not found in the pandas DataFrame, the method simply ignores it. If you want an error to be thrown when a column is not found, use errors = "raise".


# Throw an error when the column to rename doesn't exist.
df.rename(columns = {'Cour':'Courses_List'}, errors = "raise")

This raises a KeyError with the message shown below.


raise KeyError("{} not found in axis".format(missing_labels))
KeyError: "['Cour'] not found in axis"
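If you would rather handle the failure than let it stop the program, you can wrap the call in try/except. A minimal sketch with a hypothetical df_demo:

```python
import pandas as pd

# Hypothetical one-row DataFrame; it has no column named 'Cour'
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000]})

caught = None
try:
    df_demo.rename(columns={"Cour": "Courses_List"}, errors="raise")
except KeyError as exc:
    caught = exc
    print("Rename failed:", exc)

# The DataFrame keeps its original column names
print(list(df_demo.columns))  # ['Courses', 'Fee']
```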

14. Rename Only If Column Exists

This example changes the Courses column to Courses_List and doesn’t apply the Fees mapping, as we don’t have a Fees column. Note that even though the Fees column does not exist, no error is raised despite errors="raise", because the dict comprehension drops every key that is not in df.columns before rename() sees it.


# Change column only if column exists.
df = pd.DataFrame(technologies)
d={'Courses':'Courses_List','Fees':'Courses_fees'}
df.rename(columns={k: v for k, v in d.items() if k in df.columns}, inplace=True,errors = "raise")
print(df.columns)
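For comparison, the default errors="ignore" produces the same column names without the filter, so the filter matters mainly when you want errors="raise" for other reasons. A sketch under that assumption, using a hypothetical df_demo:

```python
import pandas as pd

# Hypothetical one-row DataFrame; note there is no 'Fees' column
df_demo = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]})
d = {"Courses": "Courses_List", "Fees": "Courses_fees"}

# Default behaviour: unknown keys such as 'Fees' are silently skipped
ignored = df_demo.rename(columns=d)

# Filtered mapping: only existing columns are passed, so errors="raise" never triggers
existing = {k: v for k, v in d.items() if k in df_demo.columns}
filtered = df_demo.rename(columns=existing, errors="raise")

print(list(ignored.columns))   # ['Courses_List', 'Fee', 'Duration']
print(list(filtered.columns))  # ['Courses_List', 'Fee', 'Duration']
```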

Conclusion

In this article, you have learned several ways to rename a single column, a column by index/position, multiple columns, and all columns with a list or dict, using the DataFrame rename() and set_axis() functions. You have also learned how to change column names by adding a prefix or suffix with the add_prefix() and add_suffix() functions, how to rename columns with user-defined functions, and finally how to convert column names to lower and upper case. Hope you like it !!

Happy Learning !!

NNK
