Pandas Rename Column with Examples

The Pandas DataFrame.rename() method renames one or more columns (or index labels) of a DataFrame by label. Other techniques covered here rename a column by position or replace all column names at once. We often need to change column names before performing other operations on a DataFrame; in fact, rename() is one of the most searched and used methods of the Pandas DataFrame.

In this article, we explain several ways to rename a single column or multiple columns of a Pandas DataFrame, with examples that use DataFrame.rename(), DataFrame.set_axis(), DataFrame.add_prefix(), DataFrame.add_suffix() and more.


1. Quick Examples of Renaming DataFrame Column Names

If you are in a hurry, below are some of the quick examples of renaming Pandas DataFrame columns.


## Quick Examples of Renaming DataFrame Columns ##

# Assign new header by setting new column names.
df.columns=['A','B','C']

# Change column name by position. Index 2 is the 3rd column
df.columns.values[2] = "C"

# Rename Column Names using rename() method
df2 = df.rename({'a': 'A', 'b': 'B'}, axis=1)
df2 = df.rename({'a': 'A', 'b': 'B'}, axis='columns')
df2 = df.rename(columns={'a': 'A', 'b': 'B'})

# Rename columns inplace (self DataFrame)
df.rename(columns={'a': 'A', 'b': 'B'}, inplace = True)

# Rename using lambda function
df.rename(columns=lambda x: x[1:], inplace=True)

# Rename with error. When 'x' is not present, it throws a KeyError.
df.rename(columns = {'x':'X'}, errors = "raise")

Now, let’s create a DataFrame with a few Rows and Columns and execute some examples and validate the results.

2. Pandas DataFrame.rename() Syntax & Examples

Below is the syntax of the DataFrame.rename() method in Pandas.

rename() returns either a DataFrame or None. By default it returns a new DataFrame with the updated labels. When you use inplace=True, it updates the existing DataFrame (self) and returns None.


#DataFrame.rename() Syntax
DataFrame.rename(mapper=None, index=None, columns=None, axis=None, 
       copy=True, inplace=False, level=None, errors='ignore')
  • mapper – dictionary or function to rename column and index.
  • index – dictionary or function to rename index. When using with axis param, it should be (mapper, axis=0) which is equivalent to index=mapper.
  • columns – dictionary or function to rename columns. When using the axis param, it should be (mapper, axis=1), which is equivalent to columns=mapper.
  • axis – Value can be either 0 or index | 1 or columns. Applies to mapper. Default set to ‘0’.
  • copy – Also copies the underlying data. Default set to True.
  • inplace – Whether to modify the existing DataFrame instead of returning a new one. Defaults to False. When True, the copy parameter is ignored.
  • level – Used with MultiIndex to rename labels only in the specified level. Takes an integer or a level name. Default set to None.
  • errors – Takes the value raise or ignore. If ‘raise’ is used, a KeyError is raised when a dict-like mapper, index, or columns contains labels that are not present in the Index being transformed. If ‘ignore’ is used, existing keys are renamed and extra keys are ignored. Default set to ignore.
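
The relationship between mapper, axis, index, and columns described above can be sketched as follows. This is a minimal illustration on a hypothetical two-column DataFrame (the names a, b, r0, and r1 are made up for the example), not part of the article's running dataset.

```python
import pandas as pd

# Hypothetical sample frame used only for this illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r0", "r1"])

# Three equivalent ways to rename a column label.
by_keyword = df.rename(columns={"a": "A"})
by_axis_int = df.rename({"a": "A"}, axis=1)
by_axis_name = df.rename({"a": "A"}, axis="columns")

# Row labels are renamed through index= (same as mapper with axis=0).
by_index = df.rename(index={"r0": "row0"})

print(list(by_keyword.columns))
print(list(by_index.index))
```

None of these calls modifies df itself, because inplace defaults to False.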

Let's create a Pandas DataFrame from a dictionary of lists, with the column names Courses, Fee, and Duration.


import pandas as pd
technologies = ({
  'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
  'Fee' :[20000,25000,26000,22000,24000,21000,22000],
  'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df.columns)

Yields below output.


Index(['Courses', 'Fee', 'Duration'], dtype='object')

3. Rename a Single Column in Pandas DataFrame

To rename a single column of a Pandas DataFrame, pass the columns={} parameter with a dictionary that maps the old name to the new name. Note that when you use the columns parameter, you cannot also specify the axis parameter.


# Rename a Single Column 
df2=df.rename(columns = {'Courses':'Courses_List'})
print(df2.columns)

Yields below output. As you can see, it renames the column from Courses to Courses_List.


Index(['Courses_List', 'Fee', 'Duration'], dtype='object')

Alternatively, you can also write the above statement by using axis=1 or axis='columns'.


# Alternatively you can write above using axis
df2=df.rename({'Courses':'Courses_List'}, axis=1)
df2=df.rename({'Courses':'Courses_List'}, axis='columns')

In order to change columns on the existing DataFrame without copying to the new DataFrame, you have to use inplace=True.


# Replace existing DataFrame (inplace). This returns None.
df.rename({'Courses':'Courses_List'}, axis='columns', inplace=True)
print(df.columns)
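
The difference between the default behavior and inplace=True can be checked directly. The following is a small sketch on a hypothetical one-row DataFrame; the variable names renamed and result are chosen only for illustration.

```python
import pandas as pd

# Hypothetical one-row frame used only for this illustration.
df = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000]})

# Default: a new DataFrame is returned and df keeps its old names.
renamed = df.rename(columns={"Courses": "Courses_List"})
print(list(df.columns))
print(list(renamed.columns))

# inplace=True: df itself is changed and the call returns None.
result = df.rename(columns={"Fee": "Courses_Fee"}, inplace=True)
print(result)
print(list(df.columns))
```

Because the in-place call returns None, writing df = df.rename(..., inplace=True) would replace the DataFrame with None, which is a common mistake.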

4. Rename Multiple Columns in Pandas DataFrame

You can use the same approach to rename multiple or all columns of a Pandas DataFrame. All you need to do is list every column you want to rename in the dictionary mapping.


# Rename multiple columns
df.rename(columns = {'Courses':'Courses_List','Fee':'Courses_Fee', 
   'Duration':'Courses_Duration'}, inplace = True)
print(df.columns)

Yields below output. As you see it renames multiple columns.


Index(['Courses_List', 'Courses_Fee', 'Courses_Duration'], dtype='object')

5. Pandas rename column by index

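To rename a Pandas column at a specific position, you can use df.columns.values[index]='value'. The example below updates the third column (index 2, since positions start from zero). Note that this writes directly into the array behind the column Index, so it may not behave reliably across pandas versions.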


# pandas rename column by index
df.columns.values[2] = "Courses_Duration"
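
As an alternative that goes through the public rename() API, you can look up the current label at a position and rename that label. This is a sketch on a hypothetical one-row DataFrame, and it assumes the column labels are unique: if two columns share the same label, rename() changes both.

```python
import pandas as pd

# Hypothetical one-row frame used only for this illustration.
df = pd.DataFrame(
    {"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]}
)

# Find the label currently at position 2, then rename that label.
position = 2
df2 = df.rename(columns={df.columns[position]: "Courses_Duration"})
print(list(df2.columns))
```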

6. Rename All Columns by adding Suffix or Prefix to Pandas DataFrame

Sometimes you may need to add a prefix or suffix string to all column names. You can do this by looping over the column names and adding the prefix or suffix to each one.


# Rename All Column Names by adding Suffix or Prefix
df.columns = ['col_'+str(col) for col in df.columns]

You can also use pandas.DataFrame.add_prefix() and pandas.DataFrame.add_suffix() to add prefix and suffix respectively to the pandas DataFrame column names.


# Start again from the original column names
df = pd.DataFrame(technologies)

# Add prefix to the column names
df2=df.add_prefix('col_')
print(df2.columns)

# Add suffix to the column names
df2=df.add_suffix('_col')
print(df2.columns)

Yields below output.


Index(['col_Courses', 'col_Fee', 'col_Duration'], dtype='object')
Index(['Courses_col', 'Fee_col', 'Duration_col'], dtype='object')

7. Rename using Lambda Function

You can also change Pandas column names using a lambda function. The example below adds the ‘col_’ string to the start of all column names.


# Rename using Lambda function
df.rename(columns=lambda x: 'col_'+x, inplace=True)
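
A named function works the same way as a lambda and is easier to reuse. The sketch below uses a hypothetical helper called normalize and made-up column names with stray spaces; it is one possible cleanup rule, not a pandas built-in.

```python
import pandas as pd

def normalize(name):
    """Hypothetical helper: trim, lower-case, and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")

# Hypothetical frame with inconsistent column names.
df = pd.DataFrame({" Course Name": ["Spark"], "Fee ": [20000]})

# rename() calls normalize once for every column label.
df2 = df.rename(columns=normalize)
print(list(df2.columns))
```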

8. Rename or Convert All Columns to Lower or Upper Case

When column names mix lower and upper case inconsistently, it is good practice to convert all column names to either lower or upper case.


# Change to all lower case
df = pd.DataFrame(technologies)
df2=df.rename(str.lower, axis='columns')
print(df2.columns)

# Change to all upper case
df = pd.DataFrame(technologies)
df2=df.rename(str.upper, axis='columns')
print(df2.columns)

Yields below output.


Index(['courses', 'fee', 'duration'], dtype='object')
Index(['COURSES', 'FEE', 'DURATION'], dtype='object')

9. Change Column Names Using DataFrame.set_axis()

You can also change the column names by using DataFrame.set_axis(). Note that with set_axis() you need to supply all column names. It returns a DataFrame with the new set of column names, so assign the result back; the inplace argument was removed in pandas 2.0. set_axis() can also be used to rename the Index of a pandas DataFrame.


# Change column name using set_axis()
df = df.set_axis(['Courses_List', 'Course_Fee', 'Course_Duration'], axis=1)
print(df.columns)
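
Since set_axis() replaces every label on the axis, the new list must have exactly as many entries as there are columns. The following sketch, on a hypothetical one-row DataFrame and assuming a pandas version where set_axis() returns a new object, shows both the normal case and the length-mismatch error.

```python
import pandas as pd

# Hypothetical one-row frame used only for this illustration.
df = pd.DataFrame(
    {"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]}
)

# One new label per existing column; the result is a new DataFrame.
df2 = df.set_axis(["Courses_List", "Course_Fee", "Course_Duration"], axis=1)
print(list(df2.columns))

# Too few labels: pandas raises a ValueError about the length mismatch.
try:
    df.set_axis(["only_one"], axis=1)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
    print("set_axis failed:", exc)
```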

10. Rename DataFrame Column using String replace()

The str.replace() method of the column Index, called as df.columns.str.replace(), replaces a substring or a regular-expression match in every column name. For example, df.columns.str.replace("Fee","Fee_Cost") replaces the text 'Fee' with 'Fee_Cost' wherever it appears in a column name.


# Change column name using str.replace()
df.columns = df.columns.str.replace("Fee","Fee_Cost")
print(df.columns)

Yields below output.


Index(['Courses_List', 'Course_Fee_Cost', 'Course_Duration'], dtype='object')

The same call can update all column names at once, for example by replacing every underscore with a space.


# Rename all column names
df.columns = df.columns.str.replace("_"," ")
print(df.columns)

Yields below output.


Index(['Courses List', 'Course Fee Cost', 'Course Duration'], dtype='object')
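
str.replace() also accepts a regular expression when regex=True is passed. The sketch below, on a hypothetical DataFrame whose column names contain irregular whitespace, collapses each run of whitespace into a single underscore.

```python
import pandas as pd

# Hypothetical frame whose names contain one or more spaces.
df = pd.DataFrame({"Courses List": ["Spark"], "Course  Fee": [20000]})

# \s+ matches one or more whitespace characters in each column name.
df.columns = df.columns.str.replace(r"\s+", "_", regex=True)
print(list(df.columns))
```

Passing regex explicitly avoids depending on the default, which has differed between pandas versions.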

11. Raise Error when Rename Column Not Found

By default, when a label passed to rename() is not found in the Pandas DataFrame, the method silently ignores it. If you want an error to be thrown when a column is not found, use errors = "raise".


# Throw an error when the column to rename doesn't exist.
df.rename(columns = {'Cour':'Courses_List'}, errors = "raise")

This raises a KeyError with the message shown below.


raise KeyError("{} not found in axis".format(missing_labels))
KeyError: "['Cour'] not found in axis"
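
If you want to handle the missing column rather than stop the program, the KeyError can be caught. This is a minimal sketch on a hypothetical one-row DataFrame; the variable names caught and ignored are chosen only for illustration.

```python
import pandas as pd

# Hypothetical one-row frame used only for this illustration.
df = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000]})

# errors="raise": an unknown label produces a KeyError.
try:
    df.rename(columns={"Cour": "Courses_List"}, errors="raise")
    caught = None
except KeyError as exc:
    caught = exc
    print("rename failed:", exc)

# Default errors="ignore": the unknown label is skipped silently.
ignored = df.rename(columns={"Cour": "Courses_List"})
print(list(ignored.columns))
```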

12. Pandas Rename Only If Column Exists

This example changes the Courses column to Courses_List and does not apply the Fees mapping, because the DataFrame has no Fees column. The dictionary is filtered down to existing columns before rename() is called, so no error is raised even though we use errors="raise".


# Change column only if column exists.
df = pd.DataFrame(technologies)
d={'Courses':'Courses_List','Fees':'Courses_fees'}
df.rename(columns={k: v for k, v in d.items() if k in df.columns}, inplace=True,errors = "raise")
print(df.columns)
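
The same filtering idea can be wrapped in a small function that also reports which mappings were skipped. The helper name rename_existing below is hypothetical, and the sketch runs on a made-up one-row DataFrame.

```python
import pandas as pd

def rename_existing(df, mapping):
    """Hypothetical helper: rename only the keys of mapping that are columns of df."""
    present = {old: new for old, new in mapping.items() if old in df.columns}
    skipped = sorted(set(mapping) - set(present))
    # Every key in present exists, so errors="raise" cannot fire here.
    return df.rename(columns=present, errors="raise"), skipped

# Hypothetical one-row frame used only for this illustration.
df = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000]})
df2, skipped = rename_existing(
    df, {"Courses": "Courses_List", "Fees": "Courses_fees"}
)
print(list(df2.columns))
print(skipped)
```

Returning the skipped keys makes it easier to spot typos such as Fees versus Fee, which the default errors="ignore" would hide.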

Conclusion

In this article, you have learned how to rename a single column or multiple columns using the DataFrame rename() and set_axis() functions, and how to change column names by adding a prefix or suffix with the add_prefix() and add_suffix() functions. You have also learned how to rename columns using user-defined functions and how to convert column names to lower and upper case. Hope you like it !!

Happy Learning !!
