pandas DataFrame replace() – by Examples

  • Post category:Pandas / Python
  • Post last modified:February 6, 2022

The pandas.DataFrame.replace() function replaces values in a DataFrame with other values, across all columns by default. It takes to_replace, value, inplace, limit, regex, and method as parameters and returns a new DataFrame. When inplace=True is used, it updates the existing DataFrame object and returns None.

The value to replace can be given as a str, regex, list, dict, Series, int, or float. In this article, I will explain the syntax of the pandas replace() method and its usage with examples.

It is one of the most useful and powerful pandas functions because it can also match values with a regex (regular expression).

1. replace() Syntax

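Below is the syntax of the replace() method. With regex=True, it can also replace a substring in a column. The limit and method parameters are deprecated in newer pandas releases, so check the documentation for the version you use.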


# replace() syntax
DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')
  • to_replace – str, regex, list, dict, Series, int, float, or None
  • value – scalar, dict, list, str, regex, default None
  • inplace – bool, default False
  • limit – int, default None
  • regex – bool or same types as to_replace, default False
  • method – {‘pad’, ‘ffill’, ‘bfill’, None}
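The to_replace and regex parameters listed above work together when you need to replace part of a string. Here is a minimal sketch, assuming a small hypothetical DataFrame named sample (not part of the original article): a plain string only matches cells whose whole value equals it, while regex=True applies the pattern inside each string cell.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sample = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Python'],
                       'Duration': ['30days', '50days', '35days']})

# Without regex, 'days' must equal the whole cell value, so nothing changes
unchanged = sample.replace('days', ' Days')

# With regex=True, the pattern is matched inside each string cell
sample2 = sample.replace('days', ' Days', regex=True)
print(sample2)
```

Keep in mind that a regex such as 'Spark' would also match inside 'PySpark', so anchor the pattern (for example with ^ and $) when you want whole-value matches only.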

2. pandas replace() Examples

The pandas replace() method finds a value in a DataFrame and replaces it with another value across all columns and rows.
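The examples in this article operate on a DataFrame named df whose creation is not shown. The sketch below reconstructs it from the printed outputs, so the values should match, but the dictionary name technologies is an assumption.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the outputs printed in this article
technologies = {
    'Courses': ['Spark', 'PySpark', 'Spark', 'Python', 'PySpark'],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', '35days', np.nan],
}
df = pd.DataFrame(technologies)
print(df)
```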


# Replace column value
df2=df.replace('Spark','Apache Spark')
print(df2)

Yields the output below. This replaces 'Spark' with 'Apache Spark' across the entire DataFrame and returns a new object. Use the inplace=True param to update the existing DataFrame object instead. Note that a plain string matches only cells whose whole value equals it, which is why 'PySpark' is left unchanged.


        Courses    Fee Duration
0  Apache Spark  22000   30days
1       PySpark  25000   50days
2  Apache Spark  23000   30days
3        Python  24000   35days
4       PySpark  26000      NaN
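As a quick illustration of the inplace=True option mentioned above, the sketch below uses a small hypothetical DataFrame named demo (an assumption, not the article's df). Because the call modifies demo directly, its return value should not be assigned back to the variable.

```python
import pandas as pd

# Hypothetical data used only for this illustration
demo = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Spark'],
                     'Fee': [22000, 25000, 23000]})

# inplace=True updates demo itself instead of building a new DataFrame
demo.replace('Spark', 'Apache Spark', inplace=True)
print(demo)
```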

To replace NaN values, use the DataFrame.fillna() function, for example to replace NaN with an empty/blank string.
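A minimal sketch of that fillna() usage, assuming a small hypothetical DataFrame named demo with one missing Duration value:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
demo = pd.DataFrame({'Courses': ['Python', 'PySpark'],
                     'Duration': ['35days', np.nan]})

# fillna() returns a new DataFrame with NaN replaced by an empty string
filled = demo.fillna('')
print(filled)
```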

3. Replace Values in Column

If you want to replace values in only one column of a pandas DataFrame, first select the column you want to update and then call the replace() method on it.


df['Courses'] = df['Courses'].replace('Spark','Apache Spark')
print(df)

Yields the same output as above.

4. Replace with Multiple Values

Now, let’s see how to find multiple values from a list and replace them with other values in a list.


# Replace multiple values
df2 = df.replace(['Spark','PySpark'],['Apache Spark', 'Apache PySpark'])
print(df2)

Yields the output below.


          Courses    Fee Duration
0    Apache Spark  22000   30days
1  Apache PySpark  25000   50days
2    Apache Spark  23000   30days
3          Python  24000   35days
4  Apache PySpark  26000      NaN

You can also replace multiple values with the same value.


# Replace with same value for multiple
df2 = df.replace(['30days','35days'],'40days')
print(df2)

Yields below output.


        Courses    Fee Duration
0  Apache Spark  22000   40days
1       PySpark  25000   50days
2  Apache Spark  23000   40days
3        Python  24000   40days
4       PySpark  26000      NaN
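When both arguments are lists, they are paired by position, so they need the same number of items. The sketch below, which uses a hypothetical DataFrame named demo, shows what to expect when the lengths differ. The exact error text may vary between pandas versions, so it is printed rather than assumed.

```python
import pandas as pd

# Hypothetical data used only for this illustration
demo = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Python']})

raised = False
try:
    # Two values to find but only one replacement: lengths do not match
    demo.replace(['Spark', 'PySpark'], ['Apache Spark'])
except ValueError as err:
    raised = True
    print(err)
```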

5. Replace with Dict

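The example below uses two dicts: the first maps each column to the value to find, and the second maps each column to its replacement value. Because section 3 already updated the Courses column of df, that column contains 'Apache Spark' at this point.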


# Replace on multiple columns
df2 = df.replace({'Courses': 'Apache Spark', 'Duration': '35days'}, 
                 {'Courses': 'Spark', 'Duration': '40days'})
print(df2)

Yields below output.


   Courses    Fee Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2    Spark  23000   30days
3   Python  24000   40days
4  PySpark  26000      NaN
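replace() also accepts a single dict in place of the two-dict form. As a rough sketch on a hypothetical DataFrame named demo: a flat dict maps each value to find to its replacement in every column, while a nested dict limits the mapping to one column.

```python
import pandas as pd

# Hypothetical data used only for this illustration
demo = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Python'],
                     'Duration': ['30days', '50days', '35days']})

# Flat dict: each key is a value to find, each value is its replacement
flat = demo.replace({'Spark': 'Apache Spark', '35days': '40days'})

# Nested dict: the outer key is the column, the inner dict maps old to new
nested = demo.replace({'Courses': {'Spark': 'Apache Spark'}})

print(flat)
print(nested)
```

The nested form is handy when the same value appears in several columns but should change in only one of them.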

Conclusion

You have learned that the replace() method replaces column values matched by a string, regex, list, dictionary, Series, number, etc. with another value.

NNK