The pandas.DataFrame.replace() function replaces values in a DataFrame with other values; by default it looks for matches across all columns. This method takes to_replace, value, inplace, limit, regex and method as parameters and returns a new DataFrame. When inplace=True is used, it updates the existing DataFrame object and returns None.
The values to replace can be given as a str, regex, list, dict, Series, int, or float. In this article, I will explain the pandas replace() method syntax and its usage with examples.
It is one of the most useful and powerful pandas functions because it can also match the values to replace with a regex (regular expression).
1. replace() Syntax
Below is the syntax of the replace() method. With regex=True, it can also replace a substring inside column values.
# Replace() syntax
DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')
to_replace – str, regex, list, dict, Series, int, float, or None
value – scalar, dict, list, str, regex, default None
inplace – bool, default False
limit – int, default None
regex – bool or same types as to_replace, default False
method – {'pad', 'ffill', 'bfill', None}
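As a rough illustration of how the regex and inplace parameters change the behavior, here is a minimal sketch. The small DataFrame and the replacement strings are made up for this example and are not part of the original article.

```python
import pandas as pd

# Hypothetical sample data, shaped like the DataFrame used in this article
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python"],
    "Duration": ["30days", "50days", "35days"],
})

# regex=True treats to_replace as a regular expression, so the match
# can be a substring: "30days" becomes "30 days"
df_regex = df.replace("days", " days", regex=True)

# Without regex=True only whole-cell matches are replaced,
# so "PySpark" is left alone here
df_exact = df.replace("Spark", "Apache Spark")

# inplace=True modifies df itself instead of returning a new DataFrame
df.replace("Python", "Python 3", inplace=True)

print(df_regex)
print(df_exact)
print(df)
```

The first two calls leave df untouched and return new objects; only the last call changes df.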
2. pandas replace() Examples
The pandas replace() method finds a value in a DataFrame and replaces it with another value across all columns and rows.
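The examples in this article operate on a DataFrame named df whose definition is not shown. The sketch below reconstructs it from the printed outputs, so treat the exact construction as an assumption rather than the original setup.

```python
import numpy as np
import pandas as pd

# Reconstructed from the outputs printed in this article (an assumption,
# since the original DataFrame definition is not shown)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Python", "PySpark"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
    "Duration": ["30days", "50days", "30days", "35days", np.nan],
})
print(df)
```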
# Replace column value
df2 = df.replace('Spark', 'Apache Spark')
print(df2)
Yields the output below. This replaces 'Spark' with 'Apache Spark' across the entire DataFrame and returns a new object. Use the inplace=True param to update the existing DataFrame object instead. Only cells whose whole value equals 'Spark' are replaced, which is why 'PySpark' is left unchanged.
# Output:
Courses Fee Duration
0 Apache Spark 22000 30days
1 PySpark 25000 50days
2 Apache Spark 23000 30days
3 Python 24000 35days
4 PySpark 26000 NaN
To replace NaN values, use the DataFrame.fillna() function, for example to replace NaN with an empty/blank string.
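A minimal sketch of the fillna() approach, using a small made-up DataFrame (the data here is an assumption for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical two-row sample with one missing Duration
df = pd.DataFrame({
    "Courses": ["Python", "PySpark"],
    "Duration": ["35days", np.nan],
})

# fillna() returns a new DataFrame with NaN replaced by an empty string;
# the original df still contains the NaN
df_filled = df.fillna("")
print(df_filled)
```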
3. Replace Values in Column
To replace values in a specific column of a pandas DataFrame, first select the column you want to update, then call the replace() method on it.
# Replace Values in Column
df['Courses'] = df['Courses'].replace('Spark','Apache Spark')
print(df)
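Yields the same output as above. Note that this assignment updates df itself, so the examples that follow start from a DataFrame whose Courses column already contains 'Apache Spark'.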
4. Replace with Multiple Values
Now, let's see how to find multiple values from a list and replace them with other values from a second list. The two lists must have the same length, and values are paired by position.
# Replace multiple values
df2 = df.replace(['Spark','PySpark'],['Apache Spark', 'Apache PySpark'])
print(df2)
Yields the output below.
# Output:
Courses Fee Duration
0 Apache Spark 22000 30days
1 Apache PySpark 25000 50days
2 Apache Spark 23000 30days
3 Python 24000 35days
4 Apache PySpark 26000 NaN
You can also replace multiple values with the same value.
# Replace with same value for multiple
df2 = df.replace(['30days','35days'],'40days')
print(df2)
Yields below output.
# Output:
Courses Fee Duration
0 Apache Spark 22000 40days
1 PySpark 25000 50days
2 Apache Spark 23000 40days
3 Python 24000 40days
4 PySpark 26000 NaN
5. Replace with Dict
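The example below uses two dicts: the first maps each column name to the value to find in that column, and the second maps each column name to its replacement value.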
# Replace on multiple columns
df2 = df.replace({'Courses': 'Apache Spark', 'Duration': '35days'},
{'Courses': 'Spark', 'Duration': '40days'})
print(df2)
Yields below output.
# Output:
Courses Fee Duration
0 Spark 22000 30days
1 PySpark 25000 50days
2 Spark 23000 30days
3 Python 24000 40days
4 PySpark 26000 NaN
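replace() also accepts a single dict. The sketch below shows two such forms on a small made-up DataFrame; the data and the '1 month' replacement are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python"],
    "Duration": ["30days", "50days", "30days"],
})

# Flat dict: keys are the values to find, dict values are the replacements,
# applied across all columns
df_flat = df.replace({"Spark": "Apache Spark", "30days": "1 month"})

# Nested dict: outer keys are column names, inner dicts map old -> new,
# so the replacement is limited to the named column
df_nested = df.replace({"Duration": {"30days": "1 month"}})

print(df_flat)
print(df_nested)
```

The nested form is handy when the same value appears in several columns but should only change in one of them.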
Conclusion
You have learned that the replace() method replaces column values matched by a string, regex, list, dictionary, Series, number, etc. with another value.