You can replace a string in a pandas DataFrame column by using DataFrame.replace(), Series.str.replace(), or apply() with a lambda function. In this article, I will explain how to replace strings in a DataFrame column with multiple examples.
- Replace a string with another string in pandas.
- Replace a pattern of string with another string using regular expression.
1. Quick Examples to Replace String in DataFrame
If you are in a hurry, below are some quick examples of how to replace a string in a pandas DataFrame.
# Below are some quick examples.
# Replace string using DataFrame.replace() method.
df2 = df.replace('Py','Python with ', regex=True)
# Replace strings in multiple columns using dicts.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'},
                 {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
# Replace pattern of string using regular expression.
df2=df.replace(regex=['Language'],value='Lang')
# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
# Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
Now, let’s create a pandas DataFrame with a few rows and columns, run these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, and Duration.
# Create a pandas DataFrame.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Spark", "Java Language", "PySpark", "PHP Language"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 27000],
    'Duration': ['30days', '50days', '30days', '60days', '35days', '30days']
}
df = pd.DataFrame(technologies)
print(df)
Yields below output.
         Courses    Fee Duration
0          Spark  22000   30days
1        PySpark  25000   50days
2          Spark  23000   30days
3  Java Language  24000   60days
4        PySpark  26000   35days
5   PHP Language  27000   30days
2. pandas Replace String Example
You can replace a string in a pandas DataFrame column with another string by using the DataFrame.replace() method. This method replaces the specified value with another value and returns a new DataFrame. Without regex=True, it matches whole cell values, not substrings. To update the existing DataFrame instead, pass inplace=True.
# Replace string using DataFrame.replace() method.
df2 = df.replace('PySpark','Python with Spark')
print(df2)
Yields below output. This example replaces the string PySpark with Python with Spark.
             Courses    Fee Duration
0              Spark  22000   30days
1  Python with Spark  25000   50days
2              Spark  23000   30days
3      Java Language  24000   60days
4  Python with Spark  26000   35days
5       PHP Language  27000   30days
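One detail of DataFrame.replace() is easy to miss: without regex=True it compares whole cell values, so a partial string such as Py matches nothing. The sketch below illustrates this on a small made-up frame, and also shows inplace=True. The names demo, whole, and partial are illustrative only and are not part of the examples above.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration.
demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# Without regex=True only cells exactly equal to 'Py' are replaced; none are.
whole = demo.replace("Py", "Python with ")

# With regex=True the pattern is matched inside each string cell.
partial = demo.replace("Py", "Python with ", regex=True)

# inplace=True modifies demo itself instead of returning a new DataFrame.
demo.replace("PySpark", "Python with Spark", inplace=True)

print(whole["Courses"].tolist())    # ['Spark', 'PySpark']
print(partial["Courses"].tolist())  # ['Spark', 'Python with Spark']
print(demo["Courses"].tolist())     # ['Spark', 'Python with Spark']
```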
3. Replace Multiple Strings
Now let’s see how to replace strings in multiple columns. In this example, I will also show how to replace part of a string by using the regex=True param. To update multiple string columns, pass two dicts keyed by column name: the first maps each column to the pattern to find, and the second maps each column to its replacement. The example below replaces Py with "Python with " in the Courses column and days with " Days" in the Duration column.
# Replace strings in multiple columns using dicts.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'},
{'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
print(df2)
Yields below output.
             Courses    Fee Duration
0              Spark  22000  30 Days
1  Python with Spark  25000  50 Days
2              Spark  23000  30 Days
3      Java Language  24000  60 Days
4  Python with Spark  26000  35 Days
5       PHP Language  27000  30 Days
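The two-dict form above can also be written as one nested dict of the shape {column: {pattern: replacement}}, which some readers find easier to scan. The following is a minimal sketch on a made-up frame. The names demo, two_dicts, and nested are illustrative assumptions.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration.
demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Duration": ["30days", "50days"],
})

# Two parallel dicts: column -> pattern, and column -> replacement.
two_dicts = demo.replace(
    {"Courses": "Py", "Duration": "days"},
    {"Courses": "Python with ", "Duration": " Days"},
    regex=True,
)

# One nested dict: column -> {pattern: replacement}.
nested = demo.replace(
    {"Courses": {"Py": "Python with "}, "Duration": {"days": " Days"}},
    regex=True,
)

print(nested)
print(two_dicts.equals(nested))
```

The nested form keeps each pattern next to its replacement, which helps once a column needs more than one substitution.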
4. Replace Pattern of String Using Regular Expression
Using a regular expression, you can replace matching text with another string in a pandas DataFrame. The example below finds the string Language and replaces it with Lang.
# Replace pattern of string using regular expression.
df2=df.replace(regex=['Language'],value='Lang')
print(df2)
Yields below output.
     Courses    Fee Duration
0      Spark  22000   30days
1    PySpark  25000   50days
2      Spark  23000   30days
3  Java Lang  24000   60days
4    PySpark  26000   35days
5   PHP Lang  27000   30days
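A plain pattern like Language matches anywhere in a value. When that is too broad, regex anchors narrow the match. The sketch below passes a dict of patterns through the regex parameter. The frame demo and the value NumPy are made up for illustration, to show a value that contains Py but should stay untouched.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration.
demo = pd.DataFrame({"Courses": ["PySpark", "Java Language", "NumPy"]})

# '^Py' matches 'Py' only at the start of a value, so 'NumPy' is untouched.
# ' Language$' matches ' Language' only at the end of a value.
cleaned = demo.replace(regex={r"^Py": "Python with ", r" Language$": " Lang"})

print(cleaned["Courses"].tolist())
# ['Python with Spark', 'Java Lang', 'NumPy']
```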
5. Using str.replace() on DataFrame
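Alternatively, use Series.str.replace() on a single column. Unlike DataFrame.replace(), which looks for exact (whole-value) matches unless you pass regex=True, str.replace() replaces every occurrence of the given text inside each value. In pandas 2.0 and later it treats the pattern as a literal string by default; pass regex=True to have it interpreted as a regular expression.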
# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
print(df)
This yields the same output as above. Note that this replaces the values in the Courses column of the existing DataFrame object.
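Because the default for regex in str.replace() has differed between pandas versions, it can be safer to state it explicitly. The sketch below shows both modes on made-up Series. The names s, literal, pattern, and versions are illustrative assumptions.

```python
import pandas as pd

# Small hypothetical Series, used only for this illustration.
s = pd.Series(["Java Language", "PHP Language", "Spark"])

# regex=False: 'Language' is treated as plain text.
literal = s.str.replace("Language", "Lang", regex=False)

# regex=True: the pattern removes a trailing ' Language' (whitespace included).
pattern = s.str.replace(r"\s+Language$", "", regex=True)

# With regex=False, special characters such as '.' are matched literally.
versions = pd.Series(["1.0", "2.5"]).str.replace(".", "-", regex=False)

print(literal.tolist())   # ['Java Lang', 'PHP Lang', 'Spark']
print(pattern.tolist())   # ['Java', 'PHP', 'Spark']
print(versions.tolist())  # ['1-0', '2-5']
```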
6. Replace String Using apply() function with lambda
In this section, you can find out how to replace a string using DataFrame.apply() with a lambda expression. The apply() method applies a function along an axis of the DataFrame. With the default axis=0, the function is called once per column and receives that column as a Series.
# Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
print(df2)
Yields below output.
             Courses    Fee Duration
0              Spark  22000   30days
1  Python with Spark  25000   50days
2              Spark  23000   30days
3          Java Lang  24000   60days
4  Python with Spark  26000   35days
5           PHP Lang  27000   30days
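Since apply() hands each column to the lambda as a Series, you can also limit it to the text columns and call str.replace() inside the lambda. This is a minimal sketch on a made-up frame. The names demo and text_cols are illustrative, and the choice of columns is an assumption about which ones hold strings.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration.
demo = pd.DataFrame({
    "Courses": ["PySpark", "Java Language"],
    "Fee": [25000, 24000],
    "Duration": ["50days", "60days"],
})

# Assumed list of string columns; the numeric Fee column is left out.
text_cols = ["Courses", "Duration"]

df2 = demo.copy()
# With the default axis=0, the lambda receives each selected column as a Series.
df2[text_cols] = df2[text_cols].apply(
    lambda col: col.str.replace("days", " Days", regex=False)
)

print(df2)
```

Restricting the call to known string columns avoids using the .str accessor on a numeric column, which would raise an error.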
7. Complete Example of Replace String in DataFrame
# Create a pandas DataFrame.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Spark", "Java Language", "PySpark", "PHP Language"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 27000],
    'Duration': ['30days', '50days', '30days', '60days', '35days', '30days']
}
df = pd.DataFrame(technologies)
print(df)
# Replace string using DataFrame.replace() method.
df2 = df.replace('Py','Python with ', regex=True)
print(df2)
# Replace strings in multiple columns using dicts.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'},
                 {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
print(df2)
# Replace pattern of string using regular expression.
df2=df.replace(regex=['Language'],value='Lang')
print(df2)
# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
print(df)
# Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
print(df2)
Conclusion
In this article, you have learned how to replace a string in a pandas DataFrame column by using DataFrame.replace(), str.replace(), and apply() with a lambda function, with some examples.