  • Post category:Pandas
  • Post last modified:April 17, 2024
How to Replace String in Pandas DataFrame

In pandas, you can replace a string in a DataFrame column with the DataFrame.replace() method, the str.replace() method, or apply() with a lambda function. In this article, I will explain how to replace strings in a pandas DataFrame.

Key Points –

  • Use the str.replace() method in pandas to replace substrings in a DataFrame column.
  • Pass the string to be replaced and its replacement as the first two arguments of the method.
  • Pass regex=True to use regular expressions for more complex replacement patterns.
  • Assign the result back to the original DataFrame or to a new variable to retain the changes.

Quick Examples to Replace String

If you are in a hurry, below are some examples of how to replace a string in a pandas DataFrame.


# Quick examples to replace string

# Example 1: Replace string 
# Using DataFrame.replace() method
df2 = df.replace('Py','Python with ', regex=True)

# Example 2: Replace pattern of string 
# Using regular expression.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'}, 
    {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)

# Example 3: Replace pattern of string 
# Using regular expression
df2 = df.replace(regex=['Language'], value='Lang')

# Example 4: By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')

# Example 5: Replace String 
# Using apply() function with lambda
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))

To run these examples, let’s create a pandas DataFrame with a few rows and columns, execute each example, and validate the results.


# Create a pandas DataFrame.
import pandas as pd
technologies = {
    'Courses': ["Spark","PySpark","Spark","Java Language","PySpark","PHP Language"],
    'Fee': [22000,25000,23000,24000,26000,27000],
    'Duration': ['30days','50days','30days','60days','35days','30days']
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

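This yields the output below.


# Output:
# Create DataFrame:
         Courses    Fee Duration
0          Spark  22000   30days
1        PySpark  25000   50days
2          Spark  23000   30days
3  Java Language  24000   60days
4        PySpark  26000   35days
5   PHP Language  27000   30days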

Pandas Replace String Example

You can replace strings within a pandas DataFrame using the DataFrame.replace() method. This method replaces every occurrence of the specified value with another value and returns a new DataFrame. By default it matches whole cell values; pass regex=True to replace part of a string. To update the existing DataFrame instead, pass inplace=True.


# Replace string using DataFrame.replace() method.
df2 = df.replace('PySpark','Python with Spark')
print("After replacing the string values of a single column:\n", df2)

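In the above example, DataFrame.replace() replaces every cell whose value is exactly PySpark with Python with Spark. Only the Courses column contains that value, so the Fee and Duration columns are unchanged. This example yields the below output.


# Output:
# After replacing the string values of a single column:
             Courses    Fee Duration
0              Spark  22000   30days
1  Python with Spark  25000   50days
2              Spark  23000   30days
3      Java Language  24000   60days
4  Python with Spark  26000   35days
5       PHP Language  27000   30days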
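The difference between whole-value matching and substring matching is easy to miss, so here is a minimal sketch of it. The three-row DataFrame is a cut-down version of the one above, and the variable names exact and partial are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Java Language"],
    'Duration': ['30days', '50days', '60days'],
})

# Without regex=True, replace() only matches whole cell values,
# so 'Py' does not match the cell 'PySpark'.
exact = df.replace('Py', 'Python with ')

# With regex=True, the pattern is searched for inside each string.
partial = df.replace('Py', 'Python with ', regex=True)

print(exact['Courses'].tolist())
print(partial['Courses'].tolist())

# replace() returned new DataFrames; df itself is unchanged.
print(df['Courses'].tolist())
```

Because neither call uses inplace=True, the original df keeps its values and each result has to be assigned to a variable to be kept.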

Replace Multiple Strings

Now let’s see how to replace strings in multiple columns. This example also shows how to replace part of a string by using the regex=True param. To update multiple string columns, pass one dict that maps each column name to the string to find and a second dict that maps each column name to its replacement. The below example replaces Py with Python with in the Courses column and days with Days in the Duration column.


# Replace pattern of string using regular expression.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'}, 
    {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
print("After replacing the string values of multiple columns:\n", df2)

This yields the output below.


# Output:
# After replacing the string values of multiple columns
             Courses    Fee Duration
0              Spark  22000  30 Days
1  Python with Spark  25000  50 Days
2              Spark  23000  30 Days
3      Java Language  24000  60 Days
4  Python with Spark  26000  35 Days
5       PHP Language  27000  30 Days
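The same per-column replacement can also be written as a single nested dict, which some readers find easier to scan. This is a sketch on a cut-down DataFrame, not a different technique: the outer keys are column names and each inner dict maps a pattern to its replacement.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Java Language"],
    'Fee': [22000, 25000, 24000],
    'Duration': ['30days', '50days', '60days'],
})

# Nested dict: {column: {pattern: replacement}}.
# The value argument is omitted when the nested form is used.
df2 = df.replace(
    {'Courses': {'Py': 'Python with '}, 'Duration': {'days': ' Days'}},
    regex=True,
)
print(df2)
```

Columns that are not named in the dict, such as Fee here, are left as they are.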

Using Regular Expression

Alternatively, you can pass regular expressions through the regex param to replace matching strings within a pandas DataFrame. The below example finds the string Language and replaces it with Lang.


# Replace pattern of string using regular expression
df2 = df.replace(regex=['Language'], value='Lang')
print("After replacing the string values of a single column:\n", df2)

This yields the output below.


# Output:
# After replacing the string values of a single column:
     Courses    Fee Duration
0      Spark  22000   30days
1    PySpark  25000   50days
2      Spark  23000   30days
3  Java Lang  24000   60days
4    PySpark  26000   35days
5   PHP Lang  27000   30days
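The pattern above is a plain word, so it does not show what regular expressions add. The sketch below, on a cut-down DataFrame, uses an anchor and a capture group; the two patterns are illustrative choices and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "Java Language", "PHP Language"],
    'Duration': ['30days', '60days', '30days'],
})

# regex= also accepts a dict of {pattern: replacement}.
# \s+Language$ matches " Language" only at the end of a string.
# (\d+) captures the digits so that \1 can reuse them in the replacement.
df2 = df.replace(regex={r'\s+Language$': '', r'^(\d+)days$': r'\1 days'})
print(df2)
```

Both patterns are tried against every string column, but each one only matches the column it was written for.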

Using str.replace() on DataFrame

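You can use the str.replace() method directly on a DataFrame column to replace strings. str.replace() replaces every occurrence of the given substring and treats it as literal text unless you pass a regex pattern together with the param regex=True (regex defaults to False as of pandas 2.0).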


# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
print("After replacing the string values of a single column:\n", df)

In the above example, you create a DataFrame df with a column Courses containing strings. Then, you use the str.replace() method directly on the Courses column of the DataFrame to replace Language with Lang.
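To make the literal versus regex distinction concrete, here is a small sketch on a standalone Series. The C++ Language value is made up for illustration because it contains characters that have a special meaning in a regular expression.

```python
import pandas as pd

s = pd.Series(['Java Language', 'C++ Language', 'PHP Language'])

# Literal replacement: '+' is treated as an ordinary character.
literal = s.str.replace('C++', 'Cpp', regex=False)

# Regex replacement: \s+ and the $ anchor are interpreted as a pattern.
pattern = s.str.replace(r'\s+Language$', ' Lang', regex=True)

print(literal.tolist())
print(pattern.tolist())
```

Passing regex explicitly in both calls keeps the behavior the same across pandas versions.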

Using apply() Function Along with lambda

Similarly, you can use DataFrame.apply() with a lambda expression to replace strings. The apply() method applies a function along one axis of the DataFrame. With the default axis=0, the function receives each column as a Series, so the lambda below calls replace() on every column in turn.


# Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
print("After replacing the string values of a single column:\n", df2)

This yields the output below.


# Output:
# After replacing the string values of a single column:
             Courses    Fee Duration
0              Spark  22000   30days
1  Python with Spark  25000   50days
2              Spark  23000   30days
3          Java Lang  24000   60days
4  Python with Spark  26000   35days
5           PHP Lang  27000   30days
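apply() is most useful when the replacement logic does not fit a single pattern. The sketch below applies an ordinary Python function to one column; expand_course is a hypothetical helper written for this illustration, and it uses Python's built-in str.replace() rather than a pandas method.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Java Language"],
    'Fee': [22000, 25000, 24000],
})

def expand_course(name):
    """Hypothetical helper: rewrite one course name with plain str.replace()."""
    return name.replace('Py', 'Python with ').replace('Language', 'Lang')

# Series.apply() calls the function once per value in the column.
df['Courses'] = df['Courses'].apply(expand_course)
print(df)
```

For simple substitutions the vectorized str.replace() shown earlier is usually the shorter option; a function like this is worth it when each value needs several steps or conditions.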

Complete Example


# Create a pandas DataFrame
import pandas as pd
technologies= {
    'Courses':["Spark","PySpark","Spark","Java Language","PySpark","PHP Language"],
    'Fee' :[22000,25000,23000,24000,26000,27000],
    'Duration':['30days','50days','30days','60days','35days','30days']
          }
df = pd.DataFrame(technologies)
print(df)

# Replace string using DataFrame.replace() method
df2 = df.replace('Py','Python with ', regex=True)
print(df2)

# Replace pattern of string using regular expression
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'}, 
    {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
print(df2)

# Replace pattern of string using regular expression.
df2 = df.replace(regex=['Language'], value='Lang')
print(df2)

# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
print(df)

# Replace String using apply() function with lambda
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
print(df2)

Frequently Asked Questions on Replace String in DataFrame

How can I replace a specific string in a DataFrame column with another string?

To replace a specific string in a DataFrame column with another string, you can use the str.replace() method.

Can I replace multiple strings in a DataFrame column simultaneously?

You can replace multiple strings in a DataFrame column simultaneously using the replace() method with a dictionary.
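As a minimal sketch of that answer, the dict below maps whole cell values to their replacements in a single column. The mapping itself is made up for illustration.

```python
import pandas as pd

courses = pd.Series(["Spark", "PySpark", "Java Language"])

# Each key is matched against the whole cell value (no regex here).
mapping = {'PySpark': 'Python with Spark', 'Java Language': 'Java'}
renamed = courses.replace(mapping)

print(renamed.tolist())
```

Values that do not appear as a key in the dict, such as Spark, are left unchanged.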

How can I replace strings based on a pattern or using regular expressions?

To replace strings based on a pattern or using regular expressions in a Pandas DataFrame, you can use the str.replace() method with the regex=True parameter.

Can I apply a custom function to replace strings in a DataFrame column?

You can apply a custom function to replace strings in a DataFrame column using the apply() method.

Conclusion

In this article, you have learned how to replace strings in a pandas DataFrame column by using DataFrame.replace(), str.replace(), and apply() with a lambda function, with examples of each.
