How to Replace String in pandas DataFrame

  • Post category:Pandas
  • Post last modified:November 23, 2023

You can replace a string in a pandas DataFrame column by using DataFrame.replace(), Series.str.replace(), or DataFrame.apply() with a lambda function. In this article, I will explain how to replace strings in DataFrame columns with multiple examples.

  • Replace a string with another string in pandas.
  • Replace a pattern of a string with another string using regular expression.

1. Quick Examples to Replace String in DataFrame

If you are in a hurry, below are some quick examples of how to replace a string in a pandas DataFrame.


# Below are some quick examples.

# Example 1: Replace string using DataFrame.replace() method.
df2 = df.replace('Py','Python with ', regex=True)

# Example 2: Replace strings in multiple columns using dicts keyed by column name.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'}, 
    {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)

# Example 3: Replace pattern of string using regular expression.
df2=df.replace(regex=['Language'],value='Lang')

# Example 4: By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')

# Example 5: Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))

Now, let’s create a pandas DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, and Duration.


# Create a pandas DataFrame.
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Spark", "Java Language", "PySpark", "PHP Language"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 27000],
    'Duration': ['30days', '50days', '30days', '60days', '35days', '30days']
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

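Yields below output.


# Output:
# Create DataFrame:
         Courses    Fee Duration
0          Spark  22000   30days
1        PySpark  25000   50days
2          Spark  23000   30days
3  Java Language  24000   60days
4        PySpark  26000   35days
5   PHP Language  27000   30days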

2. pandas Replace String Example

You can replace a string in a pandas DataFrame with another string by using the DataFrame.replace() method. By default it replaces only cells whose entire value matches the given string, and it returns a new DataFrame. To update the existing DataFrame instead, use inplace=True.


# Replace string using DataFrame.replace() method.
df2 = df.replace('PySpark','Python with Spark')
print("After replacing the string values of a single column:\n", df2)

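Yields below output. This example replaces the string PySpark with Python with Spark.


# Output:
# After replacing the string values of a single column:
             Courses    Fee Duration
0              Spark  22000   30days
1  Python with Spark  25000   50days
2              Spark  23000   30days
3      Java Language  24000   60days
4  Python with Spark  26000   35days
5       PHP Language  27000   30days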
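One detail is easier to see in code: without regex=True, DataFrame.replace() only swaps cells whose entire value equals the search string, so a partial match such as 'Py' leaves the data untouched. The sketch below rebuilds a small DataFrame with course names from this article and contrasts the two cases, then shows the inplace=True form. The names sample, no_change, and changed are illustrative only.

```python
import pandas as pd

# Small illustrative frame reusing course names from this article.
sample = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Java Language'],
    'Fee': [22000, 25000, 24000],
})

# Without regex=True, only cells exactly equal to 'Py' would change; none are.
no_change = sample.replace('Py', 'Python with ')

# An exact whole-cell match is replaced.
changed = sample.replace('PySpark', 'Python with Spark')

# inplace=True modifies `sample` itself instead of returning a new frame.
sample.replace('Spark', 'Apache Spark', inplace=True)

print(no_change['Courses'].tolist())
print(changed['Courses'].tolist())
print(sample['Courses'].tolist())
```

Note that the in-place call changes only the cell that is exactly Spark; PySpark is left alone because it is not a whole-cell match.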

3. Replace Multiple Strings

Now let’s see how to replace strings in multiple columns. In this example, I will also show how to replace part of a string by using the regex=True param. To update multiple string columns, pass two dicts keyed by column name: the first maps each column to the pattern to find, and the second maps each column to its replacement. The below example replaces 'Py' with 'Python with ' in the Courses column and 'days' with ' Days' in the Duration column.


# Replace pattern of string using regular expression.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'}, 
    {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
print("After replacing the string values of multiple columns:\n", df2)

Yields below output.


# Output:
# After replacing the string values of multiple columns:
             Courses    Fee Duration
0              Spark  22000  30 Days
1  Python with Spark  25000  50 Days
2              Spark  23000  30 Days
3      Java Language  24000  60 Days
4  Python with Spark  26000  35 Days
5       PHP Language  27000  30 Days
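The same per-column replacement can also be written as a single nested dict, where each column name maps to its own {pattern: replacement} dict. This is a minimal sketch on a reduced copy of the data; courses_df and nested are illustrative names.

```python
import pandas as pd

# Reduced copy of the article's data, for illustration only.
courses_df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Java Language'],
    'Duration': ['30days', '50days', '60days'],
})

# Outer keys are column names; inner dicts map pattern -> replacement.
# With a nested dict, the separate `value` argument is left out.
nested = courses_df.replace(
    {'Courses': {'Py': 'Python with '}, 'Duration': {'days': ' Days'}},
    regex=True,
)
print(nested)
```

Keeping each pattern next to its replacement can make longer mappings easier to review than two parallel dicts.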

4. Replace Pattern of String Using Regular Expression

Using a regular expression, you can replace every match of a pattern with another string in a pandas DataFrame. The below example finds the string Language and replaces it with Lang.


# Replace pattern of string using regular expression.
df2=df.replace(regex=['Language'],value='Lang')
print("After replacing the string values of a single column:\n", df2)

Yields below output.


# Output:
# After replacing the string values of a single column:
     Courses    Fee Duration
0      Spark  22000   30days
1    PySpark  25000   50days
2      Spark  23000   30days
3  Java Lang  24000   60days
4    PySpark  26000   35days
5   PHP Lang  27000   30days
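Because the patterns are regular expressions, anchors and character classes are available as well. The following sketch passes a dict to the regex parameter so several patterns are handled in one call. The specific patterns and the names df_regex and cleaned are illustrative choices, not part of the original example.

```python
import pandas as pd

# Reduced copy of the article's data, for illustration only.
df_regex = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Java Language', 'PHP Language'],
    'Duration': ['30days', '50days', '60days', '30days'],
})

# Keys are regex patterns, values are their replacements.
cleaned = df_regex.replace(regex={
    r'^Py': 'Python with ',   # 'Py' only at the start of a value
    r'\s+Language$': '',      # trailing ' Language', including the space
    r'days$': ' days',        # insert a space before the unit
})
print(cleaned)
```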

5. Using str.replace() on DataFrame

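Alternatively, use Series.str.replace() to replace a substring within the values of a single column. Unlike DataFrame.replace(), which looks for exact matches of the whole value unless you pass param regex=True, str.replace() always replaces matching substrings. In pandas 2.0 and later it treats the pattern as literal text unless you pass regex=True.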


# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
print("After replacing the string values of a single column:\n", df)

Yields the same output as above. Note that this replaces the values in the Courses column of the existing DataFrame object.
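To make the literal versus regex distinction concrete, here is a small sketch on a standalone Series. The data (including the upper-case '35DAYS' value) and the names durations, literal, and with_regex are made up for illustration.

```python
import pandas as pd

# Hypothetical duration values; the last one is upper-case on purpose.
durations = pd.Series(['30days', '50days', '35DAYS'])

# Literal replacement: 'days' is matched as plain text, case-sensitively.
literal = durations.str.replace('days', ' days', regex=False)

# Regex replacement with a capture group; (?i) makes it case-insensitive
# and \1 re-inserts the captured digits.
with_regex = durations.str.replace(r'(?i)(\d+)days', r'\1 days', regex=True)

print(literal.tolist())
print(with_regex.tolist())
```

Passing regex explicitly, as done here, avoids depending on the default, which differs between older and newer pandas versions.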

6. Replace String Using apply() function with lambda

In this section, you can find out how to replace strings using DataFrame.apply() with a lambda expression. The apply() method applies a function along one axis of the DataFrame. With the default axis=0, the function is called once per column and receives each column as a Series.


# Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
print("After replacing the string values of a single column:\n", df2)

Yields below output.


# Output:
# After replacing the string values of a single column:
             Courses    Fee Duration
0              Spark  22000   30days
1  Python with Spark  25000   50days
2              Spark  23000   30days
3          Java Lang  24000   60days
4  Python with Spark  26000   35days
5           PHP Lang  27000   30days
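Since apply() calls the function once per column, numeric columns such as Fee are also passed through replace() even though they hold no strings. One possible alternative, sketched here under the assumption that you know which columns hold text, is to apply the function only to those columns. The list text_cols and the helper expand_names are hypothetical names for this sketch.

```python
import pandas as pd

# Reduced copy of the article's data, for illustration only.
df_apply = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Java Language'],
    'Fee': [22000, 25000, 24000],
    'Duration': ['30days', '50days', '60days'],
})

# Columns assumed to contain text; adjust for your own data.
text_cols = ['Courses', 'Duration']

def expand_names(col):
    """Apply the article's two regex replacements to one column (a Series)."""
    return col.replace({'Py': 'Python with ', 'Language': 'Lang'}, regex=True)

# Only the selected columns are processed; Fee is never touched.
df_apply[text_cols] = df_apply[text_cols].apply(expand_names)
print(df_apply)
```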

7. Complete Example of Replace String in DataFrame


# Create a pandas DataFrame.
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Spark", "Java Language", "PySpark", "PHP Language"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 27000],
    'Duration': ['30days', '50days', '30days', '60days', '35days', '30days']
}
df = pd.DataFrame(technologies)
print(df)

# Replace string using DataFrame.replace() method.
df2 = df.replace('Py','Python with ', regex=True)
print(df2)

# Replace strings in multiple columns using dicts keyed by column name.
df2 = df.replace({'Courses': 'Py', 'Duration': 'days'}, 
    {'Courses': 'Python with ', 'Duration': ' Days'}, regex=True)
print(df2)

# Replace pattern of string using regular expression.
df2=df.replace(regex=['Language'],value='Lang')
print(df2)

# By using str.replace()
df['Courses'] = df['Courses'].str.replace('Language','Lang')
print(df)

# Replace String using apply() function with lambda.
df2 = df.apply(lambda x: x.replace({'Py':'Python with ', 'Language':'Lang'}, regex=True))
print(df2)

Frequently Asked Questions on Replace String in DataFrame

How do I replace a specific string with another string in a DataFrame column?

You can use the replace() method on the DataFrame, specifying the string you want to replace and the string you want to replace it with. For example, df2 = df.replace('existing_str','new_str')

How can I replace multiple strings at once in a DataFrame column?

You can use the replace() method with dictionaries to replace multiple strings at once. The keys of the first dictionary are column names and its values are the strings to find; the second dictionary maps the same column names to the replacement strings. For example, df2 = df.replace({'col1': 'exist_col1_value', 'col2': 'exist_col2_value'}, {'col1': 'new_col1_value', 'col2': 'new_col2_value'}, regex=True)

8. Conclusion

In this article, you have learned how to replace strings in pandas DataFrame columns by using DataFrame.replace(), str.replace(), and apply() with a lambda function, with some examples.
