R – Replace String with Another String or Character

One of the most common tasks when working with data in an R DataFrame is curating it, and a frequent curation rule is to replace one string with another, or to replace part of a string (a substring), in a column. In this article, I will explain how to replace a string with another string on a single column, on multiple columns, and by condition. The approaches shown here work on both DataFrame columns and plain vectors.

Using some of the approaches explained in this article, you can also replace NA with an empty string in an R DataFrame. Let’s create an R DataFrame and explore the examples and their output.


# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
    address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
    work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))

df

# Output
#  id        address work_address
#1  1      Orange St      Main St
#2  2     Anton Blvd         <NA>
#3  3 Jefferson Pkwy   Apple Blvd
#4 NA                Portola Pkwy
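As mentioned above, the same indexing idea can replace NA with an empty string. The following is a minimal sketch, not the only way to do it: it works on a separate, hypothetical data frame named df_na so that the df used in the rest of the article is left untouched, and it targets a single character column so the numeric id column keeps its type.

```r
# Hypothetical copy of the relevant columns; df_na is not part of the article's df
# stringsAsFactors = FALSE keeps strings as character on R versions before 4.0
df_na <- data.frame(
  id = c(1, 2, 3, NA),
  work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
  stringsAsFactors = FALSE
)

# is.na() returns TRUE for missing values; only those rows are overwritten
df_na$work_address[is.na(df_na$work_address)] <- ""
df_na$work_address
```

Restricting the replacement to one column avoids turning the NA in id into an empty string, which would convert that column to character.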

1. Replace String with Another String On Single Column

The example below replaces a string with another string on a selected column by checking a condition, so it updates only the rows that satisfy the condition. To update a specific column, select it using df$column_name.


# Replace String with Another String
df$address[df$address == 'Orange St'] <- 'Portola Pkwy'
df

Yields below output. Here, df$address is a vector, and note that every column in a DataFrame is a vector.


#Output
  id        address work_address
1  1   Portola Pkwy      Main St
2  2     Anton Blvd         <NA>
3  3 Jefferson Pkwy   Apple Blvd
4 NA                Portola Pkwy

Similarly, you can also replace column values based on multiple conditions.
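A replacement based on multiple conditions might look like the sketch below. The data frame name df_cond, the chosen conditions, and the replacement value are assumptions made for illustration only. Conditions are combined with & (and) or | (or), and which() is used so that rows where a comparison evaluates to NA are simply skipped.

```r
# Hypothetical data frame used only for this illustration
# stringsAsFactors = FALSE keeps strings as character on R versions before 4.0
df_cond <- data.frame(
  id = c(1, 2, 3, NA),
  address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
  stringsAsFactors = FALSE
)

# Both conditions must hold; which() returns only the positions that are TRUE
rows <- which(df_cond$id > 1 & df_cond$address == "Anton Blvd")
df_cond$address[rows] <- "Anton Boulevard"
df_cond$address
```

Only the second row satisfies both conditions, so the other addresses are left as they were.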

2. Replace String with Another String on All Columns

Now let’s replace a string with another string on all columns of the R DataFrame. The following example replaces every cell whose value is exactly Portola Pkwy with Orange St, across all columns of the DataFrame.


# Replace String with Another String on All Columns
df[df=="Portola Pkwy"] <- "Orange St"
df

Yields below output.


# Output
  id        address work_address
1  1      Orange St      Main St
2  2     Anton Blvd         <NA>
3  3 Jefferson Pkwy   Apple Blvd
4 NA                   Orange St

3. Using stringr Package to Replace Part of String (Substring)

To use the functions from the stringr package, first load it using library("stringr"). If you don’t have this package, install it using install.packages("stringr"). The stringr package provides a set of functions that make working with strings as easy as possible.

3.1 str_replace()

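Use the str_replace() function from the stringr package to replace part of a column string with another string in an R DataFrame. It replaces only the first match of the pattern in each string, and the pattern is interpreted as a regular expression by default. I have created a dedicated R article on str_replace() where I cover syntax, usage, and several examples.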


# Replace String with another String
library(stringr)
df$address <- str_replace(df$address, "St", "Street")
print(df)

Yields below output. The example above replaces the string St with Street on the address column. Here, df$address is a vector.


# Output
  id        address work_address
1  1  Orange Street      Main St
2  2     Anton Blvd         <NA>
3  3 Jefferson Pkwy   Apple Blvd
4 NA                   Orange St
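Because df$address is just a vector, the same call works on any character vector. The sketch below uses a hypothetical vector named streets to show the difference between str_replace(), which changes only the first match in each string, and str_replace_all(), which changes every match.

```r
library(stringr)

# Hypothetical vector; NA is included to show that missing values stay missing
streets <- c("St Johns St", "Anton Blvd", NA)

# Only the first "St" in each element is replaced
first_only <- str_replace(streets, "St", "Street")

# Every "St" in each element is replaced
all_matches <- str_replace_all(streets, "St", "Street")

first_only
all_matches
```

For the first element, first_only is "Street Johns St" while all_matches is "Street Johns Street".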

3.2. Replace Multiple Strings with Other Multiple Strings

Use the str_replace_all() function from the stringr package to replace multiple strings with multiple replacement strings on a single column in R, updating part of each string. The following example passes a named vector, created with c(), that maps each value to be replaced to its replacement, and applies it to the work_address column.


# Replace multiple strings at a time
rep_str = c('St'='Street','Blvd'='Boulevard','Pkwy'='Parkway')
df$work_address <- str_replace_all(df$work_address, rep_str)
df

Yields below output. This example updates part of a string with another string; similarly, you can use it to update the entire string in a column.


# Output
  id        address    work_address
1  1  Orange Street     Main Street
2  2     Anton Blvd            <NA>
3  3 Jefferson Pkwy Apple Boulevard
4 NA                  Orange Street
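One caveat with a mapping like the one above: the names are regular expressions that match anywhere in the string, so St also matches the start of Street. A possible way to guard against this, sketched here with a hypothetical mapping named rep_exact and a hypothetical vector named addresses, is to wrap each pattern in \\b word boundaries so that only whole words are replaced.

```r
library(stringr)

# Hypothetical mapping: \\b restricts each pattern to a whole word
rep_exact <- c("\\bSt\\b" = "Street",
               "\\bBlvd\\b" = "Boulevard",
               "\\bPkwy\\b" = "Parkway")

# Hypothetical input that mixes abbreviated and already expanded values
addresses <- c("Main St", "Main Street", "Apple Blvd", NA)

# The plain pattern rewrites the "St" inside "Street" as well
naive <- str_replace_all("Main Street", c("St" = "Street"))

# The word-boundary patterns leave "Main Street" unchanged
exact <- str_replace_all(addresses, rep_exact)

naive
exact
```

This matters when the replacement might run more than once on the same column, since the plain pattern turns Main Street into Main Streetreet.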

4. Using dplyr package to Replace String

To use dplyr functions, load the package using library("dplyr"). If you don’t have this package, install it using install.packages("dplyr"). On larger data sets, dplyr functions are often faster than the equivalent base R code, partly because parts of dplyr are implemented in C++; the actual gain depends on the data and the operation.

For more examples of using dplyr refer to Using mutate() from dplyr to replace column values in R.

4.1 Replace String with Another String

Let’s use the mutate() function from the dplyr package to replace column values in R. The following example replaces the string Street with St on the address column. Note that str_replace() comes from stringr, so that package must be loaded as well.


# Load dplyr package
library('dplyr')

# Replace on address column
df <- df %>% 
  mutate(address = str_replace(address, "Street", "St"))
df

Yields below output. Here, %>% is an infix operator that acts as a pipe: it passes the left-hand side of the operator as the first argument of the function on the right-hand side.


# Output
  id        address    work_address
1  1      Orange St     Main Street
2  2     Anton Blvd            <NA>
3  3 Jefferson Pkwy Apple Boulevard
4 NA                  Orange Street
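To make the pipe behaviour concrete, the short sketch below compares a piped call with the equivalent direct call on a single string. The variable names piped and direct are chosen only for this illustration.

```r
library(dplyr)    # makes the %>% pipe available
library(stringr)  # provides str_replace()

# The left-hand side becomes the first argument of str_replace()
piped <- "Orange Street" %>% str_replace("Street", "St")

# The same call written without the pipe
direct <- str_replace("Orange Street", "Street", "St")

identical(piped, direct)
```

Both forms produce the same value, so the pipe is purely a matter of readability when several steps are chained.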

4.2 Replace String on All Columns

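Similarly, you can use the mutate_all() function from the dplyr package to replace a string with another string on all columns in R. Because str_replace() returns character vectors, every column, including the numeric id column, is converted to character.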


# Load dplyr package
library('dplyr')

# Replace String with Another String on All Columns
# The ~ lambda is applied to every column; . is the current column
df <- df %>%
  mutate_all(~ str_replace(., "Orange", "Alton"))
df

Yields below output.


# Output
    id        address    work_address
1    1       Alton St     Main Street
2    2     Anton Blvd            <NA>
3    3 Jefferson Pkwy Apple Boulevard
4 <NA>                   Alton Street
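If converting id to character is not wanted, one option is to limit the replacement to character columns. The sketch below assumes dplyr 1.0.0 or later, where across() and where() are available, and uses a hypothetical data frame named df_chr that mirrors the state of df before this step.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame mirroring df before the replacement
# stringsAsFactors = FALSE keeps strings as character on R versions before 4.0
df_chr <- data.frame(
  id = c(1, 2, 3, NA),
  address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
  work_address = c("Main Street", NA, "Apple Boulevard", "Orange Street"),
  stringsAsFactors = FALSE
)

# across(where(is.character), ...) touches only character columns,
# so the numeric id column keeps its type
df_chr <- df_chr %>%
  mutate(across(where(is.character), ~ str_replace(.x, "Orange", "Alton")))
df_chr
```

In current dplyr releases, across() is the recommended replacement for the mutate_all() and mutate_at() family, which still works but is superseded.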

4.3 Replace String on Selected Columns

mutate_at() takes a vector of the columns you want to update and applies the function ~ ifelse(. == 'Alton St', 'Orange St', .) to each of them; here, . stands for the column currently being processed. The function is written as a ~ lambda, the form dplyr recommends now that funs() is deprecated.

Similarly, you can pass column index positions instead of names to select the columns by index.


# Replace on Selected Columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ ifelse(. == 'Alton St', 'Orange St', .))
df
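Selecting by name and selecting by position can be compared side by side. The sketch below uses across() from dplyr 1.0.0 or later rather than mutate_at(), and a hypothetical data frame named df_sel; positions 2 and 3 are assumed to correspond to address and work_address.

```r
library(dplyr)

# Hypothetical data frame mirroring df before this step
# stringsAsFactors = FALSE keeps strings as character on R versions before 4.0
df_sel <- data.frame(
  id = c(1, 2, 3, NA),
  address = c("Alton St", "Anton Blvd", "Jefferson Pkwy", ""),
  work_address = c("Main Street", NA, "Apple Boulevard", "Alton Street"),
  stringsAsFactors = FALSE
)

# Select the columns by name
by_name <- df_sel %>%
  mutate(across(c(address, work_address),
                ~ ifelse(.x == "Alton St", "Orange St", .x)))

# Select the same columns by position
by_position <- df_sel %>%
  mutate(across(2:3, ~ ifelse(.x == "Alton St", "Orange St", .x)))

identical(by_name, by_position)
```

Note that == compares the whole value, so Alton Street in work_address is left unchanged; only the cell that is exactly Alton St becomes Orange St.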

4.4 Replace Substring on Selected Columns

Finally, let’s see how to replace substring with another string on selected columns.


# Replace substring on selected columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "Alton", "Orange"))
df

5. Complete Example


# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
   address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
   work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))

df

# Example 1
# Replace Column Value with Another String
df$address[df$address == 'Orange St'] <- 'Portola Pkwy'
df

# Example 2
# Replace String with Another String on All Columns
df[df=="Portola Pkwy"] <- "Orange St"
df

# Example 3
# Replace String with another String
library(stringr)
df$address <- str_replace(df$address, "St", "Street")
df

# Example 4
# Replace multiple strings at a time
rep_str = c('St'='Street','Blvd'='Boulevard','Pkwy'='Parkway')
df$work_address <- str_replace_all(df$work_address, rep_str)
df

# Load dplyr package
library('dplyr')

# Example 5
# Replace String with Another String
df <- df %>% 
  mutate(address = str_replace(address, "Street", "St"))
df

# Example 6
# Replace String with Another String on All Columns
df <- df %>%
  mutate_all(~ str_replace(., "Orange", "Alton"))
df

# Example 7
# Replace on Selected Columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ ifelse(. == 'Alton St', 'Orange St', .))
df

# Example 8
# Replace substring on selected columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "Alton", "Orange"))
df

6. Conclusion

In this article, you have learned different ways to replace a string with another string on a single column and on all columns, and how to replace part of a string, with examples.


Naveen Nelamali
