One of the most common tasks when working with data in an R DataFrame is curating it, and a frequent curation rule is to replace one string with another or to replace part of a string (a substring) in a column. In this article, I will explain how to replace a string with another string on a single column, on multiple columns, and by condition. The approaches shown work on both DataFrame columns and plain vectors.
Using some of the approaches explained in this article, you can also replace NA with an empty string in an R DataFrame. Let’s create an R DataFrame and explore examples and output.
# Create DataFrame
# (on R versions before 4.0, add stringsAsFactors = FALSE to keep the text columns as character)
df <- data.frame(id=c(1,2,3,NA),
                 address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
                 work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))
df
# Output
#   id        address work_address
# 1  1      Orange St      Main St
# 2  2     Anton Blvd         <NA>
# 3  3 Jefferson Pkwy   Apple Blvd
# 4 NA                Portola Pkwy
1. Replace String with Another String On Single Column
The example below replaces a string with another string on a selected column by checking a condition; only the rows that satisfy the condition are updated. To update a specific column, select it with df$column_name.
# Replace String with Another String
df$address[df$address == 'Orange St'] <- 'Portola Pkwy'
df
Yields below output. Here, df$address is a vector; note that every column in a DataFrame is a vector.
# Output
  id        address work_address
1  1   Portola Pkwy      Main St
2  2     Anton Blvd         <NA>
3  3 Jefferson Pkwy   Apple Blvd
4 NA                Portola Pkwy
Similarly, you can also replace column values based on multiple conditions.
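As a minimal sketch of replacing by multiple conditions, the snippet below combines two tests with `&` and uses `%in%` to match several values at once. The data frame `df2` and the replacement value 'Unknown' are made up for illustration and are not part of the article's running example.

```r
# Hypothetical copy of the article's data, kept separate from df
df2 <- data.frame(id = c(1, 2, 3, NA),
                  address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                  work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'),
                  stringsAsFactors = FALSE)

# Condition on two columns; which() drops any row where the test is NA
rows <- which(df2$id == 1 & df2$address == 'Orange St')
df2$address[rows] <- 'Portola Pkwy'

# Match any of several values with %in% (it never returns NA)
df2$work_address[df2$work_address %in% c('Main St', 'Apple Blvd')] <- 'Unknown'
df2
```

Wrapping the condition in `which()` is a design choice: it keeps the assignment safe when a compared column contains NA.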
2. Replace String with Another String on All Columns
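Now let’s replace a string with another string on all columns of the R DataFrame. The following example replaces all instances of Portola Pkwy with Orange St across every column of the DataFrame. Because the comparison uses ==, only cells whose entire value equals the string are updated.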
# Replace String with Another String on All Columns
df[df=="Portola Pkwy"] <- "Orange St"
df
Yields below output.
# Output
  id        address work_address
1  1      Orange St      Main St
2  2     Anton Blvd         <NA>
3  3 Jefferson Pkwy   Apple Blvd
4 NA                   Orange St
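To see why the all-columns replacement works, note that comparing a data frame with a string returns a logical matrix of the same shape, and the assignment touches only the cells that are TRUE. The sketch below uses a hypothetical copy `df3`. It also applies the same indexing idea to turn NA into an empty string, restricted to the character columns so that the numeric id column keeps its type.

```r
# Hypothetical copy of the data, kept separate from df
df3 <- data.frame(id = c(1, 2, 3, NA),
                  address = c('Portola Pkwy', 'Anton Blvd', 'Jefferson Pkwy', ''),
                  work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'),
                  stringsAsFactors = FALSE)

# Logical matrix: TRUE where a cell equals the search string, NA where the cell is NA
hits <- df3 == 'Portola Pkwy'
hits

# Replace every TRUE cell; NA entries in the mask are left alone
df3[hits] <- 'Orange St'

# Replace NA with an empty string, only in the character columns
chr_cols <- sapply(df3, is.character)
df3[chr_cols][is.na(df3[chr_cols])] <- ''
df3
```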
3. Using stringr Package to Replace Part of String (Substring)
To use the functions from the stringr package, first load it with library("stringr"). If you don’t have the package, install it with install.packages("stringr"). The stringr package provides a set of functions that make working with strings as easy as possible.
3.1 str_replace()
Use the str_replace() function from the stringr package to replace part of a column string with another string in an R DataFrame. I have created a dedicated R article on str_replace() where I cover syntax, usage, and several examples.
# Replace String with another String
library(stringr)
df$address <- str_replace(df$address, "St", "Street")
print(df)
Yields below output. The example above replaces the string St with Street on the address column. Here, df$address is a vector.
# Output
  id        address work_address
1  1  Orange Street      Main St
2  2     Anton Blvd         <NA>
3  3 Jefferson Pkwy   Apple Blvd
4 NA                   Orange St
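Two details of str_replace() are worth keeping in mind: it replaces only the first match in each string, and the pattern is treated as a regular expression, so a bare "St" also matches inside a longer word. A small sketch on a made-up vector `x` (not part of the article's data):

```r
library(stringr)

# Hypothetical vector for illustration
x <- c('Stone St', 'Main St', NA)

# Only the first match per string is replaced, and "St" matches inside "Stone"
first_only <- str_replace(x, 'St', 'Street')
first_only   # "Streetone St" "Main Street" NA

# A word-boundary regex restricts the match to the standalone word "St"
whole_word <- str_replace(x, '\\bSt\\b', 'Street')
whole_word   # "Stone Street" "Main Street" NA

# str_replace_all() replaces every match in each string
every_match <- str_replace_all(x, 'St', 'Street')
every_match  # "Streetone Street" "Main Street" NA
```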
3.2 Replace Multiple Strings with Other Multiple Strings
Use the str_replace_all() function of the stringr package to replace multiple string values with other strings on a single column in R, updating part of a string with another string. The following example passes a named vector created with c(), which maps each value to be replaced to its replacement, and applies it to the work_address column.
# Replace multiple strings at a time
rep_str = c('St'='Street','Blvd'='Boulevard','Pkwy'='Parkway')
df$work_address <- str_replace_all(df$work_address, rep_str)
df
Yields below output. This example updates part of a string with another string; similarly, you can also use it to update the entire string in a column.
# Output
  id        address    work_address
1  1  Orange Street     Main Street
2  2     Anton Blvd            <NA>
3  3 Jefferson Pkwy Apple Boulevard
4 NA                  Orange Street
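To replace an entire value rather than a substring with the same function, one option is to anchor each pattern with `^` and `$` so that it matches only when the whole string equals it. The sketch below assumes a made-up vector `w` and a hypothetical mapping `whole`.

```r
library(stringr)

# Hypothetical vector for illustration
w <- c('Main St', 'St', 'Blvd', NA)

# Named vector: names are regex patterns, values are the replacements.
# The anchors ^ and $ make each pattern match whole strings only.
whole <- c('^St$' = 'Street', '^Blvd$' = 'Boulevard')

replaced <- str_replace_all(w, whole)
replaced  # "Main St" "Street" "Boulevard" NA
```

'Main St' is left unchanged because the anchored pattern does not match a substring.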
4. Using dplyr package to Replace String
To use its functions, load the package with library("dplyr"). If you don’t have the package, install it with install.packages("dplyr"). For bigger data sets, the dplyr functions are a good choice: parts of the package are implemented in C++, and this article's rough estimate is that they run about 30% faster, although the actual gain depends on the data and the operation.
For more examples of using dplyr, refer to Using mutate() from dplyr to replace column values in R.
4.1 Replace String with Another String
Let’s use the mutate() function from the dplyr package to replace column values in R. The following example replaces the string Street with St on the address column. Note that str_replace() comes from stringr, so that package must be loaded as well.
# Load dplyr package
library('dplyr')
# Replace on address column
df <- df %>%
mutate(address = str_replace(address, "Street", "St"))
df
Yields below output. Here, %>% is an infix operator that acts as a pipe: it passes the left-hand side of the operator as the first argument of the function on the right-hand side.
# Output
  id        address    work_address
1  1      Orange St     Main Street
2  2     Anton Blvd            <NA>
3  3 Jefferson Pkwy Apple Boulevard
4 NA                  Orange Street
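To make the pipe concrete, the two calls below are equivalent: the piped form is simply the nested call written left to right. This is a minimal sketch on a hypothetical data frame `d`, with both dplyr and stringr loaded.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame for illustration
d <- data.frame(address = c('Orange Street', 'Anton Blvd'),
                stringsAsFactors = FALSE)

# Piped form: d becomes the first argument of mutate()
piped <- d %>% mutate(address = str_replace(address, 'Street', 'St'))

# Same call without the pipe
nested <- mutate(d, address = str_replace(address, 'Street', 'St'))

identical(piped, nested)
```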
4.2 Replace String on All Columns
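Similarly, you can also use the mutate_all() function from the dplyr package to replace a string with another string on all columns in R. The funs() helper has been deprecated since dplyr 0.8.0, so the example passes a formula-style lambda (~) instead, in which . stands for the current column. In dplyr 1.0.0 and later, mutate_all() is superseded by across(), but it still works.
# Load dplyr package
library('dplyr')
# Replace String with Another String on All Columns
df <- df %>%
  mutate_all(~ str_replace(., "Orange", "Alton"))
df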
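Yields below output. Note that str_replace() returns character vectors, so the numeric id column is converted to character and its missing value is printed as <NA>.
# Output
    id        address    work_address
1    1       Alton St     Main Street
2    2     Anton Blvd            <NA>
3    3 Jefferson Pkwy Apple Boulevard
4 <NA>                   Alton Street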
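One way to keep non-character columns such as id untouched, assuming dplyr 1.0.0 or later, is across() combined with where(is.character). The sketch below runs on a hypothetical copy `df4` rather than on the article's df.

```r
library(dplyr)
library(stringr)

# Hypothetical copy of the data at this point in the article
df4 <- data.frame(id = c(1, 2, 3, NA),
                  address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                  work_address = c('Main Street', NA, 'Apple Boulevard', 'Orange Street'),
                  stringsAsFactors = FALSE)

# Apply the replacement only to character columns; id stays numeric
df4 <- df4 %>%
  mutate(across(where(is.character), ~ str_replace(.x, 'Orange', 'Alton')))

str(df4)
```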
4.3 Replace String on Selected Columns
mutate_at() takes a vector of the columns you want to update and applies the function ~ ifelse(. == 'Alton St', 'Orange St', .) to each of them; . stands for the matched column here. Older code wraps the function in funs(), which has been deprecated since dplyr 0.8.0.
Similarly, you can also pass column index positions instead of names to select the columns to update.
# Replace on Selected Columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ ifelse(. == 'Alton St', 'Orange St', .))
df
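Selecting by position works the same way, since mutate_at() also accepts a numeric vector of column positions. A minimal sketch on a hypothetical copy `df5`, where positions 2 and 3 are the address and work_address columns:

```r
library(dplyr)

# Hypothetical copy of the data at this point in the article
df5 <- data.frame(id = c(1, 2, 3, NA),
                  address = c('Alton St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                  work_address = c('Main Street', NA, 'Apple Boulevard', 'Alton Street'),
                  stringsAsFactors = FALSE)

# Columns 2 and 3 are selected by index instead of by name
df5 <- df5 %>%
  mutate_at(c(2, 3), ~ ifelse(. == 'Alton St', 'Orange St', .))
df5
```

Only cells whose whole value equals 'Alton St' change, so 'Alton Street' in work_address is left as it is.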
4.4 Replace Substring on Selected Columns
Finally, let’s see how to replace a substring with another string on selected columns.
# Replace substring on selected columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "Alton", "Orange"))
df
5. Complete Example
# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))
df
# Example 1
# Replace Column Value with Another String
df$address[df$address == 'Orange St'] <- 'Portola Pkwy'
df
# Example 2
# Replace String with Another String on All Columns
df[df=="Portola Pkwy"] <- "Orange St"
df
# Example 3
# Replace String with another String
library(stringr)
df$address <- str_replace(df$address, "St", "Street")
df
# Example 4
# Replace multiple strings at a time
rep_str = c('St'='Street','Blvd'='Boulevard','Pkwy'='Parkway')
df$work_address <- str_replace_all(df$work_address, rep_str)
df
# Load dplyr package
library('dplyr')
# Example 5
# Replace String with Another String
df <- df %>%
mutate(address = str_replace(address, "Street", "St"))
df
# Example 6
# Replace String with Another String on All Columns
df <- df %>%
  mutate_all(~ str_replace(., "Orange", "Alton"))
df
# Example 7
# Replace on Selected Columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ ifelse(. == 'Alton St', 'Orange St', .))
df
# Example 8
# Replace substring on selected columns
df <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "Alton", "Orange"))
df
6. Conclusion
In this article, you have learned different ways to replace a string with another string on a single column and on all columns, and how to replace part of a string, with examples.