Remove Character From String in R

How do you remove a single character or multiple characters from a string in R? You can use the base R function gsub(), or the str_replace() family from the stringr package. Note that str_replace() replaces only the first match, so use str_replace_all() when every occurrence should be removed.

In this article, I will explain how to remove a single character or multiple characters from a string in R by using the gsub() and str_replace() functions.
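The examples in the sections below all use gsub(), so here is a minimal sketch of the stringr route for comparison. It assumes the stringr package is installed (the requireNamespace() guard simply skips the example if it is not); the variable names are only illustrative.

```r
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"

# Run only when stringr is available (assumption: it may not be installed)
if (requireNamespace("stringr", quietly = TRUE)) {
  # str_replace() replaces the first match only
  first_only <- stringr::str_replace(address_str, ",", "")
  # str_replace_all() replaces every match
  all_commas <- stringr::str_replace_all(address_str, ",", "")
  # str_remove_all() is shorthand for replacing every match with ""
  removed <- stringr::str_remove_all(address_str, ",")

  print(first_only)
  print(all_commas)
  print(removed)
}
```

If you only need base R, gsub() covers the same cases, as the remaining sections show.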

1. Remove Specific Character from String

Use the gsub() function to remove a character from a string in R. This base R function takes three main arguments: first, the pattern to look for (a regular expression by default); second, the replacement value, which in our case is an empty string; and third, the input string to search. gsub() replaces every match and returns a new string, so assign the result to a variable.


# Remove Single Character
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"
new_str <- gsub(',','',address_str)
print(new_str)

# Output
# [1] "127 Anton Blvd Apt #7 - Wilmington DE"
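Two details are worth knowing when you remove a single character: sub() removes only the first match, and the pattern is a regular expression unless you pass fixed = TRUE. The following sketch illustrates both; version_str is a made-up value used only for this example.

```r
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"

# sub() removes only the first comma; gsub() removes all of them
first_only <- sub(",", "", address_str)
all_commas <- gsub(",", "", address_str)
print(first_only)
print(all_commas)

# A character like "." is special in a regular expression
version_str <- "v1.2.3"
# fixed = TRUE treats the pattern as literal text, so only periods go
no_dots <- gsub(".", "", version_str, fixed = TRUE)
# Without fixed = TRUE, "." matches any character and everything is removed
regex_dots <- gsub(".", "", version_str)
print(no_dots)
print(regex_dots)
```

When the character you want to remove has a special meaning in regular expressions (for example ., *, +, ( or $), fixed = TRUE is usually the simplest way to match it literally.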

2. Remove Multiple Characters from String

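The gsub() function can also remove multiple characters from a string. Wrap the characters in [] to form a character class, which matches any single character listed inside it. Note that a character class works on individual characters, not on whole words: the example below removes every A, n, t, o, and p from the string, not just the words Anton and Apt, which is why Wilmington becomes Wilmig in the output.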


# Remove Multiple Characters
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"
new_str <- gsub('[AntonApt]','',address_str)
print(new_str)

# Output
# [1] "127  Blvd,  #7 - Wilmig, DE"
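A character class such as [AntonApt] matches single characters, so it also strips letters from other words (Wilmington becomes Wilmig in the output above). If the goal is to remove the whole words Anton and Apt, one option is to separate them with the regex alternation operator |, as in this minimal sketch.

```r
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"

# "Anton|Apt" matches either whole substring, not individual letters
words_removed <- gsub("Anton|Apt", "", address_str)
print(words_removed)

# Removing a word leaves its surrounding spaces behind;
# collapse runs of spaces into a single space if that matters
tidy_str <- gsub(" +", " ", words_removed)
print(tidy_str)
```

Here Wilmington is left intact because neither alternative matches inside it.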

3. Remove Special Characters from String

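To remove all special characters, pass the pattern [^[:alnum:] ] to the gsub() function. Inside a character class, a leading ^ negates the class, so this pattern matches any character that is not a letter, a digit, or a space. The following example removes all such special characters from the string.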


# Remove Special Characters
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"
new_str <- gsub('[^[:alnum:] ]','',address_str)
print(new_str)

# Output
# [1] "127 Anton Blvd Apt 7  Wilmington DE"
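Because gsub() is vectorized over its input, the same pattern can clean a whole character vector or data frame column in one call. The sketch below uses a small made-up data frame (the column names and values are hypothetical) and also shows how to keep one special character by adding it to the negated class.

```r
# Hypothetical data for illustration only
df <- data.frame(
  id = c(1, 2),
  address = c("127 Anton Blvd, Apt #7", "45 Main St. (Rear)"),
  stringsAsFactors = FALSE
)

# Remove everything that is not a letter, digit, or space in every row
df$address_clean <- gsub("[^[:alnum:] ]", "", df$address)
print(df$address_clean)

# Keep "#" as well by listing it inside the negated class
df$address_keep_hash <- gsub("[^[:alnum:] #]", "", df$address)
print(df$address_keep_hash)
```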

4. Remove Spaces in a String

Sometimes you need to remove spaces, tabs, or newline characters from a string. The examples below show how to do both.


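# Remove spaces from a string
address_str <- "127 Anton Blvd, Apt #7 - Wilmington, DE"
new_str <- gsub(' ','',address_str)
print(new_str)

# Output
# [1] "127AntonBlvd,Apt#7-Wilmington,DE"

# Remove tab and newline characters
tab_str <- "127 Anton Blvd,\tApt #7\nWilmington, DE"
new_str <- gsub('[\t\n]','',tab_str)
print(new_str)

# Output
# [1] "127 Anton Blvd,Apt #7Wilmington, DE"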
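In practice you often want to treat all whitespace the same way, or to remove it only at the ends of a string. The following sketch uses the [[:space:]] class and the base R function trimws(); messy_str is a made-up input for illustration.

```r
messy_str <- "  127 Anton Blvd,\tApt #7\n"

# trimws() removes leading and trailing whitespace only
trimmed <- trimws(messy_str)
print(trimmed)

# [[:space:]] matches spaces, tabs, and newlines, so this removes them all
no_ws <- gsub("[[:space:]]", "", messy_str)
print(no_ws)

# Collapse each run of whitespace into one space, then trim the ends
collapsed <- trimws(gsub("[[:space:]]+", " ", messy_str))
print(collapsed)
```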

Conclusion

In this article, I have explained how to remove a single character or multiple characters from a string in R using gsub(). You have also learned how to remove special characters, spaces, tabs, and newline characters.


Naveen (NNK)

