Use mutate() and its variants mutate_all(), mutate_if(), and mutate_at() from the R dplyr package to replace or update the values of a column (string, integer, or any other type) in a DataFrame (data.frame). For more methods of this package, refer to the R dplyr tutorial.
dplyr is a third-party package, so you need to load it with library("dplyr") before using its methods. If you don't have the package, install it with install.packages("dplyr").
For bigger data sets, it is best to use the methods from the dplyr package, which can replace column values faster (roughly 30%, although the actual gain depends on the data and the operation). The dplyr package uses C++ code for evaluation.
Let's create an R DataFrame, run these examples, and explore the output. If you already have data in a CSV file, you can easily import it into an R DataFrame. Also, refer to Import Excel File into R.
# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
                 address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
                 work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))
df
# Output
#  id        address work_address
#1  1      Orange St      Main St
#2  2     Anton Blvd         <NA>
#3  3 Jefferson Pkwy   Apple Blvd
#4 NA                Portola Pkwy
1. Replace using dplyr mutate() – Update on Selected Column
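Use the mutate() method from the dplyr package to replace a column value in an R DataFrame. The following example replaces the first occurrence of St with Street in each value of the address column. str_replace() comes from the stringr package, so load that library as well.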
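library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected column
# The result is stored in df2 so that df stays unchanged for the later examples
df2 <- df %>%
  mutate(address = str_replace(address, "St", "Street"))
df2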
Here, %>% is an infix operator that acts as a pipe: it passes the left-hand side of the operator as the first argument of the function on the right-hand side.
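To make the pipe behavior concrete, here is a minimal sketch that compares the piped call with the equivalent direct call. The data frame demo and the result names piped and direct are illustrative names chosen for this sketch, not part of the examples above.

```r
library(dplyr)
library(stringr)

# Small illustrative data frame (hypothetical, not the article's df)
demo <- data.frame(address = c("Orange St", "Anton Blvd"),
                   stringsAsFactors = FALSE)

# Piped form: demo is passed as the first argument of mutate()
piped <- demo %>%
  mutate(address = str_replace(address, "St", "Street"))

# Direct form: the same call written without the pipe
direct <- mutate(demo, address = str_replace(address, "St", "Street"))

# Both forms should produce the same data frame
identical(piped, direct)
```

The piped form tends to read better once several steps are chained, since each step appears in the order it runs.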
2. Replace using dplyr mutate_all() – Update All Columns
Use mutate_all() from the dplyr package to change values in all columns. The following example replaces St with Street in all columns. Since St appears in the address and work_address columns, these two get updated. Note that str_replace() returns character vectors, so the numeric id column is converted to character.
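library("dplyr")
library("stringr")  # provides str_replace()
# Replace on all columns
# A lambda (~) is used because funs() is deprecated since dplyr 0.8.0
df2 <- df %>%
  mutate_all(~ str_replace(., "St", "Street"))
df2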
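In dplyr 1.0.0 and later, the scoped verbs such as mutate_all() are superseded by across(). The following is a sketch of the same replacement written with across(), assuming dplyr 1.0.0 or newer. The names addr_df, all_cols, and chr_only are chosen for this sketch only. It also shows one way to keep the numeric id column from being converted to character by restricting the replacement to character columns.

```r
library(dplyr)
library(stringr)

# Illustrative copy of the article's data (addr_df is a name chosen here)
addr_df <- data.frame(
  id = c(1, 2, 3, NA),
  address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
  work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
  stringsAsFactors = FALSE
)

# across(everything(), ...) applies the lambda to every column
all_cols <- addr_df %>%
  mutate(across(everything(), ~ str_replace(.x, "St", "Street")))
class(all_cols$id)   # id has become character

# across(where(is.character), ...) leaves the numeric id column alone
chr_only <- addr_df %>%
  mutate(across(where(is.character), ~ str_replace(.x, "St", "Street")))
class(chr_only$id)   # id is still numeric
```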
3. Replace using dplyr mutate_if() – Update On All Numeric Columns
Use mutate_if() to update column values conditionally. The following example replaces NA with 0 in all numeric columns; is.numeric selects only the numeric columns. replace_na() comes from the tidyr package.
library("dplyr")
library("tidyr")  # provides replace_na()
# Replace only on numeric columns
df2 <- df %>%
  mutate_if(is.numeric, ~ replace_na(., 0))
df2
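The same conditional update can be sketched with across() and where(), assuming dplyr 1.0.0 or newer. The names addr_df and filled are illustrative and not part of the original example.

```r
library(dplyr)
library(tidyr)

# Illustrative copy of the article's data (addr_df is a name chosen here)
addr_df <- data.frame(
  id = c(1, 2, 3, NA),
  address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
  work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
  stringsAsFactors = FALSE
)

# where(is.numeric) selects the numeric columns; replace_na() fills their NAs
filled <- addr_df %>%
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))

filled$id            # the NA in id is replaced by 0
filled$work_address  # the character NA is left as is
```

Because the predicate only matches numeric columns, the NA in work_address is not touched.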
4. Replace using dplyr mutate_at() – Update on Multiple Columns
The mutate_at() method is used to update multiple columns selected by name. The following example updates the address and work_address columns.
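library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected columns
# A lambda (~) is used because funs() is deprecated since dplyr 0.8.0
df2 <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "St", "Street"))
df2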
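The str_replace() calls in these examples replace only the first match in each string. If a value can contain the pattern more than once, stringr's str_replace_all() may be the better fit. The sketch below uses made-up strings (x and y are illustrative) to show the difference, plus a word-boundary pattern that avoids matching St inside a longer word.

```r
library(stringr)

x <- "St Paul St"

first_only <- str_replace(x, "St", "Street")      # replaces the first match only
every_match <- str_replace_all(x, "St", "Street") # replaces every match
first_only
every_match

# \\b marks a word boundary, so the "St" in "Stanley" is not replaced
y <- "Stanley St"
whole_word <- str_replace_all(y, "\\bSt\\b", "Street")
whole_word
```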
5. Replace using dplyr mutate_at() – Update on Selected Column Index Position
Similarly, you can also use the mutate_at() method to select multiple columns by position index and replace the specified values. The following example updates columns 2 and 3, which are the address and work_address columns.
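library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected columns by index
# A lambda (~) is used because funs() is deprecated since dplyr 0.8.0
df2 <- df %>%
  mutate_at(c(2,3), ~ str_replace(., "St", "Street"))
df2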
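With dplyr 1.0.0 or newer, selecting columns by name or by position can also be sketched with across(). The names addr_df, by_name, and by_pos are illustrative. The check at the end confirms that both selections give the same result for this data.

```r
library(dplyr)
library(stringr)

# Illustrative copy of the article's data (addr_df is a name chosen here)
addr_df <- data.frame(
  id = c(1, 2, 3, NA),
  address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
  work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
  stringsAsFactors = FALSE
)

# Select by name; all_of() takes a character vector of column names
by_name <- addr_df %>%
  mutate(across(all_of(c("address", "work_address")),
                ~ str_replace(.x, "St", "Street")))

# Select by position; columns 2 and 3 are address and work_address
by_pos <- addr_df %>%
  mutate(across(c(2, 3), ~ str_replace(.x, "St", "Street")))

identical(by_name, by_pos)
```

Selecting by name is usually more robust than selecting by position, because positions shift when columns are added or reordered.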
6. Complete Example
The following is a complete example of using mutate(), mutate_all(), mutate_if(), and mutate_at() from dplyr to replace or change the column values in an R DataFrame.
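library("dplyr")
library("stringr")  # provides str_replace()
library("tidyr")    # provides replace_na()

# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
                 address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
                 work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))
df

# Each example starts from the original df and stores its result in df2,
# so the replacements do not stack up (for example "Streetreet")

# Replace on selected column
df2 <- df %>%
  mutate(address = str_replace(address, "St", "Street"))
df2

# Replace on selected columns by name
df2 <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "St", "Street"))
df2

# Replace on selected columns by index
df2 <- df %>%
  mutate_at(c(2,3), ~ str_replace(., "St", "Street"))
df2

# Replace on all columns (this also converts id to character)
df2 <- df %>%
  mutate_all(~ str_replace(., "St", "Street"))
df2

# Replace only on numeric columns
df2 <- df %>%
  mutate_if(is.numeric, ~ replace_na(., 0))
df2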
7. Conclusion
In this article, you have learned how to use methods from the dplyr package to replace or update values in an R DataFrame. dplyr is a third-party package, so you need to load it with library("dplyr") before using its methods. If you don't have the package, install it with install.packages("dplyr").