R dplyr mutate() – Replace Column Values

Use mutate() and its scoped variants mutate_all(), mutate_if(), and mutate_at() from the R dplyr package to replace or update the values of a column (string, integer, or any other type) in a data frame (data.frame). For more methods of this package, refer to the R dplyr tutorial.

To access the methods of the dplyr package, you need to load it with library("dplyr"). If you haven’t installed the package yet, you can do so by running install.packages("dplyr").

For larger data sets, the dplyr functions are a good choice for replacing column values. Parts of the package are implemented in compiled C++ code, so they can run faster than equivalent base R code, although the actual gain depends on the data and the operation.

Let’s create an R DataFrame, execute these examples, and examine the result.


# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
         address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
         work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))

df

# Output
#  id        address work_address
#1  1      Orange St      Main St
#2  2     Anton Blvd         <NA>
#3  3 Jefferson Pkwy   Apple Blvd
#4 NA                Portola Pkwy
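
Several of the later examples choose columns by type, so it can help to confirm what data.frame() actually produced. The following is a minimal sketch; it assumes R 4.0 or later, where strings stay character instead of becoming factors by default.

```r
# Rebuild the sample data frame from above
df <- data.frame(id = c(1, 2, 3, NA),
                 address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                 work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'))

# Class of each column; only id is numeric
col_classes <- sapply(df, class)
col_classes

# Number of missing values per column (the empty string '' is not NA)
na_counts <- colSums(is.na(df))
na_counts
```

Note that the empty address in row 4 is an empty string, not NA, so functions that target NA will leave it alone.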

1. Replace using dplyr mutate() – Update on Selected Column

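Use the mutate() function from the dplyr package to replace values in a data frame column. The following example replaces St with Street in the address column. str_replace() comes from the stringr package, so load that as well. It replaces only the first match in each string; use str_replace_all() to replace every match.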


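library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected column
# \\b marks a word boundary, so an already expanded "Street" is not matched again
df <- df %>%
  mutate(address = str_replace(address, "\\bSt\\b", "Street"))
df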

Here, %>% is an infix operator that acts as a pipe: it passes the value on its left-hand side as the first argument to the function call on its right-hand side.
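
To make the pipe behaviour concrete, the small sketch below compares a piped call chain with the equivalent nested call. The vector x is made up for illustration and is not part of the article's data.

```r
library(dplyr)  # re-exports the %>% pipe from magrittr

x <- c(4, 9, 16)

# Piped form: x becomes the first argument of sqrt(), and that result
# becomes the first argument of sum()
piped <- x %>% sqrt() %>% sum()

# Equivalent nested form
nested <- sum(sqrt(x))

identical(piped, nested)  # TRUE
```

R 4.1 and later also ship a native pipe, |>, which behaves the same way for simple calls like these.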

2. Replace using dplyr mutate_all() – Update All Columns

Use mutate_all() from the dplyr package to change values in all columns. The following example replaces St with Street in every column; since St appears in both the address and work_address columns, these two get updated. Note that str_replace() returns character vectors, so the numeric id column is converted to character as a side effect.


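library("dplyr")
library("stringr")  # provides str_replace()
# Replace on all columns
# funs() is deprecated since dplyr 0.8.0, so a lambda (~) is used instead
df <- df %>%
  mutate_all(~ str_replace(., "\\bSt\\b", "Street"))
df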
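
In dplyr 1.0.0 and later, mutate_all() and the other scoped variants are superseded by across() used inside mutate(). A rough equivalent of the example above might look like the sketch below. It rebuilds the sample data so it runs on its own, and it uses a word-boundary pattern (an assumption added here) so that an existing Street is not matched again.

```r
library(dplyr)
library(stringr)

df <- data.frame(id = c(1, 2, 3, NA),
                 address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                 work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'))

# across(everything(), ...) applies the lambda to every column
out <- df %>%
  mutate(across(everything(), ~ str_replace(.x, "\\bSt\\b", "Street")))

out

# str_replace() returns character vectors, so id is no longer numeric
sapply(out, class)
```

If the id column should stay numeric, restrict the selection to character columns, for example with where(is.character) instead of everything().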

3. Replace using dplyr mutate_if() – Update On All Numeric Columns

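Use mutate_if() to update column values conditionally. The following example replaces NA with 0 in all numeric columns; the predicate is.numeric selects only numeric columns, and replace_na() comes from the tidyr package. Run it against the original data frame: after the mutate_all() example above, id is a character column and would no longer be selected.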


library("dplyr")
library("tidyr")
#Example 3 - Replace only on numeric columns
df <- df %>% 
  mutate_if(is.numeric, ~replace_na(.,0))
df
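
The same replacement can be sketched with across() and where(), assuming dplyr 1.0.0 or later. This variant uses dplyr's own coalesce() in place of tidyr's replace_na(), which is a substitution made here to avoid the extra package and not the approach used in the article.

```r
library(dplyr)

df <- data.frame(id = c(1, 2, 3, NA),
                 address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                 work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'))

# where(is.numeric) selects only numeric columns (here: id);
# coalesce(.x, 0) returns .x with each NA replaced by 0
out <- df %>%
  mutate(across(where(is.numeric), ~ coalesce(.x, 0)))

out$id            # 1 2 3 0
out$work_address  # the NA in this character column is left untouched
```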

4. Replace using dplyr mutate_at() – Update on Multiple Columns

mutate_at() is used to update multiple selected columns by name. The following example updates the address and work_address columns.


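library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected columns
# funs() is deprecated since dplyr 0.8.0, so a lambda (~) is used instead
df <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "\\bSt\\b", "Street"))
df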
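
For comparison, here is how the by-name selection might be written with across(), again assuming dplyr 1.0.0 or later. The word-boundary pattern is an addition made in this sketch so that the replacement can be run more than once without changing Street again.

```r
library(dplyr)
library(stringr)

df <- data.frame(id = c(1, 2, 3, NA),
                 address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                 work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'))

# Select the two columns by (unquoted) name; id is not touched
out <- df %>%
  mutate(across(c(address, work_address),
                ~ str_replace(.x, "\\bSt\\b", "Street")))

out
class(out$id)  # still "numeric", unlike the mutate_all() example
```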

5. Replace using dplyr mutate_at() – Update on Selected Column Index Position

Similarly, you can use mutate_at() to select multiple columns by position index and replace the specified values. The following example updates columns 2 and 3, which are the address and work_address columns.


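library("dplyr")
library("stringr")  # provides str_replace()
# Replace on select index
# funs() is deprecated since dplyr 0.8.0, so a lambda (~) is used instead
df <- df %>%
  mutate_at(c(2,3), ~ str_replace(., "\\bSt\\b", "Street"))
df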
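
Position indexes silently point at different columns if the data frame is reordered, so one option is to translate them to names first and check the result before mutating. The sketch below does this with across() and all_of(), assuming dplyr 1.0.0 or later; the variable names positions and target_cols are made up for illustration.

```r
library(dplyr)
library(stringr)

df <- data.frame(id = c(1, 2, 3, NA),
                 address = c('Orange St', 'Anton Blvd', 'Jefferson Pkwy', ''),
                 work_address = c('Main St', NA, 'Apple Blvd', 'Portola Pkwy'))

# Translate positions to column names so the selection can be inspected
positions <- c(2, 3)
target_cols <- names(df)[positions]
target_cols  # "address" "work_address"

# all_of() selects exactly the columns named in the character vector
out <- df %>%
  mutate(across(all_of(target_cols),
                ~ str_replace(.x, "\\bSt\\b", "Street")))
out
```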

6. Complete Example

The following is a complete example of using mutate_at(), mutate_if(), and mutate_all() from dplyr to replace or change the column values in an R data frame.


library("dplyr")
library("stringr")  # provides str_replace()
library("tidyr")    # provides replace_na()

# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
      address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
      work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))

df

# Replace on selected columns
# \\b marks a word boundary, so an already expanded "Street" is not matched again
df <- df %>%
  mutate_at(c('address','work_address'), ~ str_replace(., "\\bSt\\b", "Street"))
df

# Replace on select index
df <- df %>%
  mutate_at(c(2,3), ~ str_replace(., "\\bSt\\b", "Street"))
df

# Replace only on numeric columns
# (run before mutate_all(), which converts id to character)
df <- df %>%
  mutate_if(is.numeric, ~ replace_na(., 0))
df

# Replace on all columns
df <- df %>%
  mutate_all(~ str_replace(., "\\bSt\\b", "Street"))
df

7. Conclusion

In this article, you have learned how to use functions from the dplyr package to replace or update values in an R data frame. Since dplyr is a third-party package, you need to load it with library("dplyr") to access its functions; the examples also rely on stringr for str_replace() and tidyr for replace_na(). If you don’t have these packages installed, you can install them with install.packages(), for example install.packages("dplyr").

This Post Has One Comment

  1. Chuck

    Hello, great post but looks like it needs a bit of an update. I received the following error: “Warning: `funs()` was deprecated in dplyr 0.8.0.
    Please use a list of either functions or lambdas”
