R dplyr::mutate() – Replace Column Values

Use mutate() and its scoped variants mutate_all(), mutate_if(), and mutate_at() from the dplyr package to replace or update column values (string, integer, or any other type) in an R data frame (data.frame). Since dplyr 1.0.0 the scoped variants are superseded by across(), but they still work. For more methods of this package refer to the R dplyr tutorial.

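dplyr is a third-party package, so you need to load it with library("dplyr") before using its functions. If you don't have it, install it with install.packages("dplyr"). The string examples in this article also call str_replace(), which comes from the stringr package, and the NA example calls replace_na() from tidyr, so load those packages as well.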

For bigger data sets, dplyr functions are often faster than equivalent base R code for replacing column values, partly because parts of dplyr are implemented in C++. The actual gain depends on the data and the operation, so measure it on your own workload.

Let's create an R data frame, run these examples, and explore the output. If you already have data in a CSV file, you can import it into an R data frame instead. Also, refer to Import Excel File into R.


# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
         address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
         work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))

df

# Output
#  id        address work_address
#1  1      Orange St      Main St
#2  2     Anton Blvd         <NA>
#3  3 Jefferson Pkwy   Apple Blvd
#4 NA                Portola Pkwy

1. Replace using dplyr::mutate() – Update on Selected Column

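Use mutate() from the dplyr package to replace values in a data frame column. The following example replaces St with Street in the address column. Note that str_replace() comes from the stringr package and replaces only the first match in each string; use str_replace_all() to replace every match.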


library("dplyr")
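library("stringr")  # provides str_replace()
# Replace on selected column
# \\bSt\\b matches St only as a whole word, so "Street" is left alone on a re-run
df <- df %>% 
  mutate(address = str_replace(address, "\\bSt\\b", "Street"))
df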

Here, %>% is an infix operator that acts as a pipe: it passes the value on its left-hand side as the first argument to the function call on its right-hand side.
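
To make the pipe concrete, here is a minimal sketch comparing a piped call with the equivalent direct call. The data frame people and its columns are hypothetical names used only for illustration.

```r
library(dplyr)

# Hypothetical data frame, used only for this illustration
people <- data.frame(id = c(1, 2, 3), score = c(10, 20, 30))

# Piped form: people becomes the first argument of mutate()
piped <- people %>% mutate(score = score * 2)

# Direct form: the same call written without the pipe
direct <- mutate(people, score = score * 2)

# Both forms should produce the same data frame
identical(piped, direct)
```

The piped form reads left to right, which helps once several steps are chained together.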

2. Replace using dplyr::mutate_all() – Update All Columns

Use mutate_all() from the dplyr package to change values in all columns. The following example replaces St with Street in every column. Since St appears in the address and work_address columns, these two get updated. Because str_replace() returns character vectors, non-character columns such as id come back as character.


library("dplyr")
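library("stringr")  # provides str_replace()
# Replace on all columns (funs() is deprecated, so use a ~ lambda instead)
# \\bSt\\b matches St only as a whole word, so "Street" is left alone
df <- df %>% 
  mutate_all(~ str_replace(., "\\bSt\\b", "Street"))
df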
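
In dplyr 1.0.0 and later, the scoped verbs are superseded by across(), and across(everything(), ...) is the direct counterpart of mutate_all(). The sketch below assumes dplyr 1.0.0 or newer plus stringr, and deliberately narrows the selection to character columns with where(is.character) so that id keeps its numeric type. The names addr and res_all are illustrative.

```r
library(dplyr)
library(stringr)

# Fresh copy of the article's sample data (addr is an illustrative name)
addr <- data.frame(id = c(1, 2, 3, NA),
                   address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
                   work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
                   stringsAsFactors = FALSE)

# Apply the replacement to character columns only, so id stays numeric.
# \\bSt\\b matches "St" as a whole word, leaving "Street" untouched.
res_all <- addr %>%
  mutate(across(where(is.character), ~ str_replace(.x, "\\bSt\\b", "Street")))
res_all
```

Restricting the selection this way avoids the type change that a replacement over every column would cause.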

3. Replace using dplyr::mutate_if() – Update On All Numeric Columns

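Use mutate_if() to update column values conditionally. The following example replaces NA with 0 in all numeric columns; the predicate is.numeric selects only the numeric columns, and replace_na() comes from the tidyr package. If you have already run the mutate_all() example above, id is no longer numeric, so recreate df first.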


library("dplyr")
library("tidyr")
# Replace NA with 0 only on numeric columns
df <- df %>% 
  mutate_if(is.numeric, ~ replace_na(., 0))
df
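
The same conditional update can be sketched with across() and where(), assuming dplyr 1.0.0 or newer and tidyr. The data frame nums and its qty and label columns are hypothetical, added here to show that only numeric columns are touched.

```r
library(dplyr)
library(tidyr)

# Hypothetical data with NA in both numeric and character columns
nums <- data.frame(id = c(1, 2, 3, NA),
                   qty = c(NA, 5, 7, 9),
                   label = c("a", NA, "c", "d"),
                   stringsAsFactors = FALSE)

# Replace NA with 0 in numeric columns; the character column is not selected
res_num <- nums %>%
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))
res_num
```

The NA in label stays as it is because where(is.numeric) does not select that column.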

4. Replace using dplyr::mutate_at() – Update on Multiple Columns

Use mutate_at() to update multiple selected columns by name. The following example updates the address and work_address columns.


library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected columns (funs() is deprecated, so use a ~ lambda instead)
df <- df %>% 
  mutate_at(c('address','work_address'), ~ str_replace(., "\\bSt\\b", "Street"))
df
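
As a hedged alternative for dplyr 1.0.0 or newer, column names held in a character vector can be passed to across() through all_of(). The names addr, cols, and res_named below are illustrative.

```r
library(dplyr)
library(stringr)

# Fresh copy of the article's sample data (addr is an illustrative name)
addr <- data.frame(id = c(1, 2, 3, NA),
                   address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
                   work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
                   stringsAsFactors = FALSE)

# Column names to update, kept in a character vector
cols <- c("address", "work_address")

# all_of() selects exactly these names and errors if one is missing
res_named <- addr %>%
  mutate(across(all_of(cols), ~ str_replace(.x, "\\bSt\\b", "Street")))
res_named
```

Using all_of() makes a misspelled column name fail loudly rather than being skipped.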

5. Replace using dplyr::mutate_at() – Update on Selected Column Index Position

Similarly, you can use mutate_at() to select multiple columns by position index and replace the specified values. The following example updates columns 2 and 3, which are the address and work_address columns.


library("dplyr")
library("stringr")  # provides str_replace()
# Replace on selected column index positions
df <- df %>% 
  mutate_at(c(2,3), ~ str_replace(., "\\bSt\\b", "Street"))
df
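
Positions also work inside across(), assuming dplyr 1.0.0 or newer. Because a position silently points at a different column if the column order changes, this sketch first looks up which names the positions refer to. The names addr, pos, and res_pos are illustrative.

```r
library(dplyr)
library(stringr)

# Fresh copy of the article's sample data (addr is an illustrative name)
addr <- data.frame(id = c(1, 2, 3, NA),
                   address = c("Orange St", "Anton Blvd", "Jefferson Pkwy", ""),
                   work_address = c("Main St", NA, "Apple Blvd", "Portola Pkwy"),
                   stringsAsFactors = FALSE)

# Column positions to update
pos <- c(2, 3)

# Check which columns those positions refer to before modifying them
names(addr)[pos]

# all_of() accepts the positions stored in pos; column 1 (id) is left unchanged
res_pos <- addr %>%
  mutate(across(all_of(pos), ~ str_replace(.x, "\\bSt\\b", "Street")))
res_pos
```

Selecting by name is usually more robust, so positions are best kept for quick interactive work.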

6. Complete Example

The following is a complete example of using mutate(), mutate_all(), mutate_if(), and mutate_at() from dplyr to replace or change column values in an R data frame.


# Create DataFrame
df <- data.frame(id=c(1,2,3,NA),
      address=c('Orange St','Anton Blvd','Jefferson Pkwy',''),
      work_address=c('Main St',NA,'Apple Blvd','Portola Pkwy'))

df

library("dplyr")
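library("stringr")  # provides str_replace()
library("tidyr")    # provides replace_na()

# Replace NA with 0 only on numeric columns
# (run before mutate_all(), which returns id as character)
df <- df %>% 
  mutate_if(is.numeric, ~ replace_na(., 0))
df

# Replace on a single column
# \\bSt\\b matches St only as a whole word, so repeated runs leave "Street" alone
df <- df %>% 
  mutate(address = str_replace(address, "\\bSt\\b", "Street"))
df

# Replace on selected columns by name
df <- df %>% 
  mutate_at(c('address','work_address'), ~ str_replace(., "\\bSt\\b", "Street"))
df

# Replace on selected column index positions
df <- df %>% 
  mutate_at(c(2,3), ~ str_replace(., "\\bSt\\b", "Street"))
df

# Replace on all columns
df <- df %>% 
  mutate_all(~ str_replace(., "\\bSt\\b", "Street"))
df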

7. Conclusion

In this article, you have learned how to use mutate(), mutate_all(), mutate_if(), and mutate_at() from the dplyr package to replace or update values in an R data frame. Remember to load dplyr with library("dplyr"), along with stringr for str_replace() and tidyr for replace_na(), and install any missing package with install.packages() first.
