Pipe in R with Examples

The pipe %>% is one of the most widely used operators in R. It was introduced in the magrittr package by Stefan Milton Bache. The pipe operator %>% is used to express a sequence of multiple operations: the output of one function or expression is passed to another function as an argument.

Key Points of using Pipe in R

  • The pipe %>% in R was introduced in the magrittr package.
  • When you load the tidyverse package, the %>% pipe operator is automatically available for you to use.
  • It takes the output of one function and passes it into another function as an argument.
  • By default, the piped value becomes the first argument of the next function.
  • If the next function needs more inputs, you supply them in the call; for example, x %>% f(y) is evaluated as f(x, y).

1. What is Pipe Operator in R – Introduction

Pipe in R is an infix operator that was introduced in the magrittr package by Stefan Milton Bache. It is used to pass the output of one function as input to another function, which makes code easier to read and maintain. In other words, the pipe operator %>% expresses a sequence of multiple operations in an elegant way.
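As a minimal sketch of this idea, assuming the magrittr package is installed, the snippet below applies two base R functions in sequence, first nested and then with the pipe. The vector x is illustrative data only.

```r
library(magrittr)

x <- c(1, 4, 9, 16)

# Nested call: read from the inside out
nested <- sum(sqrt(x))

# Piped call: read from left to right
piped <- x %>% sqrt() %>% sum()

print(piped)
# [1] 10

# Both forms compute the same value
identical(nested, piped)
# [1] TRUE
```

Both forms do the same work; the piped version only changes the order in which you read the steps.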

2. Does Pipe Exist in Other Languages

If you are familiar with Linux, you probably know the pipe operator |, which passes the output of one command to another. So the pipe is nothing new in programming; it has been around for a long time.


# Pipe | operator in Linux
ls -lrt | grep new_file | <additional commands>

3. Why do we need Pipe in R

When you write complex code in R, you sometimes nest operations, which results in R code that is hard to read and hard for others to understand. By using the pipe %>% operator, you can avoid some of this complexity by chaining the operations instead.

Also, by using the pipe %>% operator you can avoid cluttering your R session with intermediate variables, which is one reason it is used in many R packages now.

For example, imagine you have 3 function calls and the result of each function is passed as input to the next. Without the %>% operator you would typically store the result of each function in a variable and pass that variable to the next function, so 3 variables holding data stay in memory for as long as they are in scope. By using a pipe, you avoid naming these intermediate results and simply chain the functions. The intermediate values are still computed, but they are not bound to names, so R can free them once they are no longer needed.

3.1 Example without Pipe in R

Following is a basic example of using 3 functions, and the output of each function is passed as input to another function.


# Simple example without the pipe operator
# add function
add <- function(x,y) {
  return (x + y)
}
# multiply function
mul <- function(x,y) {
  return (x * y)
}
# div function
div <- function(x,y) {
  return (x / y)
}

# calling functions sequentially
res1 <- add(2,4)
res2 <- mul(res1,8)
res3 <- div(res2,2)
print(res3)

# Output
# [1] 24

Note that you can also write this by nesting the functions. Since our examples are small this is still readable, but imagine functions that take several arguments: nesting them quickly makes your R code unreadable.


# Using nesting functions
res <- div(mul(add(2,4),8),2)
print(res)  

# Output
# [1] 24

4. How to use Pipe Operator in R

When the pipe operator %>% is used in an R expression, it passes the left-hand side of the operator as the first argument of the function on the right-hand side. For example, x %>% f(y) is evaluated as f(x, y), so the result from the left-hand side is “piped” into the right-hand side. This lets you write multiple operations that read left to right.

Let’s see with an example.


# Using the pipe (add, mul, and div are defined in section 3.1)
library(magrittr)
res <- add(2,4) %>% mul(8) %>% div(2)
print(res)

# Output
# [1] 24
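To make the rewriting rule concrete, here is a small sketch using only base R functions, assuming magrittr is installed. It checks that x %>% f(y) gives the same result as f(x, y); the numbers are arbitrary.

```r
library(magrittr)

# x %>% f(y) is evaluated as f(x, y)
a <- 3.14159 %>% round(2)
b <- round(3.14159, 2)
identical(a, b)
# [1] TRUE

# Named arguments on the right-hand side are passed through unchanged
sorted <- c(3, 1, 2) %>% sort(decreasing = TRUE)
print(sorted)
# [1] 3 2 1
```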

4.1 Using Pipe with the dplyr Package

dplyr is a package that provides a grammar of data manipulation, with a set of verbs that help data analysts solve the most common data manipulation tasks. Code written with these verbs is often more concise than the base R equivalent and can perform better on larger data frames.

In order to use dplyr verbs, you have to install the package first using install.packages('dplyr') and load it using library(dplyr).

All verbs in the dplyr package take a data.frame as their first argument, which is why they combine so naturally with the infix operator %>%. Let’s see an example.


# Create DataFrame
df <- data.frame(
  id = c(10,11,12,13),
  name = c('sai','ram','deepika','sahithi'),
  gender = c('M','M','F','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16')),
  state = c('CA','NY',NA,NA),
  row.names=c('r1','r2','r3','r4')
)
df

# Load dplyr library
library('dplyr')

# filter() by row name & select id and name columns
df2 <- df %>% filter(rownames(df) == 'r3') %>% select(c('id','name'))
print(df2)

This yields a single row, the one with id 12 and name deepika, containing only the id and name columns.

5. Limitations of using Pipe

The following are limitations of the Pipe in R.

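  • The pipe carries a single value: the result of the left-hand side.
  • By default that value becomes the first argument of the function on the right-hand side; any other arguments must be supplied in the call, as in mul(8).
  • If the value belongs in a different argument position, you have to mark that position explicitly with magrittr's . placeholder.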
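The sketch below illustrates the first-argument behavior and one way around it, assuming magrittr is installed. The power() helper is hypothetical and exists only for this example; the . placeholder is part of magrittr.

```r
library(magrittr)

# Hypothetical helper whose main input is the SECOND argument
power <- function(exponent, base) {
  base ^ exponent
}

# Default: the left-hand side becomes the first argument (exponent)
r1 <- 2 %>% power(3)      # same as power(2, 3), i.e. 3^2
print(r1)
# [1] 9

# With the dot placeholder, the left-hand side goes where the dot is
r2 <- 2 %>% power(3, .)   # same as power(3, 2), i.e. 2^3
print(r2)
# [1] 8
```

When a function's main input is not its first argument, the placeholder keeps the chain readable without falling back to nested calls.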

6. Conclusion

In this article you have learned what the pipe operator in R is, and how and when to use it. To express a sequence of multiple operations you can use the pipe %>%. It takes the output of one function or expression and passes it to another function as an argument.

The complete example explained above is available at GitHub R Examples.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.