How do you run an R for loop in parallel? In any programming language, loops execute statements repeatedly. As the code inside a loop grows more complex, the loop can take longer to finish. When the iterations do not depend on one another, you can shorten the run time by executing them in parallel.
Follow these steps to run an R for loop in parallel:
- Step 1: Install foreach package
- Step 2: Load foreach package into R
- Step 3: Use foreach() statement
- Step 4: Install and load doParallel package
Let’s execute these steps and run an example.
Step 1: Install foreach package
To run an R for loop in parallel, you use the foreach() statement from the foreach package. The package is not part of the default R installation, so you need to install it first.
#Install foreach package
install.packages("foreach")
Step 2: Load the foreach library
Load the foreach library using library(foreach).
#Load foreach library
library(foreach)
Step 3: Use foreach
foreach() returns a list containing the result of each iteration. You also need to place the %do% operator after the loop definition.
#Run the loop sequentially with %do%
x <- foreach(i = 1:20) %do% {
  sqrt(i)
}
x
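To confirm the return type described above, here is a minimal sketch that inspects the result of a shorter loop. The loop length of 5 is an arbitrary choice for illustration.

```r
library(foreach)

# foreach() with %do% collects each iteration's value into a list
x <- foreach(i = 1:5) %do% {
  sqrt(i)
}

class(x)    # "list"
length(x)   # 5
x[[4]]      # 2

# unlist() flattens the list into a numeric vector when needed
unlist(x)
```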
To change the return type from a list to a vector, pass .combine = c to foreach(). You can also pass cbind, rbind, or a custom function to .combine to arrange the results differently. The example below uses cbind, which returns a one-row matrix rather than a vector.
#Combine the results column-wise with cbind
x <- foreach(i = 1:20, .combine = cbind) %do% {
  sqrt(i)
}
x
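The other .combine options can be compared side by side. The following is a small sketch rather than a complete reference; the loop bodies and the variable names v, m, and s are made up for illustration.

```r
library(foreach)

# .combine = c concatenates the results into a plain numeric vector
v <- foreach(i = 1:4, .combine = c) %do% {
  sqrt(i)
}

# .combine = rbind stacks each iteration's result as one row of a matrix
m <- foreach(i = 1:4, .combine = rbind) %do% {
  c(i, sqrt(i))
}

# A function or operator also works; "+" adds the results together
s <- foreach(i = 1:4, .combine = "+") %do% {
  i^2
}

is.vector(v)  # TRUE
dim(m)        # 4 2
s             # 30
```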
Step 4: Running foreach loops in Parallel
The foreach loop with the %do% operator explained above processes each iteration sequentially. To run the iterations in parallel, you have to use foreach with the %dopar% operator. You also need to install and load the doParallel library.
To run in parallel, you first create a cluster from the processors or cores available on your server or laptop.
#Install doParallel package (needed only once)
install.packages("doParallel")
library(doParallel)
#Set up the backend to use multiple cores
totalCores <- detectCores()
#Leave one core free to avoid overloading your computer
cluster <- makeCluster(max(1, totalCores - 1))
registerDoParallel(cluster)
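Before settling on a worker count, it can help to look at what detectCores() reports on your machine. The sketch below only inspects the core count and does not create a cluster. The variable names are illustrative, and the logical = FALSE argument is documented as being honoured only on some platforms, so treat the physical count as a hint.

```r
library(parallel)

# Logical cores as seen by R; this can be NA if the count is unknown
logicalCores <- detectCores()

# Physical cores, where the platform supports the distinction
physicalCores <- detectCores(logical = FALSE)

# Keep one core free, but never drop below one worker (also guards against NA)
workers <- max(1, logicalCores - 1, na.rm = TRUE)
workers
```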
Now run the foreach loop in parallel.
library(foreach)
#Run the for loop in parallel with %dopar%
x <- foreach(i = 1:20, .combine = cbind) %dopar% {
  sqrt(i)
}
x
If you get the error below when you run your program, you may have allocated too many worker processes. Try running it again with fewer cores.
Error in serialize(data, node$con) : error writing to connection
Finally, stop the cluster.
#Stop cluster
stopCluster(cluster)
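If the loop fails partway through, the stopCluster() call above may never run and the worker processes stay alive. One way to guard against that, sketched here under the assumption that the loop lives inside a function, is to register the cleanup with on.exit(). The helper name parallelSqrt and the default of 2 workers are hypothetical choices for illustration.

```r
library(foreach)
library(doParallel)

# Hypothetical helper: runs a parallel loop and always releases its workers
parallelSqrt <- function(n, workers = 2) {
  cluster <- makeCluster(workers)
  # on.exit() runs even if the loop below raises an error
  on.exit(stopCluster(cluster), add = TRUE)
  registerDoParallel(cluster)

  foreach(i = 1:n, .combine = c) %dopar% {
    sqrt(i)
  }
}

result <- parallelSqrt(20)
all.equal(result, sqrt(1:20))  # TRUE
```

Because the helper stops its own cluster, register a new backend (or call registerDoSEQ() to fall back to sequential execution) before using %dopar% again. Passing a smaller workers value is also a simple way to retry with fewer cores.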
Conclusion
In this article, you have learned how to run an R for loop in parallel by using foreach with the %dopar% operator. You have also learned that before running in parallel, you need to create a cluster with the number of cores you want to use. The number of cores determines how many iterations can run at the same time.