--- title: "Parallel Programming" author: "JJB + course" date: "12/07/2018" output: html_document: toc: true toc_float: collapse: false --- ```{r setup, include=FALSE} options(width = 60) knitr::opts_chunk\$set(echo = TRUE, cache = TRUE) ``` # Processing ## Example: Types of Processing There are three types of processing: 1. Serial 2. Parallel 3. Hybrid (both approaches) ## Serial Processing ## Example: Serial Example - Loop Here is a typical serial processing with a `for` loop. ```{r} n = 10000 # Number of obs x = numeric(n) # Set up storage system.time({ for(i in seq_along(x)){ # For loop x[i] = pf(i, df1 = 4, df2 = 5) } }) ``` ## Example: Serial Example - Vectorization Here is a typical serial processing with for ```{r} n = 10000 # Number of obs system.time({ x = pf(1:n, df1 = 4, df2 = 5) # Pre-vectorized }) ``` ## Example: Vectorization is sequential Vectorization will still lead to waiting for the items to be sequentially evaluated. e.g. ```{r serial_wait} # Wrapper function for use in *apply wait = function(i) { function(x){ Sys.sleep(i) } } # 10 iterations * .25 seconds = 2.5 seconds elapsed system.time({ sapply(1:10, wait(0.25)) }) ``` ## Parallel Processing ## Example: Before an example... - To work with parallel features in *R* you need to subscribe to either `doParallel` or `parallel` paradigm. - Generally, `doParallel` approach is used by novices and `parallel` is used by more advanced users. ```{r} # Load doParallel, which will load parallel require(doParallel, quiet = TRUE) # How many cores do you have? ( cores = parallel::detectCores() ) ``` ## Example: Before an example... quick parallelization. To spawn our *workers*, we need to first create a cluster. ```{r} # Start a cluster with that many cores cl = makeCluster(cores) ``` After the cluster is created, some computation normally occurs and then the cluster is stopped. ```{r} # Stop the cluster stopCluster(cl) ``` **>> Always remember to stop the cluster! <<** ## Example: Parallel Example - Loop via `foreach` Note the `for` in this case is `foreach`, which is a special looping function introduced by `doParallel`. ```{r} cl = makeCluster(3) # Create snow Cluster registerDoParallel(cl) # Register it with foreach n = 10000 # Number of obs # Use foreach in place of for to parallelize loop system.time({ out = foreach(i = 1:n, .combine=rbind) %dopar% { pf(i, df1 = 4, df2 = 5) # Note the different loop syntax } }) stopCluster(cl) # Stop the cluster ``` ## Example: Parallel Example - Vectorization Parallelized version of the vectorized `apply` function ```{r} cl = makeCluster(3) # Create snow Cluster n = 10000 # Number of obs # Use parApply to parallelize apply. system.time({ out = parSapply(cl = cl, # Cluster X = 1:n, # Data FUN = pf, # Function df1 = 4, # Function params df2 = 5) }) stopCluster(cl) # Stop the cluster ``` ## Example: Waiting for Godot... Redux Using the same `wait()` function as before, we opt to see what happens under parallelization. ```{r warning = F} # Create snow Cluster cl = makeCluster(4) system.time({ parLapply(cl = cl, X = 1:10, fun = wait(0.25)) }) # Stop the cluster stopCluster(cl) ``` ### Exercise: Distribution of Parallel Jobs - What would happen if we added a 5th worker to the cluster? - What would be the runtime of the job? - How does the job structure change? # Parallelization in R ## Example: Hello Parallel World ```{r} hello.world = function(){ print("Hello Parallel World!") } cl = makeCluster(3) # Create snow Cluster # Issue function to all threads clusterCall(cl = cl, fun = hello.world) stopCluster(cl) # Stop cluster ``` ## Example: Loading a Package on Multiple Nodes ```{r} cl = makeCluster(2) # Create cluster for snow clusterEvalQ(cl, library(cIRT)) # Load package on all nodes stopCluster(cl) # Stop cluster ``` ## Example: Exporting Values to All Nodes ```{r} cl = makeCluster(2) # Create cluster for snow x = 1:5 # Create variable on master clusterExport(cl = cl, # Send variable to slaves varlist = c("x")) stopCluster(cl) # Stop cluster ``` Note: `varlist` requires a `character` vector of expression names. ## Example: Sample *apply ```{r} # regular apply mat = matrix(1:100000, ncol=5) system.time({ apply(X = mat, MARGIN = 2, FUN = sum) }) ``` **Note:** There was a memory issue when it tried to mimic the parallelized version. ## Example: Sample Parallelized *apply ```{r} x = 1:10000000 # Data mat = matrix(x, ncol=5) # Matrix form cl = makeCluster(4) # Create snow Cluster system.time({ # Parallel Apply parApply(cl = cl, X = mat, MARGIN = 2, FUN = sum) }) stopCluster(cl) # Stop cluster ``` ## Example: `for` loops to vectorization Embarrassly simple parallelization is akin to being able to convert from a for loop to a vectorized statement. Speed ups may vary. ```{r} square_x = function(x){ # Squaring function for(i in seq_along(x)){ x[i] = x[i]^2 } return(x) } x = 1:1e7 # Data system.time({ out = square_x(x) }) ``` ## Example: `for` loops to vectorization The conversion of the prior `for` to a standard `*apply` statement would be: ```{r} system.time({ y = sapply(x, function(x) x^2) # 1-line call to apply }) ``` ## Example: Vectorization to Parallel Vectorization To go from the **standard** vectorization to a **parallelized** vectorization procedure only involves slight tweaks in the code. ```{r} cl = makeCluster(4) squared = function(x) { x^2 } system.time({ parSapply(cl = cl, X = x, FUN = squared) }) # End cluster for snow stopCluster(cl) ```