The Road to Progress
Step-by-step guide for vectorizing/parallelizing your R code
The best way to follow this tutorial is to check out the individual commits and look at the diffs. This lets you observe how the code evolved, and that evolution shows the typical workflow.
![Watch the video](https://img.youtube.com/vi/uyhIiTTrTJY/mqdefault.jpg)
What you'll need
install.packages(c("pbapply", "mgcv"))
Steps
These steps demonstrate the usual workflow: develop code interactively, wrap it in a loop, then encapsulate it in a function. This sets us up for using vectorized functions, which are also well suited for parallel computing.
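The workflow described above can be sketched with a small bootstrap example. This is not the repository's example.R; the data, the model, and the names `d`, `B`, and `fit_once` are made up for illustration, and only base R is used.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
x <- runif(100)
d <- data.frame(x = x, y = 1 + 2 * x + rnorm(100, sd = 0.5))

# 1. Interactive: fit one model on one bootstrap resample
i <- sample(nrow(d), replace = TRUE)
coef(lm(y ~ x, data = d[i, ]))

# 2. Loop: repeat the same code B times, storing the results
B <- 20
out_loop <- matrix(NA_real_, nrow = B, ncol = 2)
for (b in seq_len(B)) {
  i <- sample(nrow(d), replace = TRUE)
  out_loop[b, ] <- coef(lm(y ~ x, data = d[i, ]))
}

# 3. Function: encapsulate the body of the loop
fit_once <- function(b, data) {
  i <- sample(nrow(data), replace = TRUE)
  coef(lm(y ~ x, data = data[i, ]))
}

# 4. Vectorized: sapply() calls the function once per index;
#    t() turns the 2 x B result into a B x 2 matrix like out_loop
out_vec <- t(sapply(seq_len(B), fit_once, data = d))
dim(out_vec)
```

Once the work lives in a function that takes an index, swapping `sapply()` for a progress-aware or parallel equivalent is usually a small change.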
Locally with Git
Clone the repository:
git clone https://github.com/psolymos/the-road-to-progress.git
Open the repository as an R project in RStudio Desktop, VSCode, or R GUI. Check out revisions by commit hash or git tag to follow the steps:
- Step 1: git checkout 45d5a67 or git checkout step-1
- Step 2: git checkout 59eacb9 or git checkout step-2
- Step 3: git checkout da685ae or git checkout step-3
- Step 4: git checkout 8321cdc or git checkout step-4
- Step 5: git checkout 9fc2c61 or git checkout step-5
- Step 6: git checkout c0e1973 or git checkout step-6
- Step 7: git checkout 370432f or git checkout step-7
- Step 8: git checkout 8ea4cd9 or git checkout step-8
- Step 9a: git checkout b6c7729 or git checkout step-9a
- Step 9b: git checkout db7c892 or git checkout step-9b
The example.R code will change along the steps, introducing new tricks.
Locally without Git
Download the zip file for this release: https://github.com/psolymos/the-road-to-progress/releases/tag/start.
Then follow along this commit history: https://github.com/psolymos/the-road-to-progress/commits/master/example.R.
In your browser with Gitpod
This link will open up a preinstalled Gitpod environment where you can run the scripts from each step by launching R and copy-pasting the contents from the step-*.R files.
Exercise
Check out Step 4 (git checkout 8321cdc) while creating a new branch from it: git checkout -b <new-branch-name> 8321cdc, or download this release: https://github.com/psolymos/the-road-to-progress/releases/tag/middle, then
- Develop modular code by splitting the function into 2 pieces: (1) data processing + model training, and (2) prediction.
- Use lapply/sapply to run the code in a vectorized fashion.
- Adapt the vectorized format to show the progress and do it in parallel.
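One possible shape for the exercise is sketched below. It is not the repository's solution: the simulated data and the names `train_model`, `predict_model`, and `one_run` are hypothetical, and a bootstrap resample is assumed as the data processing step. It uses `mgcv::gam()` for training and `pbapply::pbsapply()` for the progress bar and the parallel run.

```r
library(mgcv)
library(pbapply)

# Simulated data (hypothetical, for illustration only)
set.seed(1)
d <- data.frame(x = runif(200))
d$y <- sin(2 * pi * d$x) + rnorm(200, sd = 0.3)
newdata <- data.frame(x = seq(0, 1, length.out = 25))

# Piece 1: data processing (bootstrap resample) + model training
train_model <- function(data) {
  i <- sample(nrow(data), replace = TRUE)
  gam(y ~ s(x), data = data[i, ])
}

# Piece 2: prediction from a fitted model
predict_model <- function(fit, newdata) {
  as.numeric(predict(fit, newdata = newdata))
}

# One complete run, indexed by b so it can be passed to sapply()
one_run <- function(b) predict_model(train_model(d), newdata)

B <- 10
pred_plain <- sapply(seq_len(B), one_run)    # vectorized
pred_pb <- pbsapply(seq_len(B), one_run)     # same, with a progress bar

# Parallel: workers are fresh R sessions, so they need the objects and mgcv
cl <- parallel::makeCluster(2)
parallel::clusterExport(cl, c("d", "newdata", "train_model", "predict_model"))
invisible(parallel::clusterEvalQ(cl, library(mgcv)))
pred_par <- pbsapply(seq_len(B), one_run, cl = cl)
parallel::stopCluster(cl)

dim(pred_par)  # one column of predictions per run
```

Keeping training and prediction in separate functions means either piece can be tested or replaced on its own, while `one_run()` stays the single unit that gets vectorized.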
Additional topics
- Promises: the future API
- RNGs
- foreach: %do% and %dopar%
- purrr & map-reduce
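As a starting point for the RNG topic, here is a minimal sketch of reproducible random numbers on a cluster using only the base parallel package. The helper names `draw` and `run_once` are hypothetical; the sketch assumes a fixed number of workers and no load balancing, which is what makes repeated runs comparable.

```r
library(parallel)

# Draw n random numbers; the index i is unused but required by parLapply()
draw <- function(i, n = 3) rnorm(n)

run_once <- function(seed) {
  cl <- makeCluster(2)                   # two local worker processes
  on.exit(stopCluster(cl))               # always release the workers
  clusterSetRNGStream(cl, iseed = seed)  # independent L'Ecuyer-CMRG stream per worker
  parLapply(cl, 1:4, draw)
}

a <- run_once(123)
b <- run_once(123)
identical(a, b)  # compare two runs that used the same seed
```

Calling `set.seed()` in the main session alone does not control the random numbers drawn on the workers, which is why the streams are set on the cluster itself.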