Use doRNG and foreach for reproducible parallel bootstrapping

Open xrobin opened this issue 5 years ago • 0 comments

The plyr package is old, and newer, better options exist for parallel execution. The foreach package seems to be the way to go: it supports several backends, and the doRNG package makes parallel calculations reproducible.

From the user's perspective, the interface would look like:

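library(doParallel) # also attaches foreach and parallel
library(doRNG)
cl <- makeCluster(2) # 2 worker processes
registerDoParallel(cl)
registerDoRNG(1234) # seed the parallel RNG streams
ci(...)
stopCluster(cl)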

Internally we would simply have:

resampled.values <- foreach(i=1:boot.n) %dopar% { stratified.bootstrap.test(...) }

instead of

resampled.values <- laply(1:boot.n, stratified.bootstrap.test, ...)
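To illustrate the reproducibility this would buy, here is a minimal sketch rather than the actual pROC internals. It uses doRNG's explicit `%dorng%` operator with `.options.RNG` instead of `registerDoRNG()`, and the loop body is a toy resampled mean standing in for `stratified.bootstrap.test`; the data `x` and the names `run1` and `run2` are made up for the example.

```r
library(doParallel)  # also attaches foreach and parallel
library(doRNG)

# Toy data and bootstrap size (assumptions for this sketch only)
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3)
boot.n <- 20

cl <- makeCluster(2)
registerDoParallel(cl)

# Same seed passed to both loops through .options.RNG;
# the body is a stand-in for stratified.bootstrap.test(...)
run1 <- foreach(i = 1:boot.n, .combine = c, .options.RNG = 1234) %dorng% {
  mean(sample(x, replace = TRUE))
}
run2 <- foreach(i = 1:boot.n, .combine = c, .options.RNG = 1234) %dorng% {
  mean(sample(x, replace = TRUE))
}

stopCluster(cl)

# Compare the two runs
identical(run1, run2)
```

Note that `.combine = c` is used here to get a vector back; without it foreach returns a list, which differs from the array that laply produces.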

Things to consider:

  • Code should be able to run without any extra line of code from the user (but then not in parallel)
  • Progress bars?
  • What if some of the bootstrapping gets implemented in C++ in the future?
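For the first point, one possible approach (a sketch, not a decided design) is to register foreach's sequential backend when the user has not registered anything, so that `%dopar%` runs without a warning and without parallelism. `ensure.backend` is a hypothetical helper name; `getDoParRegistered()` and `registerDoSEQ()` come from foreach.

```r
library(foreach)

# Hypothetical helper: fall back to the sequential backend
# when the user has not registered a parallel one
ensure.backend <- function() {
  if (!getDoParRegistered()) registerDoSEQ()
  invisible(getDoParName())
}

ensure.backend()

# Runs sequentially unless a parallel backend was registered beforehand
res <- foreach(i = 1:3, .combine = c) %dopar% i^2
```

If the user did call `registerDoParallel(cl)` first, the helper leaves that backend in place and the same loop runs in parallel.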

xrobin • Apr 07 '19 14:04