
Switch multi-core to parallel package

Open mariusbarth opened this issue 3 years ago • 4 comments

Hi @richarddmorey,

I started playing around with the code on my fork because I was frustrated that there was no multi-core option on Windows machines (I work on Linux myself, but I collaborate with Windows users and share dynamic documents with them). I originally intended to implement a multi-core option for Windows users and therefore switched to the parallel package, which provides multi-core facilities on both Windows and Unix-alikes and ships with base R. Unfortunately, the parallel package only provides PSOCK clusters on Windows, and these turned out to hurt performance compared with running on a single core. I therefore switched the multi-core option off again on Windows and moved to fork clusters for Unix-alikes.
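
The platform decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `choose_cluster` is a hypothetical helper, not a function from BayesFactor or from the fork, and the real code may structure this differently.

```r
library(parallel)

# Hypothetical helper (not part of BayesFactor): returns a fork cluster on
# Unix-alikes and NULL on Windows, where only PSOCK clusters are available.
choose_cluster <- function(n_workers = 2L) {
  if (.Platform$OS.type == "windows") {
    return(NULL)  # signal the caller to run single-core
  }
  makeForkCluster(n_workers)
}

cl <- choose_cluster(2L)

# Run in parallel when a cluster was created, sequentially otherwise.
squares <- if (is.null(cl)) {
  lapply(1:4, function(i) i^2)
} else {
  parLapply(cl, 1:4, function(i) i^2)
}

# Always release the worker processes when done.
if (!is.null(cl)) stopCluster(cl)
```

Returning `NULL` on Windows keeps the single-core path explicit, so callers do not need their own platform checks.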

While I did not reach my ultimate goal of providing a multi-core option for Windows users, I found that my new implementation comes with some benefits for users of Unix-alikes, so I thought it might be worthwhile to adopt the changes.

  • It is faster: on a quad-core CPU, code runs faster by a factor of 2-3 (compared with 1.5-2 for the old implementation).
  • It is possible to provide a cluster in advance, which makes BayesFactor ready for HPC environments.
  • Dependencies on foreach and doMC are no longer necessary.
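
To illustrate the second point, here is a minimal sketch of how a function might pick up a cluster that was registered in advance. `lapply_default_cluster` is a hypothetical name used only for this example; it is not how BayesFactor is necessarily implemented. The sketch relies on `getDefaultCluster()` from the parallel package returning `NULL` when no default cluster is registered.

```r
library(parallel)

# Hypothetical helper: use the registered default cluster when there is one,
# otherwise fall back to a sequential lapply().
lapply_default_cluster <- function(x, fun) {
  cl <- getDefaultCluster()
  if (is.null(cl)) lapply(x, fun) else parLapply(cl, x, fun)
}

# No cluster registered yet: runs sequentially.
sequential <- lapply_default_cluster(1:3, function(i) i + 1L)

# Fork clusters are only available on Unix-alikes.
if (.Platform$OS.type != "windows") {
  cl <- makeForkCluster(2L)
  setDefaultCluster(cl)
  from_cluster <- lapply_default_cluster(1:3, function(i) i + 1L)
  setDefaultCluster(NULL)  # unregister before shutting the workers down
  stopCluster(cl)
}
```

Because the cluster is looked up at call time, the same user code can run on a laptop or on a scheduler-managed cluster without changes.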

I don't know if I missed something important, so feel free to dismiss the code changes suggested here. ;)

Here is an example of how to run code on a pre-specified cluster:

library(parallel)
library(BayesFactor)

# Create a default cluster that can be used by anovaBF()
cl <- makeForkCluster(4L)
setDefaultCluster(cl)
getDefaultCluster()
data(puzzles)

# duplicate data set so that the computations last a bit longer:
puzzles$set <- 1
puzzles2 <- puzzles
puzzles2$set <- 2
puzzles <- rbind(puzzles, puzzles2)
puzzles$set <- factor(puzzles$set)

library(microbenchmark)

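# Compare single-core (out_a) with multi-core (out_b) run times
out <- microbenchmark(
  out_a <- anovaBF(RT ~ shape*color*set + ID, data = puzzles, whichRandom = "ID", progress = FALSE),
  out_b <- anovaBF(formula = RT ~ shape*color*set + ID, data = puzzles, whichRandom = "ID", progress = FALSE, multicore = TRUE),
  times = 20
)
out

# Shut down the workers once the benchmark is finished
stopCluster(cl)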

mariusbarth, Jan 12 '21 10:01