Is there a reliable way to parallelize scasim calculations on multiple cores?

Open tmalsburg opened this issue 10 years ago • 2 comments

Last time I checked there were several methods for multi-core computations. What's the most robust way to do this now?

tmalsburg avatar Feb 07 '14 11:02 tmalsburg

Short answer: I think if you're in C already and you have easily parallelized loops, parallelizing on the C side with something like OpenMP might be best (it might take as little as a single #pragma directive, plus including omp.h and changing your Makevars). I don't have experience with it, but examples like http://gallery.rcpp.org/articles/dmvnorm_arma/ seem amazingly simple.
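
To make the "single #pragma" idea concrete, here is a minimal sketch in C. It is not scasim's actual code: toy_dissimilarity() and pairwise_matrix() are hypothetical names, and the measure is a simple stand-in for an expensive pairwise comparison. The point is only that each outer iteration writes its own row of the result, so the loop can be split across threads without locking.

```c
#include <stddef.h>

/* Hypothetical stand-in for an expensive pairwise measure (NOT scasim):
 * the sum of absolute differences between two rows of length len. */
static double toy_dissimilarity(const double *a, const double *b, size_t len)
{
    double sum = 0.0;
    for (size_t k = 0; k < len; k++) {
        double d = a[k] - b[k];
        sum += (d < 0.0) ? -d : d;
    }
    return sum;
}

/* Fill the n x n row-major matrix `out` with all pairwise dissimilarities
 * between the n rows of `data` (each row has `len` values).
 * Iteration i only writes row i of `out`, so iterations are independent. */
void pairwise_matrix(const double *data, int n, size_t len, double *out)
{
    /* Ignored by compilers that are not run with OpenMP enabled,
     * in which case the loop simply executes serially. */
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            out[(size_t)i * n + j] =
                toy_dissimilarity(data + (size_t)i * len,
                                  data + (size_t)j * len, len);
        }
    }
}
```

With GCC or Clang this needs -fopenmp to actually run in parallel. In an R package the usual route is to add $(SHLIB_OPENMP_CFLAGS) to both PKG_CFLAGS and PKG_LIBS in Makevars. Note that omp.h is only required if the code calls OpenMP runtime functions such as omp_get_max_threads(); the pragma alone does not need it.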

Longer answer: I think there's still no great cross-platform solution, and it still depends on what kind of parallelization you want (local vs. cluster, and a parallel *ply vs. something more cleverly split). parallel has been shipping with R since 2.14 and provides two high-level interfaces (mclapply() and the par*apply family), but mclapply() doesn't work on Windows because it relies on forking, and the others need a bit more setup (you basically have to set up and manage a "cluster" using sockets or forks even for local work -- doesn't seem hard, but it was too fiddly for me). It also provides all the lower-level process management stuff that supports the above.

Personally, I really like foreach with various %dopar% back ends because it's more flexible than the *ply stuff in parallel but without the extra setup. But last I checked, there was not a single parallel back end that worked cross-platform.

Hope this helps! Depending on what exactly my next step is after this summer, I might be interested in poking at this further.

mshvartsman avatar Jun 02 '14 18:06 mshvartsman

Thanks for the explanations. I'll probably go with %dopar%. I guess that the code should use only one process by default. Otherwise the user might end up with too many processes when scasim is used in a script that is itself parallelized. The best solution is probably to have a parameter in the scasim function that controls the number of processes but defaults to one.
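
The same "defaults to one" design can also be sketched on the C side with OpenMP threads instead of %dopar% processes. This is only an illustration of the idea, not scasim's interface: effective_threads(), sum_of_squares() and the n_threads argument are hypothetical, and the loop body is a trivial placeholder for real work.

```c
#include <stddef.h>

/* Hypothetical helper: any request below 1 falls back to the default of
 * one worker, so nested parallel callers do not oversubscribe the CPU. */
int effective_threads(int requested)
{
    return (requested < 1) ? 1 : requested;
}

/* Placeholder computation: sum of squares of x[0..n-1].
 * n_threads is an assumed, caller-supplied knob; pass 0 for the default. */
double sum_of_squares(const double *x, int n, int n_threads)
{
    int threads = effective_threads(n_threads);
    double total = 0.0;

    (void)threads; /* silences "unused" warnings when OpenMP is disabled */

    /* num_threads() caps the team size for this loop only; the reduction
     * clause gives each thread a private partial sum that is combined
     * at the end of the loop. */
    #pragma omp parallel for num_threads(threads) reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += x[i] * x[i];
    }
    return total;
}
```

A caller that is already running inside a parallelized script would pass 0 (or 1) and get serial behavior, while an interactive user could pass a larger value explicitly. Because floating-point addition is not associative, results with more than one thread can differ from the serial result in the last digits.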

tmalsburg avatar Jun 09 '14 14:06 tmalsburg