sl3
Multicore processing in future is being deprecated in RStudio
Hey there -- for the parallelization vignette, this may be important news
Actually, yeah, this is a big deal -- I swapped from multicore to multisession and got a 25x speedup (I think multicore is already falling back on "sequential"). I was wondering why things were going so slowly despite using the parallelization framework...
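For anyone following along, here is a minimal sketch of the switch described above. It assumes the future package is installed. The worker count of 2 is an arbitrary illustration, not a recommendation, and the names n_workers and worker_pid are made up for this example.

```r
library(future)

# supportsMulticore() reports whether forked (multicore) processing is
# supported in the current environment. When it is not, fall back to
# multisession, which uses separate background R sessions instead.
if (supportsMulticore()) {
  plan(multicore, workers = 2)
} else {
  plan(multisession, workers = 2)
}

# Number of workers the active plan provides. A value of 1 here would
# mean the work is effectively running sequentially.
n_workers <- nbrOfWorkers()
print(n_workers)

# Evaluate one future and fetch the process ID it ran in, to confirm the
# work happened outside the main R session.
f <- future(Sys.getpid())
worker_pid <- value(f)
print(c(main = Sys.getpid(), worker = worker_pid))

# Reset to the default plan, which also shuts down background workers.
plan(sequential)
```

If nbrOfWorkers() comes back as 1 under the multicore plan, that would be consistent with the silent fallback to sequential processing suspected above.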
Good to know. Did you get the warning mentioned in that future issue when you tried multicore in RStudio?
Yup.
I'm also wondering -- in the vignette, 12 was chosen as the number of logical cores. But if it's spinning up a bunch of threads, is there any reason not to spin up something like 100 threads (or sessions, in this case) and just let the CPU handle them as it sees fit? That way it would stay at 100% utilization all the time. I imagine the answer may be "scheduler overhead", but limiting to 12 seems to keep total CPU utilization below 100%.
Can confirm that setting up 30 workers on my 12-thread laptop seems to be a solid move for maxing out utilization.
The number of threads was based on the machine used to run the vignette.
That said, in my experience, I've found that even hyperthreading (using 2 threads/core) either has no effect or can hurt performance compared with using 1 thread/core. Also, ML in R is almost always memory bound and not CPU bound, unless you have a machine built specifically to have a ton of memory relative to the number of cores. You'll see that kernel_task process in your screenshot, which can indicate swapping memory onto your disk when you run out of physical memory. That is going to hurt your performance a lot. Pegging your CPU at 100% does not always result in the shortest run times. YMMV, feel free to play around and benchmark different numbers of threads for your specific compute resources and application.
Very fair argument about being memory-limited. I think I avoided swapping because my data isn't too large, but I would agree that memory is generally the rate-limiting factor relative to CPU usage. Totally agree it's a YMMV situation -- may be worth mentioning the tradeoff in the vignette.
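One rough way to run that kind of benchmark is sketched below, using only the future package. The slow_task() function is a hypothetical CPU-bound stand-in for a real learner fit, and time_with_workers(), the task count, and the candidate worker counts are all illustrative assumptions. Keep in mind that each multisession worker is a separate R process with its own copy of the data it needs, so memory use grows with the number of workers.

```r
library(future)

# Hypothetical CPU-bound task standing in for a real model fit.
slow_task <- function(i) {
  sum(sqrt(seq_len(1e6)))
  i
}

# Time a fixed batch of tasks under a multisession plan with n_workers
# workers and return the elapsed wall-clock seconds.
time_with_workers <- function(n_workers, n_tasks = 4) {
  plan(multisession, workers = n_workers)
  on.exit(plan(sequential), add = TRUE)  # always shut the workers down
  system.time({
    futures <- lapply(seq_len(n_tasks), function(i) future(slow_task(i)))
    results <- lapply(futures, value)
  })[["elapsed"]]
}

# Candidate worker counts: 1, 2, and whatever this machine reports.
worker_counts <- unique(c(1L, 2L, as.integer(availableCores())))

timings <- vapply(worker_counts, time_with_workers, numeric(1))
print(data.frame(workers = worker_counts, seconds = timings))
```

A toy task like this will not reproduce memory pressure, so for a meaningful comparison slow_task() would need to be replaced with the actual workload and data.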