
Multithreading in Emulation/MCMC

odunbar opened this issue · 2 comments

Issue

Easy gains are available in MCMC by using multithreading within each step (launching Julia with e.g. julia --project -t 8 script.jl). For the GP (and scalar RF) implementations,

  • the prediction stage runs a loop over the scalar-valued models, one per decorrelated output dimension (a sketch of this loop pattern follows the list).
  • the training stage also runs a loop over the scalar-valued models (here it may require extra memory management).
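For concreteness, here is a minimal sketch of the serial pattern described above. The names (predict_scalar, predict_all) are illustrative placeholders, not the package's actual API; the real loop lives in src/GaussianProcess.jl.

```julia
# Sketch of the serial pattern (illustrative names, not the package's
# actual API): prediction loops over M independent scalar-valued
# models, one per decorrelated output dimension.
function predict_all(models, new_inputs::AbstractMatrix)
    M = length(models)
    N = size(new_inputs, 2)
    means = zeros(M, N)
    variances = zeros(M, N)
    for i in 1:M
        # predict_scalar is a hypothetical per-model predictor
        means[i, :], variances[i, :] = predict_scalar(models[i], new_inputs)
    end
    return means, variances
end
```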

Suggestion

  1. For MCMC, add the decoration Threads.@threads for i in 1:M to the loop at https://github.com/CliMA/CalibrateEmulateSample.jl/blob/bf3df405753033e852b91c19d5cb11470dfdc91f/src/GaussianProcess.jl#L197-L199. This should speed up prediction within MCMC by roughly the thread count, e.g. 8x (see the threaded sketch after this list).

  2. For decorrelated problems (i.e. GP and scalar RF), the models can similarly be trained with such loop decorations, which should speed up training by a similar factor, e.g. 8x.
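A hedged sketch of the threaded variant (same illustrative names as in the sketch above): since each output dimension's model is independent and each iteration writes only to its own rows, the loop is safe to parallelize without locking.

```julia
using Base.Threads

function predict_all_threaded(models, new_inputs::AbstractMatrix)
    M = length(models)
    N = size(new_inputs, 2)
    means = zeros(M, N)
    variances = zeros(M, N)
    # Each iteration touches only row i, so iterations are independent
    # and can be distributed across the threads started with `-t 8`.
    Threads.@threads for i in 1:M
        means[i, :], variances[i, :] = predict_scalar(models[i], new_inputs)
    end
    return means, variances
end
```

The training loop can be decorated the same way, with the caveat noted above that each thread may then need its own working buffers.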

odunbar · Jan 18 '23

Preliminary results from @szy21 show that 8 threads give only a 2x speed-up to sampling in the EDMF example; I'll continue the investigation with other examples.

odunbar · Jan 23 '23

Oftentimes the downstream dependencies will greedily harness all available threads themselves, so calling with -t 8 and making no code changes (i.e. not adding Threads.@threads) often already gives a significant speedup.
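One plausible explanation (an assumption, not established in this thread) is that the BLAS library behind the GP linear algebra runs its own thread pool independently of Julia's -t flag. A small sketch for inspecting both pools, and for avoiding oversubscription if the two are combined:

```julia
using LinearAlgebra

# Julia-level threads, set by `julia -t 8`:
@show Threads.nthreads()

# BLAS maintains a separate pool and will often use all cores by
# default, which would explain speedups with no code changes:
@show BLAS.get_num_threads()

# If Threads.@threads is added around BLAS-heavy work, pinning BLAS
# to one thread per task avoids oversubscription:
# BLAS.set_num_threads(1)
```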

odunbar · Apr 04 '23