
Poor scaling with core count?

Open · conduit242 opened this issue 3 years ago · 2 comments

I have been benchmarking some modeling on my laptop with 6 cores @ 2.6 GHz against a 32-core machine with faster Intel Platinum cores @ 2.9 GHz. pyGAM has no trouble using all the cores in either case, but for some reason the runs are only modestly faster (~25%) on the 32-core machine when training on identical data. This particular case involves training many distinct models on small sets of 2D data, each with its own gridsearch. Any idea why this might be happening?
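For reference, the workload looks roughly like this minimal sketch (the dataset count, shapes, and random data are illustrative placeholders, not my actual setup):

```python
# Illustrative only: many small, independent 2D fits, each running
# its own smoothing-parameter gridsearch.
import numpy as np
from pygam import LinearGAM, s

rng = np.random.default_rng(0)
datasets = [
    (rng.standard_normal((200, 2)), rng.standard_normal(200))
    for _ in range(100)
]

# gridsearch() with keep_best=True (the default) returns the fitted GAM.
models = [
    LinearGAM(s(0) + s(1)).gridsearch(X, y, progress=False)
    for X, y in datasets
]
```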

conduit242 · Mar 11 '21 21:03

Any thoughts on this @dswah?

conduit242 · Jul 21 '21 21:07

@conduit242 This may be more about overhead than the implementation. There is coordination overhead whenever you run on multiple cores. If resolving that overhead takes longer than solving the problem itself, you may not see a speedup. For example, I run hundreds of distinct models that are pretty simple. Each of them takes less than a minute to run, but running them sequentially takes forever. So instead I multiprocess them; all of them can then be run in about 2 minutes total.

Doing this requires locking down the number of threads that numpy can use, which can be achieved by setting the following environment variables before any other modules are imported:

```python
import os

os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"
```

You would then just need to multiprocess your models.
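Roughly like this, as a minimal sketch: the `fit_one` helper, dataset layout, and model spec are assumptions for illustration, not something pyGAM provides.

```python
# Thread limits must be set before numpy is imported, so they go at
# the very top of the module (workers re-import the module on spawn,
# and inherit the environment on fork).
import os

os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

from multiprocessing import Pool

import numpy as np
from pygam import LinearGAM, s


def fit_one(data):
    """Fit one 2D GAM with its own gridsearch, single-threaded."""
    X, y = data
    return LinearGAM(s(0) + s(1)).gridsearch(X, y, progress=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    datasets = [
        (rng.standard_normal((200, 2)), rng.standard_normal(200))
        for _ in range(100)
    ]
    # One single-threaded fit per worker; Pool() defaults to one
    # process per available core.
    with Pool() as pool:
        models = pool.map(fit_one, datasets)
```

The point of the design is to move the parallelism up a level: instead of every fit fighting over all cores through BLAS/OpenMP threads, each core runs one whole fit at a time.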

javier5109 · Mar 29 '22 20:03