
Parallel CV

Open rth opened this issue 3 years ago • 1 comments

It would be useful to be able to run cross-validation in parallel. This was also requested by @zhangJianfeng in https://github.com/paris-saclay-cds/ramp-workflow/issues/250#issuecomment-723917761

There are two use cases here:

  • Local training. For scikit-learn models, for example, this would most often be faster.
  • Submissions on the server, where resources are currently not used optimally. For instance, to avoid CPU oversubscription we reserve a number of CPUs for each worker (via CPU affinity). For submissions that don't use multi-processing or threading, those reserved CPUs then sit unused. Even for submissions that have some level of parallelism via BLAS for parts of the code, running cross-validation in parallel would likely be an improvement.
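To make the idea concrete, here is a minimal sketch of running folds concurrently. It uses only the standard library's `concurrent.futures` rather than joblib, and `k_fold_indices`, `train_and_score` and `cross_validate` are hypothetical names for illustration, not part of ramp-workflow. The "model" is a trivial mean predictor standing in for a real submission.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def k_fold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for contiguous K-fold splits."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def train_and_score(y, train_idx, test_idx):
    """Hypothetical stand-in for fitting and scoring a submission on one fold.

    Predicts the mean of the training targets and returns the mean absolute
    error on the held-out fold.
    """
    prediction = mean(y[i] for i in train_idx)
    return mean(abs(y[i] - prediction) for i in test_idx)


def cross_validate(y, n_splits=3, n_jobs=1, executor_cls=ThreadPoolExecutor):
    """Score every fold, sequentially if n_jobs == 1, else through a pool."""
    folds = list(k_fold_indices(len(y), n_splits))
    if n_jobs == 1:
        return [train_and_score(y, tr, te) for tr, te in folds]
    with executor_cls(max_workers=n_jobs) as pool:
        futures = [pool.submit(train_and_score, y, tr, te) for tr, te in folds]
        # Collect in submission order so scores line up with fold numbers.
        return [f.result() for f in futures]
```

Threads are used here only so the sketch runs anywhere. For CPU-bound training one would more likely pass `ProcessPoolExecutor` as `executor_cls` (or use joblib), in which case the fold function and its arguments have to be picklable.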

There are two potential issues:

  • Currently, using TensorFlow with joblib results in CPU oversubscription, because threadpoolctl is not able to limit the number of threads TensorFlow uses: https://github.com/joblib/threadpoolctl/issues/84
  • as mentioned by @albertcthomas some models might not be picklable
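For the picklability concern, one possible approach is to check up front and fall back to sequential cross-validation. The following is only a sketch under that assumption; `is_picklable` and `effective_n_jobs` are hypothetical helpers, not existing ramp-workflow or joblib functions.

```python
import pickle


def is_picklable(obj):
    """Return True if obj can be serialized with the standard pickle module."""
    try:
        pickle.dumps(obj)
    except Exception:  # PicklingError, AttributeError, TypeError, ...
        return False
    return True


def effective_n_jobs(model, requested_n_jobs):
    """Hypothetical helper: use a single job when the model cannot be pickled."""
    if requested_n_jobs > 1 and not is_picklable(model):
        return 1
    return requested_n_jobs
```

This check is likely conservative: process-based backends that serialize with cloudpickle may accept objects that the standard `pickle` module rejects.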

In any case, having this as a CLI option (disabled by default) for ramp-test could be a start.
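As a rough illustration of what "disabled by default" could look like, here is a standalone `argparse` sketch. The `--n-jobs` flag name and the `cv-demo` program are assumptions made up for this example; this is not the actual ramp-test interface.

```python
import argparse


def build_parser():
    """Build a toy parser with a hypothetical opt-in parallelism flag."""
    parser = argparse.ArgumentParser(prog="cv-demo")
    parser.add_argument(
        "--n-jobs",
        type=int,
        default=1,  # sequential unless the user opts in
        help="number of CV folds to run concurrently (hypothetical option)",
    )
    return parser


def parse_n_jobs(argv=None):
    """Parse argv and return a validated number of jobs."""
    args = build_parser().parse_args(argv)
    if args.n_jobs < 1:
        raise ValueError("--n-jobs must be a positive integer")
    return args.n_jobs
```

With a default of 1, existing invocations keep their current sequential behavior, and parallel cross-validation only kicks in when requested explicitly.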

rth, Jun 25 '21 08:06

Thanks for starting the discussion @rth. This would indeed be a very nice feature.

albertcthomas, Jun 25 '21 09:06