ramp-workflow
Parallel CV
It would be useful to be able to run cross-validation in parallel. This was also requested by @zhangJianfeng in https://github.com/paris-saclay-cds/ramp-workflow/issues/250#issuecomment-723917761
There are two use cases here:
- local training: for scikit-learn models, for example, this would most often be faster
- submissions on the server, where resources are currently not used optimally. For instance, to avoid CPU oversubscription we reserve a fixed number of CPUs for each worker (via CPU affinity). For submissions that don't use multi-processing or threading, this leaves the reserved CPUs idle. Even for submissions that get some level of parallelism via BLAS for parts of the code, running cross-validation in parallel would likely be an improvement.
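
To make the idea concrete, here is a minimal sketch of running CV folds in parallel. It is not ramp-workflow code: `k_fold_indices`, `train_and_score_fold` and `cross_validate` are hypothetical names, the "model" is a toy mean predictor, and the standard library's `concurrent.futures` is used only to keep the example dependency-free (in practice joblib would probably be the more natural choice).

```python
from concurrent.futures import ProcessPoolExecutor
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous k-fold splits."""
    start = 0
    for i in range(n_folds):
        # Spread the remainder over the first folds.
        size = n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def train_and_score_fold(task):
    """Toy stand-in for one fold: predict the training mean, return the test MAE."""
    y, train_idx, test_idx = task
    prediction = mean(y[i] for i in train_idx)
    return mean(abs(y[i] - prediction) for i in test_idx)


def cross_validate(y, n_folds=3, n_jobs=1, executor_cls=ProcessPoolExecutor):
    """Score every fold, sequentially if n_jobs == 1, otherwise in a worker pool."""
    tasks = [(y, train, test) for train, test in k_fold_indices(len(y), n_folds)]
    if n_jobs == 1:
        return [train_and_score_fold(task) for task in tasks]
    with executor_cls(max_workers=n_jobs) as executor:
        # executor.map preserves the fold order in the returned scores.
        return list(executor.map(train_and_score_fold, tasks))


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print(cross_validate(targets, n_folds=3, n_jobs=1))
    print(cross_validate(targets, n_folds=3, n_jobs=3))
```

The folds are independent, so the sequential and parallel paths should return the same scores; only the wall-clock time differs. On the server, `n_jobs` would presumably need to stay within the CPUs reserved for the worker.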
There are two potential issues:
- currently, using TensorFlow with joblib results in CPU oversubscription, because threadpoolctl is not able to limit the number of threads TensorFlow uses: https://github.com/joblib/threadpoolctl/issues/84
- as mentioned by @albertcthomas some models might not be picklable
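
For the pickling concern, one possible guard (a sketch only, with hypothetical helper names `is_picklable` and `effective_n_jobs`) would be to test whether the model can be serialized and fall back to sequential CV if it cannot. It checks against the standard `pickle` module; the serializer actually used by a given parallel backend may accept more or fewer objects, so this is an approximation.

```python
import pickle


def is_picklable(obj):
    """Return True if obj can be serialized with the standard pickle module."""
    try:
        pickle.dumps(obj)
    except Exception:
        # Different object types fail with different exception classes.
        return False
    return True


def effective_n_jobs(model, requested_n_jobs):
    """Use the requested parallelism only if the model can be sent to workers."""
    return requested_n_jobs if is_picklable(model) else 1


if __name__ == "__main__":
    print(effective_n_jobs({"alpha": 0.1}, 4))  # plain dict: picklable
    print(effective_n_jobs(lambda x: x, 4))     # lambda: not picklable
```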
In any case, having this as a CLI option (disabled by default) for ramp-test could be a start.
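
As a rough illustration of what "disabled by default" could look like, here is a sketch using `argparse`. The `--n-jobs` flag name and the `build_parser` helper are assumptions made for this example; this is not the existing ramp-test interface.

```python
import argparse


def build_parser():
    """Build a toy parser with a hypothetical opt-in parallel CV option."""
    parser = argparse.ArgumentParser(prog="ramp-test-sketch")
    parser.add_argument(
        "--n-jobs",
        type=int,
        default=1,  # default of 1 keeps cross-validation sequential
        help="number of CV folds to run in parallel (hypothetical option)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"running cross-validation with n_jobs={args.n_jobs}")
```

With a default of 1, existing invocations would keep their current behaviour, and parallel CV would only kick in when explicitly requested.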
Thanks for starting the discussion @rth. This would indeed be a very nice feature.