
Use DASK to support caching and parallelism

Open PGijsbers opened this issue 7 years ago • 1 comments

Perhaps gama could switch from evaluating multiple pipelines in parallel, each on a single core, to evaluating one pipeline at a time across multiple cores. This could improve performance, e.g. through lower memory use.
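To make the two scheduling strategies concrete, here is a minimal sketch. It is not gama's code: `evaluate`, `InFlightCounter`, and the pipeline names are hypothetical stand-ins, and threads are used only to keep the example self-contained. The number of evaluations in flight at once serves as a rough proxy for memory use.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InFlightCounter:
    """Tracks how many evaluations run at the same time (proxy for memory use)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.current -= 1
        return False


def evaluate(pipeline, n_jobs, counter):
    """Hypothetical evaluation: n_jobs is accepted but not used by this toy."""
    with counter:
        time.sleep(0.02)  # stand-in for fitting and scoring the pipeline
        return len(pipeline)  # dummy score


pipelines = ["tree", "knn", "svm", "forest"]
n_cores = 4

# Strategy A: several pipelines at once, one core each.
parallel_counter = InFlightCounter()
with ThreadPoolExecutor(max_workers=n_cores) as pool:
    parallel_scores = list(
        pool.map(lambda p: evaluate(p, 1, parallel_counter), pipelines)
    )

# Strategy B: one pipeline at a time, all cores given to it.
sequential_counter = InFlightCounter()
sequential_scores = [evaluate(p, n_cores, sequential_counter) for p in pipelines]

print("peak in flight (A):", parallel_counter.peak)
print("peak in flight (B):", sequential_counter.peak)
```

Strategy B never has more than one pipeline (and its copy of the data) alive at a time, which is where the memory saving would come from. Whether it is actually faster depends on how well each individual pipeline scales across cores.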

PGijsbers avatar Aug 31 '18 16:08 PGijsbers

Beyond the pipelines themselves, we currently retrain the encoding/imputation models for every pipeline. These steps are very quick, but repeating them is still wasteful. (This could also be avoided by doing the preprocessing once in 5-fold CV and then transferring that dataset to the search, although that does not generalize to multi-fidelity techniques without adaptations.)
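A rough sketch of the caching idea, assuming the preprocessing is fitted once per CV fold and reused by every candidate pipeline. All names here (`fit_mean_imputer`, `preprocessed_fold`, the toy folds) are hypothetical and not part of gama; a mean imputer on a single column stands in for the real encoding/imputation steps.

```python
from statistics import mean

fit_calls = 0  # counts how often the toy imputation model is fitted


def fit_mean_imputer(train_column):
    """Toy 'imputation model': the mean of the observed values."""
    global fit_calls
    fit_calls += 1
    return mean(v for v in train_column if v is not None)


def impute(column, fill):
    """Replace missing values (None) with the fitted fill value."""
    return [fill if v is None else v for v in column]


_fold_cache = {}


def preprocessed_fold(fold_id, train_column):
    """Fit and apply the preprocessing once per fold, then reuse the result."""
    if fold_id not in _fold_cache:
        fill = fit_mean_imputer(train_column)
        _fold_cache[fold_id] = impute(train_column, fill)
    return _fold_cache[fold_id]


# Two toy folds instead of five, each a single column with a missing value.
folds = {0: [1.0, None, 3.0], 1: [2.0, 4.0, None]}
candidate_pipelines = ["tree", "knn", "svm"]

for _ in candidate_pipelines:
    for fold_id, column in folds.items():
        data = preprocessed_fold(fold_id, column)
        # ... the candidate would be fitted and scored on `data` here ...

print(fit_calls)  # one fit per fold, not one per (pipeline, fold) pair
```

The cache key is only the fold id, which is why this does not carry over directly to multi-fidelity techniques: a pipeline evaluated on a subsample would need its own preprocessing fit, so the key would at least have to include the subset used.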

PGijsbers avatar Nov 18 '19 10:11 PGijsbers

Closing this because I don't think DASK is the right tool, and the issue is a bit too broad. Will open a new issue to look into pipeline caching.

PGijsbers avatar Sep 16 '22 08:09 PGijsbers