Warm Start for Ensembles
This will involve a larger discussion on how we want to integrate this and what form the warm-start feature should take.
I currently see three implementations we could consider (just the ones I can think of; please add or subtract):
- Issue #1739
- This involves a basic neural network that would be limited in its learning power but would provide some sort of probabilistic output to help with the ensemble process (a rough sketch follows this list). This might not even be possible; I had a hard time finding any implementation of this outside of deep learning applications. Still, I felt it had to be included as an option, given how frequently neural networks are requested (per my discussion with @tyler3991) and how it would kickstart building the backend for supporting Keras models if we choose to push into neural networks in the future.
- Leveraging meta-learning prior to running Bayesian optimization (see the second sketch below). This would involve collecting more meta-features on the datasets we have, as well as collecting more datasets (especially for regression). The best hyperparameters found for each of these datasets would be saved; whenever a new dataset enters the pipeline, the datasets closest to it in meta-feature space would supply their hyperparameters as a warm start. Here is the relevant paper with some key parts highlighted: https://alteryx.quip.com/vGzbAReh6NE0/AutoML-Paper
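To make the neural-network option concrete, here is a minimal sketch of feeding a small network's class probabilities into a stacked ensemble. It uses scikit-learn's `MLPClassifier` and `StackingClassifier` purely as stand-ins (a real implementation would presumably use a Keras model and evalml's own ensemble machinery); every name and parameter below is illustrative, not a proposal for the actual API.

```python
# Sketch: a small neural network contributes probabilistic output
# (predict_proba) as meta-features for a stacking ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=0)),
        # The basic NN; limited learning power, but its probabilities
        # give the meta-learner a soft signal to combine.
        ("nn", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",  # feed probabilities, not hard labels
)
stack.fit(X, y)
print(stack.predict_proba(X[:5]))
```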
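And here is a minimal sketch of the meta-learning warm start from the last option, assuming we already have an offline store of meta-features and best-known hyperparameters per reference dataset. The meta-features, the stored configs, and the `warm_start_configs` helper are all hypothetical; the point is just the nearest-datasets-in-meta-feature-space lookup that would seed Bayesian optimization.

```python
# Sketch: look up the hyperparameters of the reference datasets closest
# to a new dataset in meta-feature space, and use them as a warm start.
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Offline store: one row of meta-features per reference dataset,
# e.g. (log n_rows, log n_cols, class balance). Values are made up.
meta_features = np.array([
    [9.2, 3.1, 0.50],
    [6.9, 2.3, 0.10],
    [11.5, 4.6, 0.45],
])
best_params = [
    {"n_estimators": 200, "max_depth": 8},
    {"n_estimators": 50, "max_depth": 3},
    {"n_estimators": 400, "max_depth": 12},
]

def warm_start_configs(new_meta, k=2):
    """Return the stored hyperparameters of the k nearest reference
    datasets, to be evaluated first before Bayesian optimization
    takes over the search."""
    nn = NearestNeighbors(n_neighbors=k).fit(meta_features)
    _, idx = nn.kneighbors([new_meta])
    return [best_params[i] for i in idx[0]]

print(warm_start_configs([9.0, 3.0, 0.40]))
```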