BestPractices
include hyperparameter tuning
Add a section on hyperparameter tuning, since the classical models were used with their default hyperparameters.
Here are some suggestions for the commentary in the markdown cell about hyperparameter optimization. Feel free to edit as needed.
- If evaluations are very inexpensive (i.e. millions of evaluations are feasible), go with grid-based, random, or Sobol sampling via e.g. sklearn.model_selection.GridSearchCV, sklearn.model_selection.RandomizedSearchCV, or skopt.sampler.Sobol, respectively. Grid-based may be good enough, but random is generally better than grid-based, and Sobol is generally better than random. To integrate Sobol points with a CV search, see e.g. sklearn.model_selection.cross_validate (a sketch of random search and Sobol sampling is given after this list).
- If evaluations are moderately inexpensive (i.e. tens of thousands of evaluations are feasible), go with a genetic algorithm via e.g. sklearn-genetic-opt or TPOT (a sketch using sklearn-genetic-opt is given after this list).
- If evaluations are very expensive (i.e. only hundreds of evaluations are feasible), go with Bayesian optimization via e.g. skopt.BayesSearchCV or Ax (a sketch using BayesSearchCV is given after this list). BayesSearchCV is more lightweight and requires the models being optimized to match the scikit-learn estimator API. Ax has much more sophisticated Bayesian models, including automatic relevance determination (ARD) and corresponding feature importances, advanced handling of noise, and capabilities for high-dimensional search spaces. It also has several interfaces ranging from easy-to-use to heavily customizable and is a tool that we recommend.
- There may be other reasons, in addition to the expense of model evaluation, that can guide the choice of hyperparameter optimization scheme, such as interpretability and ease of use.
- In our case, due to [inexpensive/moderately expensive/expensive] model evaluations for the sklearn models and to maintain a lightweight environment, we choose to use [GridSearchCV/sklearn-genetic-opt/skopt.BayesSearchCV]; however, other options could have been used instead.