BestPractices
include hyperparameter tuning
Add a section on hyperparameter tuning, since the classical models were used with their default hyperparameters.
Here are some suggestions for the commentary in the markdown cell about hyperparameter optimization. Feel free to edit as needed.
- If evaluations are very inexpensive (i.e. millions of evaluations are feasible), go with grid-based, random, or Sobol sampling via e.g. sklearn.model_selection.GridSearchCV, sklearn.model_selection.RandomizedSearchCV, or skopt.sampler.Sobol, respectively. Grid-based may be good enough, but random is generally better than grid-based, and Sobol is generally better than random. To integrate Sobol points with a CV search, see e.g. sklearn.model_selection.cross_validate (a sketch of random search and Sobol sampling is given after this list).
- If evaluations are moderately inexpensive (i.e. tens of thousands of evaluations are feasible), go with a genetic algorithm via e.g. sklearn-genetic-opt or TPOT (a sketch using sklearn-genetic-opt is given after this list).
- If evaluations are very expensive (i.e. only hundreds of evaluations are feasible), go with Bayesian optimization via e.g. skopt.BayesSearchCV or Ax (a sketch using BayesSearchCV is given after this list). BayesSearchCV is more lightweight and requires the models being optimized to match the scikit-learn estimator API. Ax has much more sophisticated Bayesian models, including automatic relevance determination (ARD) and corresponding feature importances, advanced handling of noise, and capabilities for high-dimensional search spaces. It also has several interfaces ranging from easy-to-use to heavily customizable and is a tool that we recommend.
- There may be other reasons, in addition to the expense of model evaluation, that can guide the choice of hyperparameter optimization scheme, such as interpretability and ease of use.
- In our case, due to [inexpensive/moderately expensive/expensive] model evaluations for the sklearn models and to maintain a lightweight environment, we choose to use [GridSearchCV/sklearn-genetic-opt/skopt.BayesSearchCV]; however, other options could have been used instead.