[ENH] `sklearn` compatible tuning wrapper estimator
I would suggest exposing the tuners as sklearn-compatible tuning wrappers, e.g.,
`HyperactiveCV(sklearn_estimator, config)`,
or
`HyperactiveCV(sklearn_estimator, hyperopt_tuning_algo, config)`,
where `HyperactiveCV` inherits from sklearn `BaseEstimator` and gets tested by `parametrize_with_checks` in the CI.
This is the estimator I'd use as a template: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
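To make the idea concrete, here is a minimal sketch of what such a wrapper could look like. `HyperactiveCV`, its constructor arguments, and the exhaustive candidate loop are placeholders for illustration, not existing Hyperactive API:

```python
# Hypothetical sketch of the proposed wrapper; not existing Hyperactive API.
from itertools import product

import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.model_selection import cross_val_score


class HyperactiveCV(BaseEstimator):
    """Tune `estimator` over `param_space` and expose the sklearn API."""

    def __init__(self, estimator, param_space, cv=5):
        self.estimator = estimator
        self.param_space = param_space  # here: a dict of lists, i.e. a grid
        self.cv = cv

    def _candidates(self):
        # Enumerate the grid; a real implementation would delegate this
        # step to a Hyperactive optimizer instead of exhaustive iteration.
        keys = list(self.param_space)
        for values in product(*self.param_space.values()):
            yield dict(zip(keys, values))

    def fit(self, X, y):
        best_score = -np.inf
        for params in self._candidates():
            estimator = clone(self.estimator).set_params(**params)
            score = cross_val_score(estimator, X, y, cv=self.cv).mean()
            if score > best_score:
                best_score, self.best_params_ = score, params
        self.best_estimator_ = (
            clone(self.estimator).set_params(**self.best_params_).fit(X, y)
        )
        return self

    def predict(self, X):
        return self.best_estimator_.predict(X)
```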
scikit-learn extension and API compliance testing guide:
https://scikit-learn.org/stable/developers/develop.html
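Assuming a wrapper like the sketch above, the CI compliance test from that guide would look roughly like this (`parametrize_with_checks` is the actual sklearn helper):

```python
# pytest-based API compliance test, as recommended by the sklearn guide.
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.estimator_checks import parametrize_with_checks


@parametrize_with_checks(
    [HyperactiveCV(DecisionTreeClassifier(), {"max_depth": [2, 3]})]
)
def test_sklearn_api_compliance(estimator, check):
    check(estimator)
```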
I wrote the documentation for a possible design of the sklearn integration API in version 4.8 to show my current progress on this issue: https://simonblanke.github.io/hyperactive-documentation/4.8/integrations/sklearn/
Thoughts or suggestions on the progress or missing features?
Looks reasonable, although the docstring seems to imply the search space can only be a grid. Is this intended?
> Is this intended?
It is. What should the search space look like instead?
I was thinking of an abstract search space. In sklearn's grid search one can specify unions of grids, and `RandomizedSearchCV` even accepts continuous ranges or distributions.
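For reference, both of these are existing sklearn/scipy features, not Hyperactive API:

```python
from scipy.stats import loguniform
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

# Union of grids: a list of dicts, each searched as a separate grid.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": [0.01, 0.1]},
]
grid_search = GridSearchCV(SVC(), param_grid)

# Continuous distributions: sampled rather than enumerated.
param_distributions = {"C": loguniform(1e-3, 1e2), "gamma": loguniform(1e-4, 1)}
random_search = RandomizedSearchCV(SVC(), param_distributions, n_iter=20)
```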
I did not have this on my radar, to be honest. I knew of the distributions for `RandomizedSearchCV`, though.
I would like to add some or all of those features to Hyperactive. I see two ways of getting there:
- Add this "abstract" search-space feature to Hyperactive in general. Once it exists, I can continue with the sklearn integration.
- Or release the sklearn integration without this feature and add it in a later version.

I would go with the first option.
With this, I had an abstract search space in mind whose type is specific to the tuner: https://github.com/SimonBlanke/Hyperactive/issues/93
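To illustrate what a tuner-specific search-space type could mean, here is a hypothetical sketch; none of these classes exist in Hyperactive:

```python
# Hypothetical dimension types; each tuner would declare which it accepts.
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Categorical:
    values: List  # finite set of candidates: valid for every tuner


@dataclass
class Real:
    low: float  # continuous range: only for tuners that support it
    high: float


# A grid-based tuner would accept only discrete dimensions ...
GridSpace = Dict[str, Categorical]
# ... while e.g. a Bayesian tuner could accept mixed spaces.
BayesSpace = Dict[str, Union[Real, Categorical]]
```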
For now I would suggest supporting a minimal version; it can later be extended without breaking lower-version interfaces.
> For now I would suggest supporting a minimal version; it can later be extended without breaking lower-version interfaces.
I suppose this would come in addition to the current interface, to avoid a major version change? From what I can tell, this interface would be suited to act as a general optimization interface in the future and could replace the current one.
I would suggest adding this new interface as a "beta" or "experimental" feature, similar to this. That signals that it might be subject to changes within a major version, and it leaves us free to refine the interface with more flexibility. Once the new interface is ready to replace the current one, we can release it in a new major version.
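One common way to signal this is a warning when the experimental module is imported; this is a sketch of the general pattern, and the module path is made up:

```python
# Sketch: warn on import of the (hypothetical) experimental module.
import warnings

warnings.warn(
    "hyperactive.integrations.sklearn is experimental; "
    "the API may change within a major version.",
    UserWarning,
)
```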
Edit: Just a small addition to your comment about the abstract search space
> continuous ranges
I just want to make clear that adding support for continuous ranges in general would be quite a challenge. Many optimization algorithms are specialized to work in either discrete or continuous search spaces. The easiest approach would be to add a sampling algorithm to each discrete optimizer, so that it transforms the continuous space into a discrete one internally.
This might work, but it would also just "hide" the discretisation of the search space from the user.
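A rough sketch of that sampling idea; `ContinuousRange` and `discretize` are made-up names for illustration:

```python
# Sketch: transform continuous dimensions into discrete ones internally,
# so a discrete optimizer can consume them. All names are hypothetical.
import numpy as np


class ContinuousRange:
    def __init__(self, low, high):
        self.low, self.high = low, high


def discretize(dimension, n_points=100):
    """Map one search-space dimension to a finite list of candidates."""
    if isinstance(dimension, ContinuousRange):
        return list(np.linspace(dimension.low, dimension.high, n_points))
    if hasattr(dimension, "rvs"):  # scipy.stats-style distribution
        return list(dimension.rvs(size=n_points))
    return list(dimension)  # already discrete


# The user writes a continuous range; the optimizer only sees a grid.
search_space = {"C": ContinuousRange(1e-3, 1e2), "max_depth": [2, 4, 8]}
discrete_space = {key: discretize(dim) for key, dim in search_space.items()}
```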