[API] design for generic optimizer
From our earlier discussion.
I would design a generic interface as follows:
- there are two (interface) classes, `BaseOptimizer` and `BaseExperiment` (or `BaseEvaluator` etc). Both inherit from `skbase` `BaseObject`, so they provide a dataclass-like, sklearn-like composable interface.
  - in particular, `__init__` args must always be explicit, and never positional or kwargs.
  - the `skbase` tag system can be used to collect all the tags, e.g., from GFO things like the type of optimizer (particle etc), whether it is computationally expensive, or the soft dependencies it requires.
- the `BaseExperiment` has a `score` method with the same signature as your "model" currently; its `__call__` also redirects to `score`, so it can be used with the current signature. That's the "basic" interface, but we could also add an interface for gradients, to also cover gradient-based optimizers!
  - a subclass of `BaseExperiment` could, for instance, be: evaluating an `sklearn` classifier by cv on a dataset, so it could be `SklearnExperiment(my_randomforest, X, y, KFold(5))`.
- the `BaseOptimizer` has `__init__`, which passes parameters only, and `add_search`, which has almost the current signature: it takes a `BaseExperiment` descendant instance, and one more object which configures the search space. Search behaviour like `n_iter` would not be passed in `add_search`, but should be an `__init__` arg.
  - to execute the search, I would suggest a `fit` method, as that would be compliant with multiple API naming choices, though I would not mind `run` or `optimize` etc. This method sets attributes on self, ending in `_`, so they are visible via `get_fitted_params`.
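The design above could be sketched roughly as follows. This is only an illustrative mock-up of the proposal, not an implementation: all method bodies, the list-of-dicts search space, and the `Quadratic` example are assumptions for demonstration.

```python
class BaseExperiment:
    """Evaluates a parameter configuration and returns a score."""

    def score(self, params):
        raise NotImplementedError

    def __call__(self, params):
        # __call__ redirects to score, so the current call signature keeps working
        return self.score(params)


class BaseOptimizer:
    """Search behaviour lives in __init__; add_search attaches an experiment."""

    def __init__(self, n_iter=100):
        # search behaviour such as n_iter is an __init__ arg, not an add_search arg
        self.n_iter = n_iter
        self._searches = []

    def add_search(self, experiment, search_space):
        # experiment: a BaseExperiment descendant instance
        # search_space: here, simply a list of candidate parameter dicts
        self._searches.append((experiment, search_space))
        return self

    def fit(self):
        # placeholder search loop: score up to n_iter candidates, keep the best;
        # fitted attributes end in "_" so get_fitted_params could expose them
        best_score, best_params = float("-inf"), None
        for experiment, space in self._searches:
            for params in list(space)[: self.n_iter]:
                s = experiment.score(params)
                if s > best_score:
                    best_score, best_params = s, params
        self.best_score_ = best_score
        self.best_params_ = best_params
        return self
```

Usage would then look like `BaseOptimizer(n_iter=6).add_search(my_experiment, my_space).fit()`, with results readable from `best_params_` and `best_score_`.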
Thoughts?
PS: I'm happy to try to write this if you would like me to. Not right now, as I am busy, but maybe early Oct.
Hello @fkiraly,
I took some time to understand the changes you are proposing. I will show you how I interpreted them, so please correct me if I misunderstood something.
It appears that you want to change the API of Hyperactive so that it is possible to use different optimization backends. This also necessitates implementing an interface (Experiment) that is adapted to certain optimizers. For example: an optimizer that uses gradients also requires an experimental setup that supports gradients.
I would be open to the possibility of optionally selecting other optimization backends for the experiment.
> so it could be `SklearnExperiment(my_randomforest, X, y, KFold(5))`
I do not understand this example, because it would already be covered by the sklearn integration. A separate experiment class for each package (sklearn, xgboost, pytorch) would heavily decrease the flexibility of the interface.
> I would suggest a `fit` method, as that would be compliant with multiple API naming choices
Hyperactive does not fit an estimator at that point in the API. It runs the optimization setup. The `fit` method makes sense in the sklearn integration.
> A separate experiment class for each package (sklearn, xgboost, pytorch) would heavily decrease the flexibility of the interface.
This would be used only for adaptation inside the sklearn adapter. The optimizer optimises the experiment.
You would need at least one experiment per package or unified API, no? But not one per unified API and optimizer.
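The "one experiment per package, not per package-and-optimizer" point can be sketched as follows. All names here (`FunctionExperiment`, `RandomSearchOptimizer`) are hypothetical: the idea is that each package or unified API needs one experiment adapter, while every optimizer depends only on `score`, so N packages and M optimizers cost N + M classes rather than N × M.

```python
import random


class FunctionExperiment:
    """Adapts a plain callable taking a params dict to the experiment interface.

    One such adapter per package (or unified API) would suffice; optimizers
    never see what is behind score().
    """

    def __init__(self, func):
        self.func = func

    def score(self, params):
        return self.func(params)


class RandomSearchOptimizer:
    """Toy optimizer: samples candidates uniformly from a discrete space."""

    def __init__(self, n_iter=50, seed=0):
        self.n_iter = n_iter
        self.rng = random.Random(seed)

    def solve(self, experiment, search_space):
        # search_space: dict mapping parameter name -> list of candidate values
        best_score, best_params = float("-inf"), None
        for _ in range(self.n_iter):
            params = {k: self.rng.choice(v) for k, v in search_space.items()}
            s = experiment.score(params)
            if s > best_score:
                best_score, best_params = s, params
        return best_params, best_score


# usage: the optimizer interacts with the experiment only through score()
exp = FunctionExperiment(lambda p: -(p["x"] - 2) ** 2)
best_params, best_score = RandomSearchOptimizer(n_iter=100).solve(
    exp, {"x": list(range(5))}
)
```

Swapping in a different optimizer, or a different experiment adapter, requires no change to the other side of the interface.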
> Hyperactive does not fit an estimator at that point in the api.
I just mean: why not call the method `fit` instead of `optimize`? It is just a naming question, since `fit` is used so often for data ingestion of any kind.
I think there is a small degree of miscommunication - would you like me to write a design document, or a draft PR (for demo purpose only)?
> I think there is a small degree of miscommunication - would you like me to write a design document, or a draft PR (for demo purpose only)?
That would be great! :-)
Partially implemented here - feedback appreciated!
https://github.com/SimonBlanke/Hyperactive/pull/95
Relevant comment: https://github.com/SimonBlanke/Hyperactive/issues/85#issuecomment-2454119030
This issue was solved during the development of v5.