Hyperactive
[API] folder structure and taxonomy for optimization algorithms
Following up on discussion from #121.
Mid-term, what should the folder structure for optimization algorithms be? Including cases such as:
- the same algorithm coming from two packages (third party), potentially with different parameterization
- multiple algorithms from the same package, e.g., gfo
There are different taxonomies that one can apply here, two discussed were:
- semantic or algorithm type based, e.g., group multiple grid searches in one place
- dependency based, e.g., group all algos from gfo in one place
Thanks for opening this issue. So let's frame this problem: The future goal is to provide gradient-free and gradient-based algorithms from multiple different packages, which have multiple different algorithms. I also want to include some points you made in this comment: #121
see discussion above - grid search is gradient free, so should it not also be in the gfo folder?
Until now I thought that the gfo import directory stands for the package (gradient-free-optimizers). But since we also want to provide gradient-based algorithms in the future, a separate import directory could make sense.
sklearn would mask the popular package of the same name - that is optimally to be avoided
How would it "mask" it? Mask implies hiding it in some way, but the example I provided would show the package name in the import directory:
from hyperactive.opt.sklearn import ...
There are different taxonomies that one can apply here
I would go with this one:
dependency based, e.g., group all algos from gfo in one place
Because it would make the import less confusing if we have the same algorithm from multiple packages. If we group by dependency, then the import path does the work of distinguishing the same algorithms from one another. It also makes sense from an architecture point of view. If we group the same algorithms in one place the imports to their respective BaseOptimizerPackage-class (like your _BaseGFOadapter) would be all over the place. The import alone would also communicate whether an algorithm comes from a package or is native. This is very important, because it tells the user what to expect. If we group by dependency, it shows the algorithms from different packages as a feature. But if the algorithms are grouped together I would see the following recurring question in our future: "Why do you have the same algorithm multiple times?"
I would do the import like this:
from hyperactive.gradient_free.gfo import HillClimbing, ...
from hyperactive.gradient_free.optuna import ...
from hyperactive.gradient_free.sklearn import GridSearch
from hyperactive.gradient_free import NativeOptimizationAlgorithm
# In the future
from hyperactive.gradient_based.[package_name] import GradientDescent, ...
So the second import directory describes the problem type and the third directory names the package.
How would it "mask" it?
If the current directory is hyperactive.opt, then from sklearn import ... will result in attempting to get it from that module, not the package scikit-learn. It is maybe an edge case, but it is generally good practice not to mask popular packages or Python base modules (e.g., that is why you would not want to call a module warn or similar). It can lead to hard-to-diagnose bugs.
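The masking effect can be reproduced without scikit-learn installed. The sketch below is only an illustration under stated assumptions: the stdlib module colorsys stands in for sklearn, a temporary directory stands in for the hyperactive/opt folder, and the helper name import_with_shadow is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_with_shadow(module_name):
    """Import `module_name` while a same-named local file sits first on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        # A local file that happens to share its name with an installed module.
        Path(tmp, f"{module_name}.py").write_text("ORIGIN = 'local shadow'\n")

        saved = sys.modules.pop(module_name, None)  # force a fresh import
        sys.path.insert(0, tmp)  # mimics running a script from that directory
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(module_name)
            # The real colorsys has no ORIGIN attribute; the local file does.
            return getattr(module, "ORIGIN", "real module")
        finally:
            # Undo all changes so later imports see the real module again.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
            if saved is not None:
                sys.modules[module_name] = saved


result = import_with_shadow("colorsys")
print(result)
```

The local file wins because the directory holding it comes first on sys.path, which is the situation when a script is run from inside a folder that contains a module named sklearn.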
The same issue would arise with optuna - so I wonder whether we should sort it with "bayesian gfo" instead of just giving it the name of the package.
I think it also makes sense from a taxonomical perspective - it would show users more clearly that optuna implements a single optimization algorithm only...
I would go with this one:
dependency based, e.g., group all algos from gfo in one place
But then the sklearn one is misleading, because the optimization algorithm is not actually from sklearn, but only "the sklearn optimizer" isolated as a piece of code.
It also makes sense from an architecture point of view. If we group the same algorithms in one place the imports to their respective BaseOptimizerPackage-class (like your _BaseGFOadapter) would be all over the place.
I think this is a reasonably strong argument, yes. However, the "import surface" would still be localized, it would not be larger, just less "grouped".
But if the algorithms are grouped together I would see the following recurring question in our future: "Why do you have the same algorithm multiple times?"
I think this is a question that we want to get, because then we can answer:
"there are many popular packages which implement a single algorithm - for instance optuna implements a small family of sampling/pruning search algorithms, but you could get X also from scikit-optimize and Y from gfo. There is a lot of duplication in the ecosystem, and algorithms are referred to by brands, not by what is actually the name of the algorithm - this can be very confusing.
We (and only we!) give users a clear overview of where they can get the algorithm they are looking for, and a clear choice of multiple popular backends - all under a single unified API. By the way, we think the gfo implementation is best in cases A, B, C."
Why is this good? Because it makes sense to "de-brand" the algorithms, ultimately what users get is not "optuna" or "scikit-optimize" or "gfo", but it is, e.g., a variant of hill climbing or sampling-pruning, and that should clearly be named.
Okay, I am intrigued by another way to structure the import paths. Could you provide an example of what your suggested import structure could look like?
For practicalities, I will proceed with your suggested taxonomy for now, and use sk for `sklea
We can always change it before release.
Could you provide an example of what your suggested import structure could look like?
Sure! On the top level, we either cluster individual algorithms, or close families of algorithms. With current examples, it could be:
- gridsearch
  - _gridsearch_gfo.py
  - _gridsearch_optuna.py
  - _gridsearch_sk.py
- hillclimbing
  - _hillclimbing_gfo.py
  - _hillclimbing_stochastic_gfo.py
Now, I realize that where this system breaks is with "bundle" interfaces like optuna, where different choices of samplers and pruners make it possible to produce different gfo algorithms, such as grid search and random search, under a single interface. I see two ways to deal with this:
- optuna folder for the bundle
- sample_prune for the family of algorithms, with _sp_optuna.py currently the only member
Based on this folder structure, what would the import look like?
This way we have a redundant "grid_search" in the import:
from hyperactive.opt.grid_search import GfoGrid_search
We also cannot do this:
from hyperactive.opt.grid_search import Gfo
Do you have a suggestion how to do this? If the optimizers are grouped by packages we could avoid this problem.
If the optimizers are grouped by packages we could avoid this problem.
No matter how we group them, I think we should give classes unique names - that is also important since various collection and retrieval utilities assume a unique namespace.
I would name them GridSearchSk, GridSearchGfo, or similar, and optionally, the "best" grid search considered by us, GridSearch. I think we can do this only with a grid search which takes sklearn param grids, due to how "common" this is as input.
Possible imports here would be
# in all cases, this should work
from hyperactive.opt import GridSearchSk, GridSearchGfo  # etc.
# in the "order by algorithm" case
from hyperactive.opt.grid_search import GridSearchSk, GridSearchGfo
No matter how we group them, I think we should give classes unique names
Agreed
If we order by algorithm, this would be my preference:
# in the "order by algorithm" case
from hyperactive.opt.grid_search import GridSearchSk, GridSearchGfo
We have the algo name two times in the import, which is not so clean. But it is better than putting all (>30?) algos into hyperactive.opt.
optionally, the "best" grid search considered by us
Let's not do this. At least not for now. There should not be a "best", because there isn't mathematically. And packages or implementations can vary over time, which might shift what the best would be.
so which ordering do we go with now?
We order by algorithm, as you suggested. The package abbreviations make the class names short and the classes are unique. This should work out fine, agreed?
ok - I will make the changes in https://github.com/SimonBlanke/Hyperactive/pull/121 then
(done, please check)
V5 is released and a solution was found.