Custom functions: e.g. custom metrics
Adds generic support for framework extensions, with an example integration for custom metrics.
Extensions are loaded by default from `{user}/extensions.py`.
Users can define as many extension files as they want in their `config.yaml`, e.g.:
```yaml
extensions_files:
  - '{user}/autosklearn_extensions.py'
  - '{user}/autogluon_extensions.py'
  - '{user}/tpot_extensions.py'
  - '{user}/extensions.py'
```
A specific variable, function, or class defined in those files can be loaded by name from the framework integration using:
```python
from frameworks.shared.callee import get_extension
...
ext = get_extension(config.extensions, 'foo')
```
The first name that can be loaded successfully is returned.
For example, if 'foo' is defined in all the extension files above and we're running TPOT, the first two files may fail to load in the TPOT integration if they import the `autogluon` or `autosklearn` modules; however, loading `{user}/tpot_extensions.py` should succeed, so if there's a variable/function named 'foo' there, it will be returned, otherwise the lookup continues into the last file.
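For reference, a minimal sketch of what this lookup could look like (an illustration of the semantics described above, not the actual amlb implementation):

```python
import importlib.util
import os


def get_extension(extension_files, name, default=None):
    """Return the first object named `name` found in `extension_files`.

    Files that fail to import (e.g. because they depend on a module that
    is not installed in the current venv) are skipped, so the lookup
    falls through to the next file.
    """
    for path in extension_files:
        try:
            module_name = os.path.splitext(os.path.basename(path))[0]
            spec = importlib.util.spec_from_file_location(module_name, path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)  # may raise if deps are missing
        except Exception:
            continue  # this file can't be loaded in the current venv
        if hasattr(module, name):
            return getattr(module, name)
    return default
```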
This PR shows how this can be used to inject custom metrics in frameworks like `autogluon`, `autosklearn`, and TPOT: https://github.com/openml/automlbenchmark/issues/127
For example, if the user overrides the default metrics in their `config.yaml`:
```yaml
benchmarks:
  metrics:
    multiclass: ['foo', 'logloss']
```
then it is possible to define the 'foo' custom metric in the extension files, and it will be used by frameworks that accept it.
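For illustration, the 'foo' metric could then be defined in one of the extension files like this (a sketch; 'foo' is just the placeholder name from the config above, and the exact signature each framework expects may differ):

```python
from sklearn.metrics import f1_score


def foo(y_true, y_pred):
    """Hypothetical custom metric: macro-averaged F1 score."""
    return f1_score(y_true, y_pred, average='macro')
```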
TODO: add description to HOWTO documentation.
It looks like the metric is not reported in the results, so I set up #173. Other than that, as far as I can tell it looks good (only tested with TPOT though). The documentation (the message in the PR) was good enough for me to set up my custom metric*.
* I did run into a small issue where the amlb framework provides `y` as an (N,1)-vector, which the TPOT custom metric example could not deal with. The subsequent error from TPOT (incorrect data) was then a bit puzzling, considering that the baked-in scikit-learn metrics work (presumably because they strip the second dimension automatically if required).
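A small guard inside the custom metric works around that shape issue (a sketch; the `.ravel()` calls are the fix, the scoring itself is illustrative):

```python
import numpy as np
from sklearn.metrics import accuracy_score


def foo(y_true, y_pred):
    # Flatten possible (N,1) inputs to (N,) before scoring, mirroring
    # what scikit-learn's built-in metrics effectively tolerate.
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    return accuracy_score(y_true, y_pred)
```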
Comment by seb from #173:
There's a difficulty here. If we take TPOT as an example, the training and the result processing are not done in the same process/venv. This means that if `tpot_extensions.py` has a dependency on the `tpot` module, we will be able to load the extension in the training process, but it will fail in the one processing the results...
The lookup logic suggested in #141 is then probably too simple. Maybe we need something like:
```yaml
extensions_files:
  autosklearn: '{user}/autosklearn_extensions.py'
  autogluon: '{user}/autogluon_extensions.py'
  tpot: ['{user}/tpot_extensions.py', '{user}/sklearn_extensions.py']
  default: '{user}/extensions.py'
```
Now let's imagine that `tpot_extensions.py` can only be loaded in the TPOT venv. Then we can still provide extensions for the results that require only sklearn: to report the custom metric in the results, the requirement is that it relies only on sklearn, numpy, pandas, scipy..., i.e. the dependencies available in the main process.
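To make the idea concrete, resolving such a per-framework mapping might look like this (an assumption on top of the proposal above, not existing amlb code):

```python
def resolve_extension_files(extensions_files, framework):
    """Return the list of extension files to try for `framework`.

    `extensions_files` may be a plain list (the #141 behaviour) or a
    mapping from framework name to file(s) with a 'default' fallback.
    """
    if isinstance(extensions_files, list):
        return extensions_files
    files = extensions_files.get(framework.lower(),
                                 extensions_files.get('default', []))
    return files if isinstance(files, list) else [files]
```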
I think you can still merge this PR; I will probably consider this when improving #141, though.
Is this a thing we expect to worry about in practice? I suppose automl packages could have their own metrics predefined to import, but if that is the case a scikit-learn fallback would not be enough either.
@PGijsbers
> I suppose automl packages could have their own metrics predefined to import, but if that is the case a scikit-learn fallback would not be enough either.
I was trying to make the distinction between the custom metric used by the framework for optimization (this one may be provided by the automl package) and the one that we compute ex post from the `predictions.csv` that is used to compare with other frameworks. For the latter, the user should be able to implement it with only the core libraries; it's just a math formula, after all.
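For instance, computing the comparison metric ex post could be as simple as the following (a sketch; the column names of `predictions.csv` are assumptions for illustration):

```python
import pandas as pd
from sklearn.metrics import f1_score

# Hypothetical layout: one row per test instance, with the true label
# and the framework's prediction side by side.
df = pd.read_csv('predictions.csv')
print(f1_score(df['truth'], df['predictions'], average='macro'))
```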
> For the latter, the user should be able to implement it with only the core libraries; it's just a math formula, after all.
Exactly. And this is what the current implementation already facilitates as far as I am aware?