automlbenchmark
Add support for NNI
AutoML toolkit for hyperparameter tuning, NAS and model compression: https://github.com/Microsoft/nni
Thanks for the pointer! Just had a brief look and it seems very focused on NAS. It does mention scikit-learn support; how suitable do you think the framework is for the tabular and structured data we feature in our benchmark? It seems some configuration is required (a search space, writing a `run_trial` function and a `config` yaml file). Are there ready-to-use presets available?
Yes, you are right. A search space and configuration files are required; there are no ready-to-use presets. Instead of using the default settings, I'm thinking the automlbenchmark may need to support hyperparameter tuning later. NNI might be a good fit by then.
Okay, thanks :) In that case I'll just note down that we should document the framework somewhere, so that we can refer to it later if/when we do wish to expand our scope.
Sounds good, thanks @PGijsbers.
@PGijsbers is this something that still needs work? I have spent some time working with NNI and would be happy to do a pull request with the datasets mentioned. I haven't done them yet, but I could do it over the upcoming holidays?
Thanks for the offer @setuc! Unfortunately it is not quite clear to me what you are proposing here; is it one of these:
- Create some out-of-the-box functionality for NNI so that it will search through solutions for tabular/structured data? I would propose this at the NNI repository. If this functionality is wanted there and integrated, we could add NNI to the AutoML benchmark.
- Set up NNI to work as an AutoML framework within our benchmark, by developing and configuring the required search space, configuration, etc.? I don't think we are interested in having that at this moment. We want to evaluate AutoML systems as they are out of the box. Specifying a search space and configuration is quite technical and actually a big part of the development of an AutoML system. We think pinning down any one specific search space/strategy would not reflect the performance of NNI, so doing that on the AutoML benchmark side would be wrong, I think.
- Documenting the framework, its intention and/or strengths and weaknesses for later reference? That would be a great contribution to have :) If you have spent some time with the framework, it would be very helpful to write a brief run-down, in particular with an example of how it could be used to solve the type of datasets we currently feature in the benchmark.
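For later reference, here is a rough sketch of the configuration NNI asks for, as discussed above: a search space file plus a config yaml pointing at a trial script. All file names, hyperparameter names, and values below are illustrative placeholders, not a tested setup; the search-space `_type`/`_value` structure and the yaml fields follow NNI's documented experiment format.

```json
{
  "n_estimators": {"_type": "choice", "_value": [100, 200, 500]},
  "max_depth": {"_type": "randint", "_value": [3, 11]},
  "learning_rate": {"_type": "loguniform", "_value": [0.001, 0.1]}
}
```

```yaml
# config.yml — hypothetical NNI experiment definition for a tabular task
authorName: default
experimentName: tabular_example
trainingServicePlatform: local
searchSpacePath: search_space.json
useAnnotation: false
trialConcurrency: 1
maxTrialNum: 50
tuner:
  builtinTunerName: TPE
trial:
  command: python3 run_trial.py   # user-written trial script (see note below)
  codeDir: .
  gpuNum: 0
```

The user-written trial script (`run_trial.py` here, a name chosen for illustration) would fetch the sampled hyperparameters with `nni.get_next_parameter()`, train a scikit-learn model on the tabular dataset, and report the validation score back with `nni.report_final_result()`. This is exactly the per-task setup work the thread argues the benchmark should not do on NNI's behalf.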