Orion
Pipeline and hyperparameter structure improvement
Here is a suggestion to improve how pipeline hyperparameters are stored. Currently we have many files of the form pipeline_name/pipeline_name_dataset.json that record per-dataset hyperparameter changes to a pipeline for benchmarking. To reduce the number of files in each pipeline folder, we suggest the following:
Under a particular pipeline folder (pipeline_name/), we can have a benchmark-meta.json file that lists all the datasets and their corresponding hyperparameters. The hyperparameters for each dataset can be given as either:
- a dictionary mapping each primitive to its hyperparameter changes, or
- a path to a file that contains the hyperparameter changes; this is useful when a complex set of hyperparameters has to change.
For example:
{
    "datasets": [
        {
            "name": "artificialwithanomaly",
            "hyperparameters": {
                "mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
                    "interval": 600
                }
            }
        },
        {
            "name": "smap",
            "path": "./orion/pipelines/verified/pipeline_name/pipeline_name_smap.json"
        }
    ]
}
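To make the two entry styles concrete, here is a minimal Python sketch of how a benchmark-meta.json file could be read. The helper name load_dataset_hyperparameters is hypothetical and not part of the Orion API. The sketch also assumes that a "path" entry is opened as written, relative to the current working directory, and that a dataset with no entry simply gets no overrides.

```python
import json


def load_dataset_hyperparameters(meta_path, dataset_name):
    """Return the hyperparameter overrides for one dataset.

    Hypothetical helper (not part of Orion). Returns an empty dict
    when the dataset has no entry in the meta file.
    """
    with open(meta_path) as meta_file:
        meta = json.load(meta_file)

    for entry in meta.get("datasets", []):
        if entry.get("name") != dataset_name:
            continue

        # Style 1: the overrides are given inline as a dictionary.
        if "hyperparameters" in entry:
            return entry["hyperparameters"]

        # Style 2: the overrides live in a separate JSON file.
        # Assumption: the path is used as written, so a relative path
        # resolves against the current working directory.
        if "path" in entry:
            with open(entry["path"]) as hyper_file:
                return json.load(hyper_file)

    # Assumption: no entry means the pipeline defaults are used unchanged.
    return {}
```

With this shape, benchmarking code would only need the pipeline folder and the dataset name to find the overrides, regardless of whether they are stored inline or in a separate file.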
Pipelines and templates
Technically, the Orion API does not differentiate between pipelines and templates. It can be confusing to display them as pipelines while storing them only as templates. For the lstm_dt and lstm pipelines (dynamic threshold and fixed threshold, respectively), I suggest having two pipeline folders to eliminate the confusion. This will result in:
- lstm_dt folder
- lstm folder