
Pipeline and hyperparameter structure improvement

Open sarahmish opened this issue 3 years ago • 0 comments

Here is a suggestion to improve how pipeline hyperparameters are stored. Currently we have many files named in the style pipeline_name/pipeline_name_dataset.json, each recording the hyperparameter changes to a pipeline for the purpose of benchmarking. To reduce the number of files in each pipeline folder, we suggest the following:

Under each pipeline folder (pipeline_name/), we can have a benchmark-meta.json file that lists all the datasets and their corresponding hyperparameters. The hyperparameters for a dataset can be defined as either:

  • a dictionary denoting the primitive and its hyperparameter changes
  • a path to a file that contains the hyperparameter changes, which is useful when a complex set of hyperparameters needs to be changed.

See the example below:

{
    "datasets": [
        {
            "name": "artificialwithanomaly",
            "hyperparameters": {
                "mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
                    "interval": 600
                }
            }
        },
        {
            "name": "smap",
            "path": "./orion/pipelines/verified/pipeline_name/pipeline_name_smap.json"
        }
    ]
}
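As a rough illustration of how such a file could be consumed, here is a minimal Python sketch. The function name load_benchmark_hyperparameters and the base_dir argument are hypothetical and not part of the Orion API. The sketch assumes that "path" entries are resolved relative to base_dir and that a dataset with neither key uses the pipeline defaults.

```python
import json
from pathlib import Path


def load_benchmark_hyperparameters(meta_path, base_dir="."):
    """Map each dataset name in benchmark-meta.json to its hyperparameter changes.

    Hypothetical helper: `base_dir` is an assumed root for resolving "path" entries.
    """
    meta = json.loads(Path(meta_path).read_text())
    resolved = {}
    for entry in meta["datasets"]:
        if "hyperparameters" in entry:
            # Inline dictionary: primitive name -> hyperparameter changes.
            resolved[entry["name"]] = entry["hyperparameters"]
        elif "path" in entry:
            # Separate file holding a more complex set of changes.
            file_path = Path(base_dir) / entry["path"]
            resolved[entry["name"]] = json.loads(file_path.read_text())
        else:
            # Assumption: no entry means no changes to the pipeline defaults.
            resolved[entry["name"]] = {}
    return resolved
```

With the example above, the result would map "artificialwithanomaly" to the inline dictionary and "smap" to whatever the referenced JSON file contains.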

Pipelines and templates

Technically, the Orion API does not differentiate between pipelines and templates. It might even get confusing if we display them as pipelines but store them only as templates. For the lstm_dt and lstm pipelines (dynamic threshold and fixed threshold, respectively), I suggest having two pipeline folders to eliminate confusion. This will result in:

  • lstm_dt folder
  • lstm folder

sarahmish, Mar 04 '21 19:03