Bug with SNTYPES
When running from the terminal, the dictionary passed to --sntypes has to specify all types that are defined in the preset, even those that are not present in the dataset.
A minimal example leading to the issue:
snn make_data --raw_dir <path/to/raw/data> --dump_dir <path/to/dump> --sntypes '{"112":"Ib/c", "113":"Ia"}'
In this example, <path/to/raw/data> contains the raw files, which themselves only contain the two classes 112 and 113.
The output will contain the following:
[Missing sntypes] ['111' '115' '212' '214' '131' '221' '114' '132' '134' '133' '213' '211' '123' '124'] binary tagged as class 1
Further down the line, training fails with a ValueError. That is, the following call breaks:
snn train_rnn --raw_dir <path/to/raw/data> --dump_dir <path/to/dump> --sntypes '{"112":"Ib/c", "113":"Ia"}'
Database creation as well as training work when passing the following --sntypes:
--sntypes '{"112":"Ib/c","113":"Ia","111":"NaN","115":"NaN","212":"NaN","214":"NaN","131":"NaN","221":"NaN","114":"NaN","132":"NaN","134":"NaN","133":"NaN","213":"NaN","211":"NaN","123":"NaN","124":"NaN"}'
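Typing the padded dictionary by hand is error-prone, so here is a minimal sketch of how it could be generated from the "[Missing sntypes]" line shown above. The helpers `parse_missing_sntypes` and `pad_sntypes` are hypothetical names for illustration and are not part of SuperNNova; the sketch only assumes that the log line has the format printed in this issue.

```python
import json
import re


def parse_missing_sntypes(log_line):
    """Extract the quoted numeric type IDs from a '[Missing sntypes] [...]' log line."""
    return re.findall(r"'(\d+)'", log_line)


def pad_sntypes(present, missing, filler="NaN"):
    """Return a copy of `present` with every missing type ID mapped to `filler`."""
    padded = dict(present)
    for type_id in missing:
        # setdefault keeps the label of any type that is already defined
        padded.setdefault(type_id, filler)
    return padded


# Log line copied from the make_data output in this issue
log_line = (
    "[Missing sntypes] ['111' '115' '212' '214' '131' '221' '114' '132' "
    "'134' '133' '213' '211' '123' '124'] binary tagged as class 1"
)

# The two classes that are actually present in the raw data
present = {"112": "Ib/c", "113": "Ia"}

sntypes = pad_sntypes(present, parse_missing_sntypes(log_line))

# Compact JSON so the value can be pasted after --sntypes on the command line
cli_value = json.dumps(sntypes, separators=(",", ":"))
print(f"--sntypes '{cli_value}'")
```

The resulting `sntypes` dictionary has 16 entries, and the printed value can be passed to both make_data and train_rnn.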
Additionally, when constructing the dataset in Python, all classes have to be defined explicitly as well, otherwise the code breaks:
from supernnova import conf
from supernnova.data import make_dataset

command_arg = "make_data"
args = conf.get_args(command_arg)
args.dump_dir = "<path/to/dump>"
args.raw_dir = "<path/to/raw/data>"
# NOTE: all classes need to be defined explicitly!
args.sntypes = {
    "112": "Ib/c", "113": "Ia",
    "115": "NaN", "111": "NaN", "212": "NaN",
    "221": "NaN", "214": "NaN", "124": "NaN",
    "131": "NaN", "211": "NaN", "132": "NaN",
    "114": "NaN", "134": "NaN", "213": "NaN",
    "123": "NaN", "133": "NaN", "122": "NaN",
}
settings = conf.get_settings(command_arg, args=args)
make_dataset.make_dataset(settings)
Thanks @TheRedElement. I can't reproduce the error. Can you provide the raw data file? Thanks.