neuralmonkey
Series IDs somehow get lowercased during configuration loading
I found no place in our code that could cause it. I guess it is the INI parser, which pre-processes the field names.
Do you mean that s_UP=... is interpreted as s_up=...?
From Python's ConfigParser documentation:
As we can see above, the API is pretty straightforward. The only bit of magic involves the DEFAULT section which provides default values for all other sections [1]. Note also that keys in sections are case-insensitive and stored in lowercase [1].
https://docs.python.org/3/library/configparser.html#quick-start
I suggest documenting this and leaving it as it is.
If we decide we want to allow case-sensitive keys, there is an option: https://docs.python.org/3/library/configparser.html#configparser.optionxform
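To make the behaviour concrete, here is a minimal sketch using only the standard library. The section name `dataset` and the path are made up for illustration; only `ConfigParser` and `optionxform` come from the documentation linked above.

```python
import configparser

# Hypothetical config fragment; the section name and path are illustrative only.
INI_TEXT = """
[dataset]
s_UP = data/up.txt
"""


def load_keys(preserve_case):
    """Return the option names of the [dataset] section as the parser stores them."""
    parser = configparser.ConfigParser()
    if preserve_case:
        # optionxform is applied to every option name; the default lowercases it.
        # Replacing it with str keeps the name exactly as written.
        parser.optionxform = str
    parser.read_string(INI_TEXT)
    return list(parser["dataset"].keys())


default_keys = load_keys(preserve_case=False)
preserved_keys = load_keys(preserve_case=True)
print(default_keys)    # ['s_up']
print(preserved_keys)  # ['s_UP']
```

Note that `optionxform` has to be set before the file is read, since the transformation happens while parsing.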
Or we could abandon (or at least deprecate) the **kwargs hack in load_dataset_from_files and use something like series_in=[("UpperCase", "path/to/file"), ...] and series_out=[("UpperCaseOut", "path/to/file"), ...].
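A rough sketch of why this would sidestep the problem: the parser only lowercases option names, not values, so series names stored inside a value keep their case. Parsing the value with `ast.literal_eval` is purely an assumption for this example and is not how our config loader necessarily handles values; the section name is also made up.

```python
import ast
import configparser

# Hypothetical config fragment: the series names live in the values, not in the keys.
INI_TEXT = """
[dataset]
series_in = [("UpperCase", "path/to/file")]
series_out = [("UpperCaseOut", "path/to/file")]
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Assumption: values are written as Python literals, so literal_eval can read them.
series_in = ast.literal_eval(parser["dataset"]["series_in"])
series_out = ast.literal_eval(parser["dataset"]["series_out"])

# The names come back with their original case.
series_names = [name for name, _ in series_in + series_out]
print(series_names)  # ['UpperCase', 'UpperCaseOut']
```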
We could also extend the INI syntax to allow line breaks inside brackets, so this can be written in an elegant way.
If you are really willing to extend the syntax in this way, I would prefer getting rid of the **kwargs like this; it would also come in handy in other situations, such as listing postprocessors. Otherwise, I would stick to the current state, since the in-line lists would only make the config clumsy.
Solution to this issue: fix the error messages so that they list the defined series IDs:
Error: unknown series ID: "SoUrCe". Possible IDs are: ["source", "target"]
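A minimal sketch of such a check, assuming the series are kept in a dict keyed by ID. The function name `get_series` and the choice of `ValueError` are hypothetical and not taken from the existing code.

```python
def get_series(series, series_id):
    """Look up a series by ID and list the defined IDs if the lookup fails.

    `series` is assumed to be a dict mapping series IDs to their data.
    """
    if series_id not in series:
        # Sort the IDs so the message is stable regardless of insertion order.
        known = ", ".join('"{}"'.format(name) for name in sorted(series))
        raise ValueError(
            'unknown series ID: "{}". Possible IDs are: [{}]'.format(series_id, known)
        )
    return series[series_id]


dataset = {"source": ["a b c"], "target": ["x y z"]}
try:
    get_series(dataset, "SoUrCe")
except ValueError as err:
    message = str(err)
    print("Error:", message)
```

With the lowercased keys shown in the message, it becomes obvious to the user that the ID they wrote in the config was normalized.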