Colon character is not allowed in affiliation
See the error in https://github.com/manubot/try-manubot/pull/29. I believe the colon causes the string to be treated as a dict. Adding quotes would likely work, but we may want to support unquoted affiliations that contain a colon.
Is this possibly an issue for other author metadata fields as well?
This is just how the YAML spec works. Note that only a colon followed by a space, as in text: text, causes the line to be treated as a dictionary; text:text would be okay.
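To illustrate the colon-space rule, here is a minimal sketch that assumes the metadata is loaded with PyYAML's `yaml.safe_load`. The affiliation strings are made up for illustration and are not taken from the linked pull request.

```python
import yaml  # PyYAML; assumed here to be the YAML parser in use

# Hypothetical affiliation entries showing the three cases.
metadata_text = """
affiliations:
  - Department of Biology: University of Example
  - Department of Biology:University of Example
  - "Department of Biology: University of Example"
"""

affiliations = yaml.safe_load(metadata_text)["affiliations"]

# Print the Python type each entry was parsed into.
for value in affiliations:
    print(type(value).__name__, repr(value))
```

Only the first entry (unquoted, colon followed by a space) should come back as a dict; the other two stay plain strings.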
From https://github.com/manubot/try-manubot/runs/2964985543#step:7:35
Traceback (most recent call last):
  File "/usr/share/miniconda3/envs/manubot/bin/manubot", line 8, in <module>
    sys.exit(main())
  File "/usr/share/miniconda3/envs/manubot/lib/python3.7/site-packages/manubot/command.py", line 266, in main
    function(args)
  File "/usr/share/miniconda3/envs/manubot/lib/python3.7/site-packages/manubot/process/process_command.py", line 34, in cli_process
    prepare_manuscript(args)
  File "/usr/share/miniconda3/envs/manubot/lib/python3.7/site-packages/manubot/process/util.py", line 289, in prepare_manuscript
    variables = load_variables(args)
  File "/usr/share/miniconda3/envs/manubot/lib/python3.7/site-packages/manubot/process/util.py", line 206, in load_variables
    add_author_affiliations(variables["manubot"])
  File "/usr/share/miniconda3/envs/manubot/lib/python3.7/site-packages/manubot/process/util.py", line 133, in add_author_affiliations
    affiliations = list(dict.fromkeys(affiliations)) # deduplicate
TypeError: unhashable type: 'dict'
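The failing line can be reproduced outside Manubot. This is only a sketch: the list below is a hand-written stand-in for what a YAML parser would return when one affiliation contains an unquoted colon followed by a space, and the names are hypothetical.

```python
# Stand-in for parsed metadata: one plain string and one value that
# was parsed as a dict because of an unquoted "colon space".
affiliations = [
    "University of Example",
    {"Department of Biology": "University of Example"},
]

message = None
try:
    # Same deduplication idiom as in the traceback: dict keys must be hashable.
    deduplicated = list(dict.fromkeys(affiliations))
except TypeError as error:
    message = str(error)
    print(message)
```

The deduplication relies on every value being hashable, so any affiliation parsed as a dict (or a list) fails at this step rather than at parse time.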
Manubot's metadata parsing code should probably recognize that a dict is not supported and issue a warning. Or perhaps we could create a JSON schema for the metadata and fail when it is violated, but with a clearer error message?
Perhaps _convert_field_to_list should ensure the values of the list are strings?
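A rough sketch of what such a check might look like follows. `check_string_items` is a hypothetical helper written for this comment, not existing Manubot code, and the wording of the error message is only a suggestion.

```python
def check_string_items(values, field):
    """Hypothetical check: return the values as a list if all are strings.

    Raises a TypeError naming the field and the offending value otherwise.
    """
    for value in values:
        if not isinstance(value, str):
            raise TypeError(
                f"Expected every value in {field!r} to be a string, but "
                f"{value!r} was parsed as {type(value).__name__}. "
                "Try wrapping the value in quotes in the metadata file."
            )
    return list(values)


# Usage with hypothetical values: strings pass through unchanged.
print(check_string_items(["University of Example"], "affiliations"))
```

Failing early like this would replace the unhashable-type error with a message that points at the field and suggests quoting. Whether to raise or only warn is a separate design choice.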
A good starting point could be to add a line to the docs explaining that the metadata is read by a YAML parser, possibly linking to the YAML docs. I've seen other confusion with metadata when authors add custom fields like "conflicts" and set them to "None", or add a postal code that is parsed as an int. Wrapping metadata values in quotes is a generally useful tip when a value isn't parsed as expected.
Adding a JSON schema or more specific warning messages would also be helpful. Going further and having _convert_field_to_list ensure the list values are strings should catch the most common problems users would face with the default metadata key-value pairs. "affiliations" and "funders" are the two places with free text that could be converted to an unexpected type. The other fields ("name", for example, perhaps) have more standardized values.