LaTeXML
JATS: journal metadata not empty
When generating JATS XML, latexmlc 0.8.8 inserts not-yet-known for journal metadata:
<journal-id>not-yet-known</journal-id>
<issn>not-yet-known</issn>
...
<article-id>not-yet-known</article-id>
I see a number of downsides to this behavior:
- I am not aware of any precedent for automated pipelines understanding not-yet-known. To the extent LaTeXML should be used in automated pipelines that minimize the need for human intervention, this value will probably surface somewhere as, literally, a journal or article repository with the name not-yet-known.
- Leaving these values blank is very likely to have the desired semantics for any reasonably robust reader of JATS: if they are empty, the value is not known. That reading is much more likely than downstream code having not-yet-known hard-coded.
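To illustrate the second point, here is a minimal sketch of how a downstream reader might plausibly handle these elements. It is not taken from any real pipeline: the function name `read_journal_id` and the two sample documents are hypothetical, and only the Python standard library is used. A reader written this way treats an empty element as "unknown" for free, while the placeholder passes through as if it were a real identifier.

```python
import xml.etree.ElementTree as ET


def read_journal_id(jats_xml):
    """Return the journal-id text, or None if the element is missing or empty.

    Hypothetical helper for illustration; real JATS consumers will differ.
    """
    root = ET.fromstring(jats_xml)
    elem = root.find(".//journal-id")
    if elem is None:
        return None
    text = (elem.text or "").strip()
    # An empty string is falsy, so an empty element maps to None ("unknown").
    return text or None


# Two made-up, heavily trimmed JATS fragments.
EMPTY = (
    "<article><front><journal-meta>"
    "<journal-id></journal-id>"
    "</journal-meta></front></article>"
)
PLACEHOLDER = (
    "<article><front><journal-meta>"
    "<journal-id>not-yet-known</journal-id>"
    "</journal-meta></front></article>"
)

print(read_journal_id(EMPTY))        # None
print(read_journal_id(PLACEHOLDER))  # not-yet-known
```

Without a special case for the placeholder string, the second document is indistinguishable from one whose journal really is identified as not-yet-known.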
There was nothing particularly deep about not-yet-known; it was probably just filler because some validation demanded something (the element itself, perhaps, or non-empty content, or ?). I'd thought empty could mean either that the value is unknown or that there isn't a journal-id at all (for example). I'm fine with leaving it empty. The main point will be synchronizing it with however the journal-id is encoded into the LaTeXML XML (if, when, and however it ends up there).