LaTeXML
hyperref and hyperxmp metadata improvements
Add the remaining \hypersetup keywords, and add xml:lang if pdfmetalang or pdflang is specified (LaTeXML should probably also use pdflang as the document language if it is not already specified elsewhere). Note that this also fixes the wrong mapping of pdfsubject: it should be dcterms:description, not dcterms:subject.
The metadata could then be used elsewhere (e.g. JATS #2354, EPUB), but this PR is only about preserving the data in the XML output.
Note that the implementation is incomplete: certain properties (such as authors) are implemented as lists by hyperxmp (<rdf:Seq><rdf:li>..., sometimes <rdf:Bag><rdf:li>..., or <rdf:Alt><rdf:li> for alternative languages). I haven't touched any of that as LaTeXML implements RDF essentially as strings.
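For illustration, here is a minimal sketch of the kind of input this PR is concerned with. All values are invented placeholders, and the selection of keys is only an assumption about what a typical author might set; it is not the test document for this PR.

```latex
% Hypothetical sample: every value below is a placeholder.
\documentclass{article}
\usepackage{hyperref}
\usepackage{hyperxmp}
\hypersetup{
  pdftitle={A Sample Document},
  pdfauthor={Jane Doe},
  % Per the description above, this should end up as
  % dcterms:description rather than dcterms:subject.
  pdfsubject={A short description of the document},
  pdfkeywords={metadata, XMP},
  % Either of these should cause xml:lang to be emitted.
  pdflang={en},
  pdfmetalang={en},
  pdfcopyright={Copyright (C) 2024 Jane Doe},
  pdflicenseurl={https://creativecommons.org/licenses/by/4.0/}
}
\begin{document}
Hello, metadata.
\end{document}
```

Converting a file like this to XML and inspecting the resulting RDF properties should be enough to check the mappings by eye.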
This all looks plausible; I don't use XMP myself, but this should probably help those who do. @dginev do you have any thoughts on this? (other than not being an XMP fan, as I recall :> )
I have a classic comment: it would be nice to have a test, which will have the additional effect of self-documenting how these XMP enhancements can be used by authors. I haven't used hyperxmp.sty myself, so I trust @xworld21 knows more than me here.
I am also new to PRISM, but so far I find it palatable - there is a W3C Member Submission with a reasonable W3C Team Comment.
One observation from quickly skimming it: the PRISM Namespaces section lists the URI added in this PR under the basic: prefix, and there appear to be about 10 other PRISM-related namespace URIs. Maybe we want to qualify the prefix in LaTeXML as prismbasic: / pbasic: or similar?
> it would be nice to have a test
Excellent idea, and it's making me discover details I'd missed (e.g. only some properties respect pdfmetalang, others support a language by prefixing [en]).
Is it OK if I include the sample document from the hyperxmp documentation? Would there be any license issues with that?