LaTeXML
JATS multiple authors
I wonder whether author handling could be improved by splitting on the comma, as in the proposal below, instead of splitting on the space character " ".
Assuming in LaTeX:
\author{Smith, Joe \and De Los Reyes, Carlos}
This would then give
<contrib-group>
<contrib contrib-type="author">
<name>
<given-names>Joe</given-names>
<surname>Smith</surname>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<given-names>Carlos</given-names>
<surname>De Los Reyes</surname>
</name>
</contrib>
</contrib-group>
I'm not sure whether the LaTeX example is general enough, though (I found this form recommended on TeX Stack Exchange). The current setup instead turns the above into
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Joe</surname>
<given-names>Smith</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Carlos</surname>
<given-names>DeLosReyes</given-names>
</name>
</contrib>
</contrib-group>
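Proposed XSLT (this assumes the EXSLT strings extension, with the str prefix bound to http://exslt.org/strings):
<xsl:template match="ltx:personname">
<name>
<!-- Given names: the text after the last comma, with surrounding whitespace trimmed. -->
<given-names>
<xsl:for-each select="str:tokenize(./text(),',')">
<xsl:if test="position()=last()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:if>
</xsl:for-each>
</given-names>
<!-- Surname: everything before the last comma. -->
<surname>
<xsl:for-each select="str:tokenize(./text(),',')">
<xsl:if test="position()!=last()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:if>
</xsl:for-each>
</surname>
</name>
</xsl:template>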
I found this proposal here: https://tex.stackexchange.com/questions/4805/whats-the-correct-use-of-author-when-multiple-authors
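To make the comma rule easier to discuss, here is a minimal sketch of the same idea outside XSLT. It is not LaTeXML code: the function name split_personname is hypothetical, and it assumes that the last comma-separated field holds the given names and everything before it is the surname. Unlike the template, it returns None when there is no comma at all, since the rule then cannot tell the two parts apart.

```python
def split_personname(text):
    """Split 'Surname, Given' on commas.

    Hypothetical helper for illustration only; not part of LaTeXML.
    Returns (surname, given_names), or None if the text has no comma.
    """
    # Trim each field and drop empty ones (e.g. from a trailing comma).
    parts = [p.strip() for p in text.split(",") if p.strip()]
    if len(parts) < 2:
        # No usable comma: the rule cannot separate surname and given names.
        return None
    given = parts[-1]
    surname = ", ".join(parts[:-1])
    return surname, given


for raw in ["Smith, Joe", "De Los Reyes, Carlos", "Joe Smith"]:
    print(raw, "->", split_personname(raw))
```

The multi-word surname survives intact because nothing is split on spaces, which is the main point of the proposal.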
However, with LaTeXML 0.8.6 and only a single author, as in:
\documentclass{article}
\author{Smith, Joe F.}
\title{Author test}
\begin{document}
\maketitle
A Test.
\end{document}
the trailing dot is removed from the personname in the output below. I believe this was introduced here: https://github.com/brucemiller/LaTeXML/pull/1628
<?xml version="1.0" encoding="UTF-8"?>
<?latexml searchpaths="/home/robert/Work/ems/tex-json"?>
<?latexml class="article"?>
<?latexml RelaxNGSchema="LaTeXML"?>
<document xmlns="http://dlmf.nist.gov/LaTeXML" class="ltx_authors_1line">
<resource src="LaTeXML.css" type="text/css"/>
<resource src="ltx-article.css" type="text/css"/>
<title>Author test</title>
<creator role="author">
<personname>Smith, Joe F</personname>
</creator>
<para xml:id="p1">
<p>A Test.</p>
</para>
</document>
That's probably a separate issue of overly strict sanitization?
Thinking a bit more about the original proposal above: maybe it's not possible to accommodate every way of writing an author in TeX. Maybe using <string-name> is a safer way?
https://jats.nlm.nih.gov/publishing/tag-library/1.1/element/string-name.html
Thinking of names like
Abernathy, the Honorable Sir Edward
Sammy Davis, Jr.
there will probably always be edge cases.
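A rough sketch of what such a fallback could look like, written in Python with the standard library's ElementTree rather than XSLT. Everything here is an assumption for discussion: the helper name contrib_for is hypothetical, and the rule "exactly one comma means Surname, Given" is only a heuristic, not anything LaTeXML does today.

```python
import xml.etree.ElementTree as ET


def contrib_for(text):
    """Build a JATS <contrib> element for one author string.

    Hypothetical helper for illustration only; not part of LaTeXML.
    """
    contrib = ET.Element("contrib", {"contrib-type": "author"})
    parts = [p.strip() for p in text.split(",")]
    if len(parts) == 2 and all(parts):
        # Exactly "Surname, Given": emit the structured form.
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = parts[0]
        ET.SubElement(name, "given-names").text = parts[1]
    else:
        # Anything else: keep the author's text intact in <string-name>.
        ET.SubElement(contrib, "string-name").text = text.strip()
    return contrib


print(ET.tostring(contrib_for("Smith, Joe"), encoding="unicode"))
print(ET.tostring(contrib_for("Joe Smith"), encoding="unicode"))
```

Note that both names above contain exactly one comma, so this heuristic would still put "Sammy Davis" into the surname and "Jr." into the given names. Only an unconditional <string-name> avoids that kind of misparse.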
Yeah, doing it in XSLT is already too late. JATS should really be working with a more BibTeX-close form, rather than trying to reverse engineer the formatted bibitem. That form does exist, if only momentarily, within LaTeXML's MakeBibliography, but (as does BibTeX) it conflates the extraction of needed bib entries with their formatting, so they never get exposed to the JATS stylesheet.
There's a complex PR #1231 which I'll be working on soon(!) and I hope that I can address preserving both the formatted & semantic forms in the process. This should allow improving the JATS bibliographies.
Oh yes, with the bibitems it's probably even more complex; I was just thinking about the \author entries.
Well, the authors are the most egregiously wrong part of LaTeXML's output :> But I assume that in the long run, JATS wants as much of the semantic metadata as we can supply.