Update the formatting for W3C BibXML files
Currently, if I use the BibXML service to cite a W3C document, I receive the following output:
[W3C.REC-xml-20060816]
Maler, E., Ed., Yergeau, F., Ed., Paoli, J., Ed.,
Sperberg-McQueen, M., Ed., and T. Bray, Ed., "Extensible
Markup Language (XML) 1.0 (Fourth Edition)", W3C REC REC-
xml-20060816, W3C REC-xml-20060816, 16 August 2006,
<https://www.w3.org/TR/2006/REC-xml-20060816/>.
With the content of the BibXML looking something like this:
<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
<front>
<title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
<author fullname="Eve Maler" role="editor"/>
<author fullname="François Yergeau" role="editor"/>
<author fullname="Jean Paoli" role="editor"/>
<author fullname="Michael Sperberg-McQueen" role="editor"/>
<author fullname="Tim Bray" role="editor"/>
<date day="16" month="August" year="2006"/>
</front>
<seriesInfo name="W3C REC" value="REC-xml-20060816"/>
<seriesInfo name="W3C" value="REC-xml-20060816"/>
</reference>
We (the RPC) are updating our style guidance for references and would like the output of W3C references to follow something like this template for a specific version of a W3C Recommendation:
<reference anchor="cite-tag" target="URL of specific version">
<front>
<title>Title</title>
<author initials="" surname="" fullname="">
<organization />
</author>
<date day="" month="" year="" />
</front>
<refcontent>W3C Recommendation</refcontent>
<annotation>Latest version available at <eref target="URL" brackets="angle"/>.</annotation>
</reference>
This would remove the two instances of <seriesInfo> and replace them with a <refcontent> element containing the type of W3C document being referenced.
This change would also add an <annotation> element that would include the "Latest version" URL.
Please let us know if this is doable and if you need more info/clarification.
Thanks!
How do you compile the latest version?
We use the latest version URL provided by W3C. For example: https://www.w3.org/TR/xml/
So using that XML template for the W3C.REC-xml-20060816 example should look something like this:
<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
<front>
<title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
<author fullname="Eve Maler" role="editor"/>
<author fullname="François Yergeau" role="editor"/>
<author fullname="Jean Paoli" role="editor"/>
<author fullname="Michael Sperberg-McQueen" role="editor"/>
<author fullname="Tim Bray" role="editor"/>
<date day="16" month="August" year="2006"/>
</front>
<refcontent>W3C Recommendation</refcontent>
<annotation>Latest version available at <eref target="https://www.w3.org/TR/xml/" brackets="angle"/>.</annotation>
</reference>
Which should produce something like this:
[W3C.REC-xml-20060816]
Maler, E., Ed., Yergeau, F., Ed., Paoli, J., Ed.,
Sperberg-McQueen, M., Ed., and T. Bray, Ed., "Extensible
Markup Language (XML) 1.0 (Fourth Edition)", W3C
Recommendation, 16 August 2006,
<https://www.w3.org/TR/2006/REC-xml-20060816/>. Latest
version available at <https://www.w3.org/TR/xml/>.
Apologies, just realized you may have been referring to references like this:
<reference anchor="W3C.xml" target="https://www.w3.org/TR/xml/">
<front>
<title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title>
<author/>
</front>
<seriesInfo name="W3C REC" value="xml"/>
<seriesInfo name="W3C" value="xml"/>
</reference>
In this case we would update like this:
<reference anchor="W3C.xml" target="https://www.w3.org/TR/xml/">
<front>
<title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title>
<author/>
</front>
<refcontent>W3C Recommendation</refcontent>
</reference>
However, this isn't ideal, as it still carries a specific version's title, that is:
<title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title>
Most likely, we wouldn't use that type of reference and would instead use the versioned reference with a "latest version" annotation.
For the example W3C.REC-xml-20060816:
This is the source file (in relaton): https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/rec-xml-20060816.yaml
This doesn't have any record of the latest URL, so this will require a change in the relaton-w3c gem, or there should be a way to derive the latest URL from the existing relaton data file.
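One possible way to derive it, as a rough sketch only: this assumes the versioned URL always has the shape https://www.w3.org/TR/<year>/<STATUS>-<shortname>-<YYYYMMDD>/ and that the latest version lives at https://www.w3.org/TR/<shortname>/. Both are assumptions based on the single example in this issue, not something relaton or W3C guarantees, and `derive_latest_url` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed shape of a versioned URL (not guaranteed by W3C):
#   https://www.w3.org/TR/<year>/<STATUS>-<shortname>-<YYYYMMDD>/
_VERSIONED_TR = re.compile(
    r"^https?://www\.w3\.org/TR/\d{4}/[A-Za-z]+-(?P<shortname>.+)-\d{8}/?$"
)


def derive_latest_url(versioned_url: str) -> Optional[str]:
    """Guess the "latest version" URL from a versioned /TR/ URL.

    Returns None when the URL does not match the assumed shape, so the
    caller can skip the annotation instead of emitting a wrong link.
    """
    match = _VERSIONED_TR.match(versioned_url)
    if match is None:
        return None
    return f"https://www.w3.org/TR/{match.group('shortname')}/"


latest = derive_latest_url("https://www.w3.org/TR/2006/REC-xml-20060816/")
```

For the example above this yields https://www.w3.org/TR/xml/. A document whose shortname changed between versions would get a wrong guess, so a change in the source data would still be the more reliable option.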
For the example W3C.xml:
This is the relaton source file: https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/xml.yaml
I believe that title is coming from W3C, so we can't change it.
We could implement some fuzzy logic to remove text matching "( * Edition)", but that can produce wrong results.
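To make the trade-off concrete, here is a minimal sketch of what such fuzzy logic could look like. The pattern and the `strip_edition` name are hypothetical and only illustrate the idea; this is not what the BibXML service does today.

```python
import re

# Hypothetical pattern: a trailing parenthetical that ends in "Edition",
# e.g. " (Fifth Edition)". Other parentheticals such as "(XML)" are kept.
_EDITION_SUFFIX = re.compile(r"\s*\([^()]*Edition\)\s*$")


def strip_edition(title: str) -> str:
    """Remove a trailing "( * Edition)" marker from a title, if present."""
    return _EDITION_SUFFIX.sub("", title)


cleaned = strip_edition("Extensible Markup Language (XML) 1.0 (Fifth Edition)")
# A title that states the edition without parentheses is left untouched,
# which is one way this kind of matching gives inconsistent results.
missed = strip_edition("XML Schema Part 1: Structures Second Edition")
```

The first call drops the edition marker, while the second returns the title unchanged, so unversioned references would end up formatted inconsistently depending on how each title happens to be written.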
For the example W3C.REC-xml-20060816:
This is the source file (in relaton): https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/rec-xml-20060816.yaml
This doesn't have any record of the latest URL, so this will require a change in the relaton-w3c gem, or there should be a way to derive the latest URL from the existing relaton data file.
Did some digging on the background for why we originally wanted to make this change:
W3C used to have guidance recommending the use of the "Latest version..." URL for these types of docs, but it appears that they no longer have that guidance up on their site.
So it's probably not necessary to include that annotation; if anything, it's a "nice to have", but I think this format should work as well:
<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
<front>
<title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
<author fullname="Eve Maler" role="editor"/>
<author fullname="François Yergeau" role="editor"/>
<author fullname="Jean Paoli" role="editor"/>
<author fullname="Michael Sperberg-McQueen" role="editor"/>
<author fullname="Tim Bray" role="editor"/>
<date day="16" month="August" year="2006"/>
</front>
<refcontent>W3C Recommendation</refcontent>
</reference>
The most important thing is replacing the two extraneous seriesInfo elements in what is currently in BibXML. That is:
<seriesInfo name="W3C REC" value="REC-xml-20060816"/>
<seriesInfo name="W3C" value="REC-xml-20060816"/>
with the refcontent element containing the type of W3C doc being referenced (i.e., the "W3C Recommendation" seen in the example above).
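As an illustration of that replacement, here is a small sketch using Python's standard xml.etree.ElementTree. It is only meant to show the shape of the change under the assumptions in this issue; `replace_series_info` is a hypothetical helper, not an existing function in bibxml-service, and the sample reference is trimmed to one author for brevity.

```python
import xml.etree.ElementTree as ET


def replace_series_info(reference: ET.Element, doc_type: str) -> ET.Element:
    """Drop every <seriesInfo> child and add a single <refcontent> instead."""
    children = list(reference)
    series = [child for child in children if child.tag == "seriesInfo"]
    # Put <refcontent> where the first <seriesInfo> was, or at the end.
    index = children.index(series[0]) if series else len(children)
    for element in series:
        reference.remove(element)
    refcontent = ET.Element("refcontent")
    refcontent.text = doc_type
    refcontent.tail = "\n"
    reference.insert(index, refcontent)
    return reference


SAMPLE = """<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
<front>
<title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
<author fullname="Eve Maler" role="editor"/>
<date day="16" month="August" year="2006"/>
</front>
<seriesInfo name="W3C REC" value="REC-xml-20060816"/>
<seriesInfo name="W3C" value="REC-xml-20060816"/>
</reference>"""

reference = replace_series_info(ET.fromstring(SAMPLE), "W3C Recommendation")
print(ET.tostring(reference, encoding="unicode"))
```

The anchor, target, and front element are left untouched; only the seriesInfo children are swapped for one refcontent element.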
For the example W3C.xml:
This is the relaton source file: https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/xml.yaml
I believe that title is coming from W3C, so we can't change it.
But we could implement some fuzzy logic to remove text matching ( * Edition), but that can produce wrong results.
If that's the case, don't worry too much about these versions of the W3C references. I don't think they will be used much (if at all), since I'm typically going to encourage the use of the versioned references.
@tedharrison-rpc Removing the seriesInfo elements and replacing them with refcontent can be done in the BibXML service.
I assume W3C REC (docstatus: recommendation) will be W3C Recommendation.
Are there any other values that can be expected for refcontent?
I assume W3C REC (docstatus: recommendation) will be W3C Recommendation.
That's correct.
Are there any other values that can be expected for refcontent?
From a search of the repo I found the following:
W3C Working Draft = workingDraft
W3C Candidate Recommendation = candidateRecommendation
W3C Proposed Recommendation = proposedRecommendation
W3C Proposed Edited Recommendation = proposedEditedRecommendation
These match W3C document types found here. These are also the most commonly cited, but if I notice any others that are missing I'll make sure to bring them up.
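Collected in one place, the values mentioned so far could be expressed as a simple lookup. This is just a sketch of the mapping discussed in this thread; `REFCONTENT_BY_DOCSTATUS` and `refcontent_for` are hypothetical names, and the table only covers the five statuses listed here.

```python
from typing import Optional

# docstatus value (as found in the relaton data) -> refcontent text.
# Only the statuses named in this issue are included.
REFCONTENT_BY_DOCSTATUS = {
    "recommendation": "W3C Recommendation",
    "workingDraft": "W3C Working Draft",
    "candidateRecommendation": "W3C Candidate Recommendation",
    "proposedRecommendation": "W3C Proposed Recommendation",
    "proposedEditedRecommendation": "W3C Proposed Edited Recommendation",
}


def refcontent_for(docstatus: str) -> Optional[str]:
    """Return the refcontent text for a docstatus, or None if unmapped."""
    return REFCONTENT_BY_DOCSTATUS.get(docstatus)
```

Returning None for an unmapped status leaves it to the caller to decide whether to omit refcontent or fall back to some other label.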
Thanks for the information, @tedharrison-rpc. I'll move this issue to bibxml-service because the work has to be done in that repository.