
Update the formatting for W3C BibXML files

Open tedharrison-rpc opened this issue 10 months ago • 9 comments

Currently, if I use the BibXML service to cite a W3C document I will receive the following output:

   [W3C.REC-xml-20060816]
              Maler, E., Ed., Yergeau, F., Ed., Paoli, J., Ed.,
              Sperberg-McQueen, M., Ed., and T. Bray, Ed., "Extensible
              Markup Language (XML) 1.0 (Fourth Edition)", W3C REC REC-
              xml-20060816, W3C REC-xml-20060816, 16 August 2006,
              <https://www.w3.org/TR/2006/REC-xml-20060816/>.

With the content of the BibXML looking something like this:

<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
  <front>
    <title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
    <author fullname="Eve Maler" role="editor"/>
    <author fullname="François Yergeau" role="editor"/>
    <author fullname="Jean Paoli" role="editor"/>
    <author fullname="Michael Sperberg-McQueen" role="editor"/>
    <author fullname="Tim Bray" role="editor"/>
    <date day="16" month="August" year="2006"/>
  </front>
  <seriesInfo name="W3C REC" value="REC-xml-20060816"/>
  <seriesInfo name="W3C" value="REC-xml-20060816"/>
</reference>

We (the RPC) are updating our style guidance for references and would like the output of W3C references to follow something like this template for a specific version of a W3C Recommendation:

<reference anchor="cite-tag" target="URL of specific version">
  <front>
    <title>Title</title>
    <author initials="" surname="" fullname="">
      <organization />
    </author>
    <date day="" month="" year="" />
  </front>
  <refcontent>W3C Recommendation</refcontent>
  <annotation>Latest version available at <eref target="URL" brackets="angle"/>.</annotation>
</reference>

This would remove the two instances of <seriesInfo> and replace them with a <refcontent> element containing the type of W3C document being referenced.

This change would also add an <annotation> element that would include the "Latest version" URL.
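
As a rough sketch of the first part of that change (not the BibXML service's actual code), the snippet below drops every <seriesInfo> child of a <reference> and appends a single <refcontent>. The function name `replace_series_info` is hypothetical, and the sample input is the example above trimmed to one author for brevity.

```python
import xml.etree.ElementTree as ET


def replace_series_info(reference: ET.Element, doc_type: str) -> ET.Element:
    """Drop every <seriesInfo> child and append one <refcontent>.

    Hypothetical helper for illustration; not part of bibxml-service.
    """
    # findall() returns a list, so removing while looping is safe.
    for series_info in reference.findall("seriesInfo"):
        reference.remove(series_info)
    refcontent = ET.SubElement(reference, "refcontent")
    refcontent.text = doc_type
    return reference


# The example reference from this issue, trimmed to one author.
SAMPLE = """<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
  <front>
    <title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
    <author fullname="Tim Bray" role="editor"/>
    <date day="16" month="August" year="2006"/>
  </front>
  <seriesInfo name="W3C REC" value="REC-xml-20060816"/>
  <seriesInfo name="W3C" value="REC-xml-20060816"/>
</reference>"""

reference = replace_series_info(ET.fromstring(SAMPLE), "W3C Recommendation")
print(ET.tostring(reference, encoding="unicode"))
```

The document type string is passed in by the caller here; how the service would determine it from its source data is a separate question.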

Please let us know if this is doable and if you need more info/clarification.

Thanks!

tedharrison-rpc avatar Feb 25 '25 19:02 tedharrison-rpc

How do you compile the latest version?

kesara avatar Feb 25 '25 21:02 kesara

We use the latest version URL provided by W3C. For example: https://www.w3.org/TR/xml/

So using that XML template for the W3C.REC-xml-20060816 example should look something like this:

<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
  <front>
    <title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
    <author fullname="Eve Maler" role="editor"/>
    <author fullname="François Yergeau" role="editor"/>
    <author fullname="Jean Paoli" role="editor"/>
    <author fullname="Michael Sperberg-McQueen" role="editor"/>
    <author fullname="Tim Bray" role="editor"/>
    <date day="16" month="August" year="2006"/>
  </front>
  <refcontent>W3C Recommendation</refcontent>
  <annotation>Latest version available at <eref target="https://www.w3.org/TR/xml/" brackets="angle"/>.</annotation>
</reference>

Which should produce something like this:

  [W3C.REC-xml-20060816]
              Maler, E., Ed., Yergeau, F., Ed., Paoli, J., Ed.,
              Sperberg-McQueen, M., Ed., and T. Bray, Ed., "Extensible
              Markup Language (XML) 1.0 (Fourth Edition)", W3C
              Recommendation, 16 August 2006,
              <https://www.w3.org/TR/2006/REC-xml-20060816/>.  Latest
              version available at <https://www.w3.org/TR/xml/>.
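
For illustration only, here is a minimal sketch of how the <annotation> element with its mixed content (text, then <eref>, then a trailing period) could be built with Python's standard library. The helper name `add_latest_version_annotation` is hypothetical, and the latest-version URL is assumed to be supplied by the caller.

```python
import xml.etree.ElementTree as ET


def add_latest_version_annotation(reference: ET.Element, latest_url: str) -> ET.Element:
    """Append <annotation>Latest version available at <eref .../>.</annotation>.

    Hypothetical helper; the latest-version URL must be provided by the caller.
    """
    annotation = ET.SubElement(reference, "annotation")
    # Text before the child element goes in .text ...
    annotation.text = "Latest version available at "
    eref = ET.SubElement(
        annotation, "eref", {"target": latest_url, "brackets": "angle"}
    )
    # ... and text after the child element goes in the child's .tail.
    eref.tail = "."
    return annotation


reference = ET.Element(
    "reference",
    {
        "anchor": "W3C.REC-xml-20060816",
        "target": "https://www.w3.org/TR/2006/REC-xml-20060816/",
    },
)
annotation = add_latest_version_annotation(reference, "https://www.w3.org/TR/xml/")
print(ET.tostring(annotation, encoding="unicode"))
```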

tedharrison-rpc avatar Feb 26 '25 14:02 tedharrison-rpc

Apologies, just realized you may have been referring to references like this:

<reference anchor="W3C.xml" target="https://www.w3.org/TR/xml/">
  <front>
    <title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title>
    <author/>
  </front>
  <seriesInfo name="W3C REC" value="xml"/>
  <seriesInfo name="W3C" value="xml"/>
</reference>

In this case we would update like this:

<reference anchor="W3C.xml" target="https://www.w3.org/TR/xml/">
  <front>
    <title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title>
    <author/>
  </front>
  <refcontent>W3C Recommendation</refcontent>
</reference>

However, this isn't ideal, as it still retains a specific version's title, that is:

    <title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title>

Most likely, we wouldn't use that type of reference and would instead use the versioned reference with a "latest version" annotation.

tedharrison-rpc avatar Feb 26 '25 17:02 tedharrison-rpc

For the example W3C.REC-xml-20060816:

This is the source file (in relaton): https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/rec-xml-20060816.yaml

This doesn't have any record of the latest URL. So this will require a change in the relaton-w3c gem, or there should be a way to derive the latest URL from the existing relaton data file.

kesara avatar Feb 26 '25 20:02 kesara

For the example W3C.xml:

This is the relaton source file: https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/xml.yaml

I believe that title is coming from W3C, so we can't change it. We could implement some fuzzy logic to remove text matching ( * Edition), but that can produce wrong results.
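
To make the trade-off concrete, here is a minimal sketch of such a rule, assuming a trailing parenthetical that ends in "Edition" is what should be stripped. The pattern, the function name `strip_edition`, and the second title are all hypothetical; the second title is invented purely to show how the rule can misfire.

```python
import re

# Hypothetical pattern: a trailing "( ... Edition)" group at the end of the title.
EDITION_SUFFIX = re.compile(r"\s*\([^()]*\bEdition\)\s*$")


def strip_edition(title: str) -> str:
    """Remove a trailing "(... Edition)" parenthetical, if present."""
    return EDITION_SUFFIX.sub("", title)


# Intended case: the edition marker is removed, "(XML)" is left alone.
intended = strip_edition("Extensible Markup Language (XML) 1.0 (Fifth Edition)")
print(intended)

# Made-up title where the parenthetical is not a version marker:
# the rule strips it anyway, which is the kind of wrong result to expect.
misfire = strip_edition("Style Guide (Print Edition)")
print(misfire)
```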

kesara avatar Feb 26 '25 20:02 kesara

> For the example W3C.REC-xml-20060816:
>
> This is the source file (in relaton): https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/rec-xml-20060816.yaml
>
> This doesn't have any record of the latest URL. So this will require a change in the relaton-w3c gem, or there should be a way to derive the latest URL from the existing relaton data file.

Did some digging on the background for why we originally wanted to make this change:

W3C used to have guidance recommending the use of the "Latest version..." URL for these types of docs, but it appears that they no longer have that guidance up on their site.

So it's probably not necessary to include that annotation; if anything, it's a "nice to have", but I think this format should work as well:

<reference anchor="W3C.REC-xml-20060816" target="https://www.w3.org/TR/2006/REC-xml-20060816/">
  <front>
    <title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
    <author fullname="Eve Maler" role="editor"/>
    <author fullname="François Yergeau" role="editor"/>
    <author fullname="Jean Paoli" role="editor"/>
    <author fullname="Michael Sperberg-McQueen" role="editor"/>
    <author fullname="Tim Bray" role="editor"/>
    <date day="16" month="August" year="2006"/>
  </front>
  <refcontent>W3C Recommendation</refcontent>
</reference>

The most important thing is replacing the two extraneous seriesInfo elements in what is currently in BibXML. That is, replacing:

  <seriesInfo name="W3C REC" value="REC-xml-20060816"/>
  <seriesInfo name="W3C" value="REC-xml-20060816"/>

with the refcontent element containing the type of W3C doc being referenced (i.e., the "W3C Recommendation" seen in the example above).

> For the example W3C.xml:
>
> This is the relaton source file: https://github.com/ietf-tools/relaton-data-w3c/blob/main/data/xml.yaml
>
> I believe that title is coming from W3C, so we can't change it. We could implement some fuzzy logic to remove text matching ( * Edition), but that can produce wrong results.

If that's the case, don't worry too much about these versions of the W3C references. I don't think they will be used much (if at all), since I'm typically going to encourage the use of the versioned references.

tedharrison-rpc avatar Feb 28 '25 17:02 tedharrison-rpc

@tedharrison-rpc Removing the seriesInfo elements and replacing them with refcontent can be done in the BibXML service.

I assume W3C REC (docstatus: recommendation) will be W3C Recommendation. Are there any other values that can be expected for refcontent?

kesara avatar Mar 04 '25 19:03 kesara

> I assume W3C REC (docstatus: recommendation) will be W3C Recommendation.

That's correct.

> Are there any other values that can be expected for refcontent?

From a search of the repo I found the following:

W3C Working Draft = workingDraft
W3C Candidate Recommendation = candidateRecommendation
W3C Proposed Recommendation = proposedRecommendation
W3C Proposed Edited Recommendation = proposedEditedRecommendation

These match W3C document types found here. These are also the most commonly cited, but if I notice any others that are missing I'll make sure to bring them up.
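
The pairings discussed in this thread could be captured as a simple lookup table, sketched below. The names `REFCONTENT_BY_DOCSTATUS` and `refcontent_for` are hypothetical, and treating these strings as the exact docstatus values in the relaton data is an assumption based only on the pairs listed in this issue.

```python
from typing import Optional

# Assumed mapping from docstatus value to refcontent text,
# taken from the pairs listed in this issue. Not exhaustive.
REFCONTENT_BY_DOCSTATUS = {
    "recommendation": "W3C Recommendation",
    "workingDraft": "W3C Working Draft",
    "candidateRecommendation": "W3C Candidate Recommendation",
    "proposedRecommendation": "W3C Proposed Recommendation",
    "proposedEditedRecommendation": "W3C Proposed Edited Recommendation",
}


def refcontent_for(docstatus: str) -> Optional[str]:
    """Return the refcontent text, or None for a status not listed above."""
    return REFCONTENT_BY_DOCSTATUS.get(docstatus)


print(refcontent_for("workingDraft"))
```

Returning None for an unlisted status leaves it to the caller to decide what to do with document types that have not been mapped yet.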

tedharrison-rpc avatar Mar 07 '25 15:03 tedharrison-rpc

Thanks for the information, @tedharrison-rpc. I'll move this issue to bibxml-service because the work has to be done in that repository.

kesara avatar Mar 10 '25 02:03 kesara