
Alphabetical and Numerical Order in Indices

Open klensin opened this issue 2 years ago • 0 comments

In almost all circumstances, when <iref> elements are used to construct a document index, the index entries ("item" or "subitem") are either English words or numerals. In the generated index, those items should either be ordered as people would normally expect (i.e., case-independent for words and by numeric value for numerals) or, if other cases turn out to be common and important enough, there should be a processing directive by which authors (and/or the RPC) can specify the ordering.

Our documents appear to have evolved a bit on the subject. RFC 7991 says "When the prep tool is creating index content, it collects the items in a case-sensitive fashion for both the item and subitem level" but it is not clear what the implications of "collects" might be. However, the more current 7991bis and draft-irse-draft-irse-xml2rfcv3-implemented-03 add "The index is sorted in conventional alphabetical order disregarding case." which seems perfectly clear and is consistent with my claim about correct behavior above.

However, excerpts from an index generated today show (in the text output at least):

   Terminology
        Address  Section 2.3.11, Paragraph 1
        Buffer  Section 2.3.6, Paragraph 1
        ...
        Senders and Receivers  Section 2.3.2, Paragraph 1
        State Table  Section 2.3.6, Paragraph 1
        address RR  Section 2.3.5, Paragraph 3
        primary host name  Section 2.3.5, Paragraph 4, Item 1

and

   Ticket Index
        1  Appendix H.1, Paragraph 1
        10  Appendix H.7, Paragraph 1; Appendix H.32, Paragraph 3
        11  Appendix G.7.16, Paragraph 1; Appendix H.5, Paragraph 1
        12  Appendix H.24, Paragraph 1.4.1; Appendix H.36, Paragraph
           1
        ...
        18  Appendix H.13, Paragraph 1
        19  Appendix H.9, Paragraph 1
        2  Appendix H.2, Paragraph 2
        20  Appendix H.15, Paragraph 1
        ...

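And so on. The first example clearly violates the "conventional alphabetical order disregarding case" rule: all entries starting with upper-case letters sort ahead of all entries starting with lower-case ones. The second sorts numerals as character strings rather than by value (1, 10, 11, ... 19, 2, 20), which is not consistent with what I'd consider good sense. So I assume these are bugs and that, until they are fixed, the "as implemented" document is "almost as implemented".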
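To make the expected ordering concrete, here is a minimal Python sketch of a sort key that ignores case and compares runs of digits by numeric value. It is only an illustration of the behavior requested above, not xml2rfc's actual code; the name `index_sort_key` is hypothetical, and the sample entries are taken from the two excerpts.

```python
import re

# Capturing group so that re.split() keeps the digit runs in its result.
_DIGITS = re.compile(r"([0-9]+)")


def index_sort_key(item):
    """Hypothetical sort key: case-insensitive text, digit runs by value."""
    key = []
    # With one capturing group, split() alternates text, digits, text, ...
    # so the pieces at odd positions are always the digit runs.
    for position, piece in enumerate(_DIGITS.split(item)):
        if piece == "":
            continue
        if position % 2 == 1:
            key.append((0, int(piece), ""))       # numbers sort before text
        else:
            key.append((1, 0, piece.casefold()))  # text compared without case
    return key


terminology = ["Address", "Buffer", "Senders and Receivers",
               "State Table", "address RR", "primary host name"]
tickets = ["1", "10", "11", "12", "18", "19", "2", "20"]

sorted_terms = sorted(terminology, key=index_sort_key)
sorted_tickets = sorted(tickets, key=index_sort_key)

print(sorted_terms)
print(sorted_tickets)
```

With this key, "address RR" lands directly after "Address" and ticket 2 precedes ticket 10. Whether numerals should sort before or after words is a design choice the specifications quoted above do not settle; the sketch arbitrarily puts numerals first.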

If a more complete example is needed, see draft-ietf-emailcore-rfc5321bis-17. Examining it also shows some fairly unfortunate examples of line wrapping. The "Ticket Index: 12" entry above is one instance, but there are several others, some arguably worse.

klensin, Dec 31 '22 21:12