fix(text): Render list items correctly
Fixes #1135 Fixes #1138
See changes made in https://github.com/ietf-tools/xml2rfc/commit/3712523cb325bb43b1b164e833dd436c1c62a295 for context.
This change can affect the output when an ol is the first element inside a li and that li is itself inside another ol element.
I could not find any published RFCs with XML that match the above condition.
These are the RFCs that I found that use ol as the first element in li:
- RFC8679
- RFC8687
I didn't notice any changes in text output due to this change. But there are a few minor changes (with spacing) due to changes over the years.
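For anyone who wants to repeat the search, here is a minimal sketch of the condition described above: an `<li>` inside an `<ol>` whose first child is another `<ol>`, with no text in between. It uses only the standard library; the helper name `li_starting_with_ol` and the sample XML are made up for illustration and are not part of xml2rfc.

```python
import xml.etree.ElementTree as ET


def li_starting_with_ol(root):
    """Return <li> children of any <ol> whose first child is an <ol>,
    with no text between the <li> start tag and that nested <ol>."""
    matches = []
    for outer in root.iter("ol"):
        for li in outer.findall("li"):
            children = list(li)
            leading_text = (li.text or "").strip()
            if children and not leading_text and children[0].tag == "ol":
                matches.append(li)
    return matches


# Hypothetical input, not taken from any published RFC.
SAMPLE = """<section>
  <ol>
    <li>plain item</li>
    <li><ol><li>nested first</li></ol></li>
    <li>lead-in text <ol><li>nested after text</li></ol></li>
  </ol>
</section>"""

root = ET.fromstring(SAMPLE)
print(len(li_starting_with_ol(root)))
```

Only the second `<li>` in the sample matches; the third has lead-in text before the nested list, so it falls outside the condition.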
I think it's worth trying to change things such that instead of generating
4. 4.1 Sublist with '%p' parent counter element in the list
counter, item one
4.2 Item two.
4.3 Item three.
we generate
4.
4.1 Sublist with '%p' parent counter element in the list
counter, item one
4.2 Item two.
4.3 Item three.
Unless that breaks the intended fix for the rfc-to-be.
What is better about that superfluous empty line?
(You can probably have one if you want with <br>)
Look at it in context with the rest of the nested <ol> above it in the test document: https://github.com/ietf-tools/xml2rfc/blob/fdb6fb8ad2024b3483f7480a07f8370e673c27e2/tests/valid/elements.bom.text#L366-L458 and consider what would happen if there were any text between the <li> and <ol> at https://github.com/ietf-tools/xml2rfc/blob/fdb6fb8ad2024b3483f7480a07f8370e673c27e2/tests/input/elements.xml#L472-L473
I have converted PR to a draft.
The changes from https://github.com/ietf-tools/xml2rfc/pull/1137/commits/e889f930b982c9bc9045f8a6e653f25a6464ec58 add a new line along with the list style for both ol and ul.
This was skipped previously.
Specifically, this adds list style + new line when li contains an artwork, figure, or sourcecode as the first child.
See how this changed test outputs in https://github.com/ietf-tools/xml2rfc/pull/1137/commits/ca8286a4126bc28bba095eb7e708c5a7424aadef
Note that the test output has date changes and multiple content changes propagated because of the addition of new lines.
https://github.com/ietf-tools/xml2rfc/pull/1137/commits/e889f930b982c9bc9045f8a6e653f25a6464ec58 fixes #1138
@rjsparks
Here are diffs for RFCs that have <ul> or <ol> as the first element inside a <li>^1.
When empty="true" is used, this change will not affect the output of that RFC.
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8742.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8811.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8793.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8668.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8687.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc9599.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8679.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8927.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8677.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8667.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc8698.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc9195.txt
- https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc9687.txt
The above list is a bit confusing? I only see impact from this PR's change on rfc9599.
In most cases there's an empty="true" attribute set on the element or at the parent level, hence no changes.
List of active I-Ds that match the criteria^1:
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-bernardos-cats-anchoring-service-mobility-01.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-bernardos-cats-anchoring-service-mobility-01.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-bernardos-cats-ip-address-anchoring-02.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-bernardos-cats-ip-address-anchoring-02.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-davis-nmop-some-refinements-to-rfc8345-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-davis-nmop-some-refinements-to-rfc8345-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-dutt-nvo3-rfc7348bis-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-dutt-nvo3-rfc7348bis-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-haynes-nfsv4-flexfiles-v2-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-haynes-nfsv4-flexfiles-v2-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-bess-evpn-fast-df-recovery-12.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-bess-evpn-fast-df-recovery-12.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-bess-rfc7432bis-10.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-bess-rfc7432bis-10.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-dnssd-srp-25.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-dnssd-srp-25.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-drip-dki-03.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-drip-dki-03.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-emu-eap-fido-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-emu-eap-fido-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-idr-bgp-generic-metric-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-idr-bgp-generic-metric-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-lamps-rfc5272bis-01.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-lamps-rfc5272bis-01.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-pce-flexible-grid-11.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-pce-flexible-grid-11.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-radext-deprecating-radius-05.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-radext-deprecating-radius-05.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-radext-tls-psk-11.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-radext-tls-psk-11.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-rats-corim-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-rats-corim-06.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-rift-rift-24.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-rift-rift-24.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-sidrops-aspa-slurm-02.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-sidrops-aspa-slurm-02.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-snac-simple-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-snac-simple-06.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-teas-nrp-scalability-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-teas-nrp-scalability-06.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-teas-yang-te-37.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-teas-yang-te-37.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-tls-wkech-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-tls-wkech-06.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-lim-apv-03.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-lim-apv-03.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-lim-rtp-apv-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-lim-rtp-apv-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-sajassi-bess-rfc8317bis-03.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-sajassi-bess-rfc8317bis-03.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-schanzen-r5n-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-schanzen-r5n-06.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-schmutzer-bess-bitstream-vpws-signalling-02.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-schmutzer-bess-bitstream-vpws-signalling-02.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-schmutzer-pals-ple-signaling-02.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-schmutzer-pals-ple-signaling-02.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-spaghetti-sidrops-aspa-slurm-03.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-spaghetti-sidrops-aspa-slurm-03.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-tomas-openroaming-04.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-tomas-openroaming-04.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-yan-dmm-man-14.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-yan-dmm-man-14.txt
@rjsparks, on that list I found only the following I-D to have an output change because of the proposed fix: https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-haynes-nfsv4-flexfiles-v2-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-haynes-nfsv4-flexfiles-v2-00.txt
Is it clear what the authors of that draft could do to get the output they originally wanted given this change? (The source for that draft appears to be at https://github.com/loghyr/flexfilesv2)
@rjsparks The original document with the issue, RFC9599, has already been published with the suggested workaround using <t>. Note that RFC9599's text output will change slightly with this PR^1.
These are the changes for the latest draft from https://github.com/loghyr/flexfilesv2:
so this would leave the generated .txt file for flexfilesv2 unchanged with this PR applied then:
<li>
  <t>Delegations are assigned by the metadata server that initiates
  recalls when conflicting OPENs are processed. Because I/O
  operations are allowed to present delegation stateids, the
  metadata server requires the ability:</t>
  <ol>
    <li>
      to make the storage device aware of the association between
      the metadata-server-chosen stateid and the filehandle and
      delegation type it represents
    </li>
    <li>
      to break such an association.
    </li>
  </ol>
</li>
@rjsparks yes, that will have unchanged output.
RFC 9599 kicked off this issue. This proposed update causes the text output not to match the HTML and PDF outputs as well as the original workaround did: https://author-tools.ietf.org/diff?url_1=https://github.com/kesara/t4/raw/refs/heads/main/xml2rfc-1137/rfc9599.txt
There were several diffs listed above that show changes to the list structure that I think the authors would not want:
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-haynes-nfsv4-flexfiles-v2-00.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-haynes-nfsv4-flexfiles-v2-00.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-dnssd-srp-25.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-dnssd-srp-25.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-rift-rift-24.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-rift-rift-24.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-ietf-snac-simple-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-ietf-snac-simple-06.txt
- https://author-tools.ietf.org/diff?url_1=https://www.ietf.org/archive/id/draft-schanzen-r5n-06.txt&url_2=https://github.com/kesara/t4/raw/main/xml2rfc-1137-ids/draft-schanzen-r5n-06.txt
IIUC, lists with empty="true" don't have changes in the list structure. However, the RPC evaluates the use of empty="true", and we may want to apply a more appropriate tag (see https://www.rfc-editor.org/pubprocess/how-we-update/). Would we need to keep empty="true" to avoid some of these unwanted list changes?
There were several diffs listed above that show changes to the list structure that I think the authors would not want:
I agree. The output with these changes no longer reflects the list structure; a new item is created in the outer list just because one list item in there has an inner list.
@ajeanmahoney empty="true" only works for ul, and all it does is leave the li bullets empty. I don't see how that can avoid any changes from this PR, because it's set at the parent level.
@rjsparks, @ajeanmahoney, @cabo, I think we have 3 options here:
- Keep the empty line. (Current PR does this)
- Don't keep an empty line but add the missing list style (bullet/number/char).
- Don't do anything and leave the bug as is.
BTW xml2rfc will probably have to add an empty line with the list style, when li contains an artwork, figure, sourcecode etc. as the first child to avoid confusion.
@rjsparks, @ajeanmahoney, @cabo, I think we have 3 alternatives here:
• Keep the empty line. (Current PR does this)
I’m not sure I know which line is empty, but maybe you are talking about the line with just a bullet that is being added? The bullet wasn’t in the source, so it shouldn’t be in the output.
• Don't keep an empty line but add the missing list style (bullet/number/char).
The bullet wasn’t in the source, so it shouldn’t be in the output. There should be exactly as many bullets as there are <li elements.
• Don't do anything and leave the bug as is.
(I need to look up what the bug was…)
BTW xml2rfc will probably have to add an empty line with the list style, when li contains an artwork, figure, sourcecode etc. as the first child to avoid confusion.
I’m not sure what the problem is
- int main() {}
- public static void main(String args[])
Sourcecode and artwork can be in exactly the same place where normal text can be, no newline needed.
Grüße, Carsten
@cabo: from the beginning of the PR: Fixes https://github.com/ietf-tools/xml2rfc/issues/1135 Fixes https://github.com/ietf-tools/xml2rfc/issues/1138
Note in the original report that the txt output is currently fundamentally different from the html/pdf output. If there's something to address here, it's that.
The problem in the document mentioned in #1135 was addressed the same way I suggested the flexfilesv2 doc could be repaired above, and maybe the outcome of this whole thread is "If it hurts, don't do it" with advice to put the lead in a <t> (or an empty <t> if a nested list is really a new item in the outer list). But that would leave the current issue where the txt output has a significant variance from the html unaddressed.
@rjsparks wrote:
maybe the outcome of this whole thread is "If it hurts, don't do it" with advice to put the lead in a <t> (or an empty <t> if a nested list is really a new item in the outer list).
Agree.
But that would leave the current issue where the txt output has a significant variance from the html unaddressed.
I wonder about giving a warning so that the user is aware that the input yields this variance in the output files.
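A rough sketch of what such a warning check could look like, assuming it runs over the parsed source before rendering. This is not how xml2rfc is structured; `nested_list_warnings` and the message wording are hypothetical, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

LIST_TAGS = {"ol", "ul"}


def nested_list_warnings(root):
    """Collect one message per <li> that opens directly with a nested list."""
    messages = []
    for parent in root.iter():
        if parent.tag not in LIST_TAGS:
            continue
        for index, li in enumerate(parent.findall("li"), start=1):
            children = list(li)
            # Skip items with no child elements or with lead-in text.
            if not children or (li.text or "").strip():
                continue
            if children[0].tag in LIST_TAGS:
                messages.append(
                    f"<li> number {index} of a <{parent.tag}> starts with "
                    f"<{children[0].tag}>; text output may differ from HTML/PDF. "
                    "Consider a lead-in <t> before the nested list."
                )
    return messages


# Hypothetical input for illustration only.
doc = ET.fromstring(
    "<ol><li>first</li><li><ul><li>a</li><li>b</li></ul></li></ol>"
)
for message in nested_list_warnings(doc):
    print(message)
```

Items that already start with text or with a `<t>` are left alone, which matches the workaround discussed in this thread.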
The examples in the referenced issues all show entirely reasonable text structures that for some reason aren't shown correctly in the plaintext form. Nothing hurts except for the bug. Please fix the bug.
These are the changes for the latest draft from https://github.com/loghyr/flexfilesv2:
Well, that example is bad because the text structure doesn't reflect the text content. Indeed, the author here should remove the superfluous <li. The renderer cannot get this right, GIGO.
@rjsparks wrote:
maybe the outcome of this whole thread is "If it hurts, don't do it" with advice to put the lead in a <t> (or an empty <t> if a nested list is really a new item in the outer list).
@alicerusso Agree.
Also agree. I would have suggested a lead-in sentence to the authors as a workaround, but the workaround suggested by @kesara in #1135 also works. This is an obscure corner case for nested lists that isn't covered by the Chicago Manual of Style, which says to use your word processor's outline list functionality to format lists correctly. FWIW I tried to get MS Word and Google Docs to recreate the following text from RFC 9599 and couldn't:
1. approximate preservation of the presence (and therefore timing)
of congestion marks on the L2 frames used to construct an IP
packet;
2. a. at high frequency of congestion marking, approximate
preservation of the proportion of congestion marks arriving
and departing;
b. at low frequency of congestion marking, approximate
preservation of the timing of congestion marks arriving and
departing.
I'm good with the workaround here.
1. approximate preservation of the presence (and therefore timing) of congestion marks on the L2 frames used to construct an IP packet;
2. a. at high frequency of congestion marking, approximate preservation of the proportion of congestion marks arriving and departing;
   b. at low frequency of congestion marking, approximate preservation of the timing of congestion marks arriving and departing.
First of all, people should be able to create exactly this nested list if it is their intention.
This is an <ol with two <li, where the second <li contains an <ol with two <li.
I have a hard time generating respect for a text formatting system that can’t do nested lists this trivial way.
Note that
1. approximate preservation of the presence (and therefore timing) of congestion marks on the L2 frames used to construct an IP packet;
   a. at high frequency of congestion marking, approximate preservation of the proportion of congestion marks arriving and departing;
   b. at low frequency of congestion marking, approximate preservation of the timing of congestion marks arriving and departing.
…also is a nested list that authors might want to create.
This is an <ol with one <li, which has a paragraph and an <ol with two <li as content.
Grüße, Carsten
@cabo Authors can create such a list with the formatting described here: https://github.com/ietf-tools/xml2rfc/issues/1135#issuecomment-2216307530
I've called it a workaround, but it's not really. Using <t/> within <li/> isn't a hack.
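To make the two structures from the preceding comments concrete, here is a small sketch that writes both out and counts the items in the outer list. The item text is abbreviated from the RFC 9599 excerpt, and `outer_item_count` is a hypothetical helper, not an xml2rfc function.

```python
import xml.etree.ElementTree as ET

# Outer <ol> with two <li>; the second <li> holds only a nested <ol>.
TWO_ITEMS = """<ol>
  <li>approximate preservation of the presence of congestion marks;</li>
  <li>
    <ol type="a">
      <li>at high frequency of congestion marking;</li>
      <li>at low frequency of congestion marking.</li>
    </ol>
  </li>
</ol>"""

# Outer <ol> with one <li>: a <t> paragraph followed by a nested <ol>.
ONE_ITEM = """<ol>
  <li>
    <t>approximate preservation of the presence of congestion marks;</t>
    <ol type="a">
      <li>at high frequency of congestion marking;</li>
      <li>at low frequency of congestion marking.</li>
    </ol>
  </li>
</ol>"""


def outer_item_count(xml_text):
    """Count the direct <li> children of the outermost list."""
    return len(ET.fromstring(xml_text).findall("li"))


print(outer_item_count(TWO_ITEMS), outer_item_count(ONE_ITEM))
```

The first source has two outer items and the second has one, so a renderer that keeps one marker per `<li>` should number them differently.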
Using <t/> within <li/> isn't a hack.
Well, allowing both mixed content (span-level) and t (block-level) sure is, avoiding the h word, interesting. Giving them different semantics makes my mind boggle.
One problem here is that in markdown there is no difference, which is pretty much the reason why kramdown-rfc is still generating v2 for these because the v2v3 converter seems to do something that generally works. (But not always.)
One other problem is that the content models of <t and <li are different, which creates serious headaches.
@cabo's last comment is about the TBD RSWG(?) work on normalizing the content model for the whole grammar.
I think we close this particular issue as WONTFIX (specifically, we don't merge this proposed code change).
Closing this as a WONTFIX.