Artifacts from double line-breaking

Open cabo opened this issue 3 months ago • 7 comments

Describe the issue

In some cases, the renderer line-breaks some text to an imaginary width; the text renderer then line-breaks the result again to the actual width, while the HTML renderer generates HTML with unnecessarily pre-broken lines (broken to the incorrect imaginary width). In the text rendering, this introduces spaces at break points that end up in the middle of a line.
This can be seen in www.open- std.org (note the spurious space) in the [C] reference in 0txt.
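To make the suspected mechanism concrete, here is a minimal Python sketch of how two wrapping passes could produce such a space. It is not xml2rfc's actual code: `split_chunks`, `pre_break`, `naive_rewrap`, and the widths 16 and 40 are hypothetical names and values chosen only for illustration.

```python
import re
import textwrap


def split_chunks(token: str) -> list[str]:
    """Split a URL-like token into chunks that end at '/' or '-' (assumed break points)."""
    return re.findall(r"[^/-]*[/-]|[^/-]+", token)


def pre_break(token: str, width: int) -> list[str]:
    """First pass: greedily pack chunks into lines of at most `width` characters."""
    lines, current = [], ""
    for chunk in split_chunks(token):
        if current and len(current) + len(chunk) > width:
            lines.append(current)
            current = chunk
        else:
            current += chunk
    if current:
        lines.append(current)
    return lines


def naive_rewrap(lines: list[str], width: int) -> str:
    """Second pass: join the pre-broken lines with spaces, then wrap on whitespace only."""
    joined = " ".join(lines)
    return textwrap.fill(
        joined, width=width, break_long_words=False, break_on_hyphens=False
    )


url = "http://www.open-std.org/jtc1/sc22/wg14/"
pre_broken = pre_break(url, 16)       # "imaginary" narrow width
result = naive_rewrap(pre_broken, 40)  # "actual" wider width
print(result)
```

In this sketch, the break that the first pass placed after `open-` is turned into a space by the join, and the second pass keeps that space because the two pieces now fit on one line at the wider width. Breaking only once, at the actual width, would avoid the artifact.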

The HTML has an unnecessarily early line-break 0html. (A likely related problem can be seen with the datatracker-specific rendering, e.g. at 1.)

The PDF rendering obtained from the author tools looks right, though; this could be a coincidence.

Mar 29 '24 07:03 cabo

@cabo I think the issue with spaces in breakpoints of text output has been fixed since v3.20.0. EDIT: The following is from xml2rfc 3.20.1:

   [C]        International Organization for Standardization,
              "Information technology — Programming languages — C",
              Fourth Edition, ISO/IEC 9899:2018, June 2018,
              <https://www.iso.org/standard/74528.html>.  Technically
              equivalent specification text is available at
              https://web.archive.org/web/20181230041359if_/
              http://www.open-std.org/jtc1/sc22/wg14/www/abq/
              c17_updated_proposed_fdis.pdf
              (https://web.archive.org/web/20181230041359if_/
              http://www.open-std.org/jtc1/sc22/wg14/www/abq/
              c17_updated_proposed_fdis.pdf)

Mar 31 '24 22:03 kesara

Can you explain the unnecessary early line breaks in HTML? Because I'm not seeing any. Maybe the whole URL could have moved to the next line.

Screenshot 2024-04-01 at 12 01 09

Mar 31 '24 23:03 kesara

> @cabo I think the issue with spaces in breakpoints of text output has been fixed since v3.20.0.

But that's what I'm seeing now: Screenshot 2024-04-01 at 06 04 09

Apr 01 '24 04:04 cabo

> Can you explain the unnecessary early line breaks in HTML? Because I'm not seeing any. Maybe the whole URL could have moved to the next line.

And this is a screenshot from the HTML:

Screenshot 2024-04-01 at 06 06 50

You can clearly see the early line break after open-

Apr 01 '24 04:04 cabo

The -04 renderings linked above apparently were made with 3.20.1, while the -03 ones (which also exhibit these weirdnesses) were made with 3.20.0. Oh, and the U+2028 that I introduced in -04 to work around the missing <br> in the <annotation> content model appears to be ignored in .TXT but heeded in .HTML.

(Looking at the HTML with Arc Version 1.36.0 (48035), which uses Chromium Engine Version 123.0.6312.87.)

Apr 01 '24 04:04 cabo

This is what https://www.ietf.org/archive/id/draft-ietf-cbor-cddl-more-control-04.html#C looks like in Safari Version 17.4.1 (19618.1.15.11.14):

image

Looks similar. Maybe this is indeed an artifact of browser line breaking preferring line-breaking after the hyphen; let's focus on the .TXT weirdness then.

Apr 01 '24 04:04 cabo

(This is what I get in Safari with a narrow window. Weird.)

image

Apr 01 '24 04:04 cabo