LaTeX: support CSS3 length units

Open jfbu opened this issue 8 months ago • 15 comments

Relates: #13656

cc @gmilde. Two questions

  • Have you hesitated between \paperwidth and \textwidth for vw?
  • You use the g formatter. There is a slight, though very improbable, chance that it will use scientific notation on output, and if that happens the PDF build will break. Have you estimated it is safe enough? We could follow suit.

(Obsolete part removed: the test has now been added in a second commit. I don't think it will be run by our CI/LaTeX yet, but it will be run once Docutils 0.22 is released.)

A third, provisional commit triggers usage of Docutils HEAD for CI/LaTeX to confirm all is fine.

jfbu avatar Jun 11 '25 17:06 jfbu

Two questions

* Have you hesitated between `\paperwidth` and `\textwidth` for `vw`?

No. vw and vh describe the size of the viewport or page area: https://www.w3.org/TR/css-values-3/#viewport-relative-lengths

In single column text, textwidth == linewidth so you can use "%" to get a percentage of the textwidth.

* You use the `g` formatter. There is a slight, though very improbable, chance that it will use scientific notation on output, and if that happens the PDF build will break. Have you estimated it is safe enough? We could follow suit.

The "g" formatter is only used with percentage values that relate to paper size or line width. The default precision with the "g" formatter is 6. So any value 1/100000 < x < 1000000 is safe. Length values outside this range are problematic anyway.
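To make the range concrete, here is a minimal Python sketch of where the `g` presentation type switches to exponent notation. Per the Python format specification, with the default precision of 6 this happens when the decimal exponent is below -4 or at least 6, so the lower bound looks closer to 1/10000 than to 1/100000. The helper name `uses_exponent` is made up for this illustration.

```python
def uses_exponent(value: float) -> bool:
    """Return True if format(value, 'g') contains an exponent marker."""
    return "e" in format(value, "g")


# Values on both sides of the two thresholds, plus two ordinary ones.
samples = [0.00001, 0.0001, 0.5, 12.5, 999999, 1000000]
for x in samples:
    print(f"{x!r:>10} -> {format(x, 'g')}  exponent={uses_exponent(x)}")
```

Percentages of a page or line dimension, once divided by 100, sit comfortably inside the fixed-notation range unless the input is already absurd.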

I have a passing test on Docutils HEAD. ... Looks OK to me. I recommend trying an input sample in a test project, compiling the generated *.tex source and inspecting the result in a PDF viewer once (and with any change of the expected output).

gmilde avatar Jun 11 '25 19:06 gmilde

Looks OK to me. I recommend trying an input sample in a test project, compiling the generated *.tex source and inspecting the result in a PDF viewer once (and with any change of the expected output).

I have done the PDF inspection locally. The PDF is found under the directory that pytest reports when running the test.

By the way, I have now added a separate test and configured testing (i.e. the PDF build) to apply only with Docutils 0.22 (dev). However, I am not sure our CI has a configuration for PDF builds plus Docutils HEAD, so this is theoretical so far. But when Docutils 0.22 is released our CI/LaTeX will run the test. So far I test locally (with success, as reported above).

I wanted to make linting happy but now ruff requires me to put all args on one line, sigh, so a force-push is pending

jfbu avatar Jun 11 '25 19:06 jfbu

In single column text, textwidth == linewidth so you can use "%" to get a percentage of the textwidth.

For legacy reasons, % is converted by Sphinx LaTeX into a fraction of \linewidth. I don't think this is reasonable, but we would not change that prior to Sphinx 9, if ever.

jfbu avatar Jun 11 '25 19:06 jfbu

The "g" formatter is only used with percentage values that relate to paper size or line width. The default precision with the "g" formatter is 6. So any value 1/100000 < x < 1000000 is safe. Length values outside this range are problematic anyway.

This sounds reasonable, but we have been using fixed point with either 5 or 3 decimal places, and I hesitate to modify legacy code. Admittedly the impact would be minimal, and Sphinx users are not in the same situation as, say, AMS editorial offices who want to keep PDFs the same as 20 years ago.
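For comparison, a small sketch of the fixed-point alternative: the `f` presentation type never emits an exponent, at the cost of trailing zeros and of rounding very small values to zero. The five-decimal default mirrors the "5 decimal places" mentioned above; the function name `fixed_point` is hypothetical and this is not the Sphinx code itself.

```python
def fixed_point(value: float, places: int = 5) -> str:
    """Format value with a fixed number of decimals, never an exponent."""
    return f"{value:.{places}f}"


# Show both formatters side by side on small, ordinary, and large inputs.
for x in (0.000001, 0.5, 1234567.0):
    print(format(x, "g"), "vs", fixed_point(x))
```

Both outputs are digits that TeX can read as a factor, except where `g` falls back to exponent notation.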

jfbu avatar Jun 11 '25 19:06 jfbu

@AA-Turner The second commit in this PR makes it possible, in principle, to test that the PDF build does not crash when the new units are used in image dimensions. For the test not to be skipped, however, our CI/LaTeX would need to use Docutils HEAD... It may not be important enough to bother about. I have also slightly refactored the recent (needed) addition of a second pdflatex run when testing the build of documents.

jfbu avatar Jun 11 '25 20:06 jfbu

@AA-Turner I cannot recommend good quality French apple pie enough: it has had a tremendous effect on my cognitive abilities, and I have pushed a commit so that LaTeX/CI uses Docutils HEAD. This is purely to confirm that this PR is sane: building the PDF is an essential part of checking all is fine, and simply checking the output mark-up would not be convincing enough. So, all is fine:

https://github.com/sphinx-doc/sphinx/actions/runs/15617838136/job/43995053458?pr=13657#step:9:642

tests/test_builders/test_build_latex.py::test_build_latex_with_css3_lengths[pdflatex] PASSED [ 26%]
tests/test_builders/test_build_latex.py::test_build_latex_with_css3_lengths[lualatex] PASSED [ 26%]
tests/test_builders/test_build_latex.py::test_build_latex_with_css3_lengths[xelatex] PASSED [ 26%]

Before we merge this we should arguably revert that latest commit (or revert it after merge). Testing LaTeX only on Docutils HEAD may not be a good idea at the time of our releases.

jfbu avatar Jun 12 '25 18:06 jfbu

In single column text, textwidth == linewidth so you can use "%" to get a percentage of the textwidth.

For legacy reasons, % is converted by Sphinx LaTeX into a fraction of \linewidth. I don't think this is reasonable, but we would not change that prior to Sphinx 9, if ever.

The x% -> x/100\linewidth conversion was chosen as the best match for the interpretation of percentage values with HTML/CSS (percentage of the containing element's dimension). I don't think changing this to x/100\textwidth would be an improvement.
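The x% -> x/100\linewidth conversion can be sketched in a few lines of Python. This is an illustration only, not the actual Sphinx or Docutils writer code: the function name `percent_to_linewidth`, the regular expression and the three-decimal precision are assumptions made for the example.

```python
import re


def percent_to_linewidth(length: str) -> str:
    """Turn an rST percentage such as '50%' into a fraction of \\linewidth."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*%\s*", length)
    if match is None:
        raise ValueError(f"not a percentage: {length!r}")
    fraction = float(match.group(1)) / 100
    # Fixed-point output, so no exponent can ever reach the .tex file.
    return f"{fraction:.3f}\\linewidth"


print(percent_to_linewidth("50%"))
print(percent_to_linewidth("100 %"))
```

Swapping `\linewidth` for `\textwidth` would be a one-word change in such a function, which is why the choice discussed here is about semantics rather than implementation effort.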

gmilde avatar Jun 13 '25 07:06 gmilde

In single column text, textwidth == linewidth so you can use "%" to get a percentage of the textwidth.

For legacy reasons, % is converted by Sphinx LaTeX into a fraction of \linewidth. I don't think this is reasonable, but we would not change that prior to Sphinx 9, if ever.

The x% -> x/100\linewidth conversion was chosen as the best match for the interpretation of percentage values with HTML/CSS (percentage of the containing element's dimension). I don't think changing this to x/100\textwidth would be an improvement.

Thanks @gmilde for your comment. See though #13661 and #13662. I said above we would not change this prior to Sphinx 9, so I will now move the milestone to Sphinx 9.x, although I consider the output so bad (especially #13661) that I had initially set it to 8.3.0. Maybe #13661 does require 8.3.0 though... and then #13662 would be solved at the same time.

I do not think it makes sense in LaTeX that image sizes would mysteriously shrink while the text font size is not modified. The HTML/CSS paradigm should not be followed too closely.

jfbu avatar Jun 13 '25 13:06 jfbu

On 13.06.2025 15:06:18, Jean-François B. wrote:

+% TODO: decide if we want rather \textwidth/\textheight.
+\newdimen\sphinxvwdimen

In HTML/CSS the viewport can be fully occupied, not really in LaTeX ...

My simple test with Docutils results in just the opposite:

.. image:: /usr/share/xine/skins/xine_splash.png
   :width: 100 vw
   :height: 50Q
   :align: center
   
.. image:: /usr/share/xine/skins/xine_splash.png
   :width: 100 %
   :height: 50Q
   :align: center   

In LaTeX, I get

\noindent\makebox[\linewidth][c]{\includegraphics[height=12.5mm,width=1\paperwidth]{/usr/share/xine/skins/xine_splash.png}}

\noindent\makebox[\linewidth][c]{\includegraphics[height=12.5mm,width=1\linewidth]{/usr/share/xine/skins/xine_splash.png}}

In the PDF, the image is centered and fits exactly on the page. In HTML, the image left aligns with the text block and overflows to the right.

But apart from this, typical use will be a certain fraction of the viewport and there IMO the correct correspondence to 100% browser frame size is 100% page size, not 100% text area.
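The conversions visible in the sample above (50Q -> 12.5mm, 100 vw -> 1\paperwidth, 100 % -> 1\linewidth) can be reproduced with a short Python sketch. It is a hedged illustration, not the Docutils writer: the names `UNITS` and `to_latex_length` are hypothetical, only the units seen in this thread are handled, and the `g` formatter is used because that is what the thread says Docutils uses.

```python
import re

# Factor and LaTeX target per unit. vw/vh map to the paper size,
# following the Docutils choice discussed in this thread.
UNITS = {
    "Q": (0.25, "mm"),  # quarter-millimetre
    "vw": (0.01, r"\paperwidth"),
    "vh": (0.01, r"\paperheight"),
    "%": (0.01, r"\linewidth"),
}


def to_latex_length(length: str) -> str:
    """Convert an rST length such as '50Q' or '100 vw' to a LaTeX length."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(\S+)\s*", length)
    if match is None:
        raise ValueError(f"cannot parse length: {length!r}")
    amount, unit = match.groups()
    if unit not in UNITS:
        return f"{amount}{unit}"  # e.g. mm, cm, in: passed through unchanged
    factor, target = UNITS[unit]
    return f"{float(amount) * factor:g}{target}"


for sample in ("50Q", "100 vw", "100 %", "30mm"):
    print(sample, "->", to_latex_length(sample))
```

Replacing `\paperwidth` by `\textwidth` in the table is all it would take to switch interpretation, so the open question is which meaning is right, not how hard it is to implement.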


The correct translation of % to LaTeX is a related but different problem. In HTML, % is percentage of the containing block. Regarding bugs #13661 and #13662 -- With Docutils, I get comparable results for the samples in HTML and PDF. I would call this the expected result and not a bug.

gmilde avatar Jun 13 '25 22:06 gmilde

I do not think it makes sense in LaTeX that image sizes would mysteriously shrink while the text font size is not modified. The HTML/CSS paradigm should not be followed too closely.

IMO we should follow CSS as closely as possible for three reasons:

a) it makes it easier to write documents that look "comparable" in HTML and PDF

b) it is easier to document -- the reStructuredText specification¹ explicitly refers to the CSS units and their definition.²

¹ https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#length-units
² the reference to CSS is "since ages", only the inclusion
  of CSS3 units is new in 0.22.

c) it is a sensible set of units for different purposes:

* fixed units (mm, pt, ...) for objects that do not "mysteriously" shrink,
* font-size based units (em, ex) for objects that grow/shrink with the font size,
* % for objects that take a certain proportion of their containing object,
* vh, vw, vmin, vmax for objects that take a certain proportion of the viewport or page.

gmilde avatar Jun 13 '25 22:06 gmilde

On 13.06.2025 15:06:18, Jean-François B. wrote:

+% TODO: decide if we want rather \textwidth/\textheight.
+\newdimen\sphinxvwdimen

In HTML/CSS the viewport can be fully occupied, not really in LaTeX ...

My simple test with Docutils results in just the opposite:

This test does not convince me because it applies only to images. Further, in document classes where the left margin width differs from the right margin width, the rescaled image will overflow the page limits on one side. You cannot apply this interface to more realistic structures because the makebox will force horizontal mode, which is not compatible with many LaTeX structures. I don't think Docutils has any interface to put the image anywhere outside the text area, does it?

What I wanted to say is that it is theoretically possible in HTML/CSS to place things anywhere on the viewport. For LaTeX/PDF output, the sole interface from a Sphinx user's point of view to go outside the text area is via footnotes. Other things such as headers and footers are handled via the document class, which is a one-shot feature.

The correct translation of % to LaTeX is a related but different problem. In HTML, % is percentage of the containing block. Regarding bugs #13661 and #13662 -- With Docutils, I get comparable results for the samples in HTML and PDF. I would call this the expected result and not a bug.

I did not understand if "comparable result" means "same as with Sphinx" or "HTML and PDF look about the same".

I do not think it makes sense in LaTeX that image sizes would mysteriously shrink while the text font size is not modified. The HTML/CSS paradigm should not be followed too closely.

IMO we should follow CSS as closely as possible for three reasons:

a) it makes it easier to write documents that look "comparable" in HTML and PDF

b) it is easier to document -- the reStructuredText specification¹ explicitly refers to the CSS units and their definition.²

¹ https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#length-units
² the reference to CSS is "since ages", only the inclusion of CSS3 units is new in 0.22.

c) it is a sensible set of units for different purposes:

  • fixed units (mm, pt, ...) for objects that do not "mysteriously" shrink,
  • font-size based units (em, ex) for objects that grow/shrink with the font size,
  • % for objects that take a certain proportion of their containing object,
  • vh, vw, vmin, vmax for objects that take a certain proportion of the viewport or page.

About "container object": contrary to what LaTeX people have said for more than 30 years now, LaTeX mark-up corresponds in no way to a structured document. For example, when you start a subsection you have absolutely no built-in way to access the "parent" section, then the "parent" chapter, etc.; the same goes for items in a list, and for basically everything you can think of. Otherwise the LaTeX team would not have such a hard time making tagging an "out-of-the-box" feature. So we have no notion of "container block" in LaTeX. The only one which comes to mind is the text area. LaTeX is not modeled on, nor does it contain (so far), any equivalent of a Docutils document tree.

Then there is also the fact that PDFs are either printed on paper or viewed on screen. I usually set my PDF viewer to auto-enlarge the view to focus on the contents. And what do PDF viewers propose then? To enlarge the view to the width of the text area, not the paper width.

Regarding the %, I have stressed in #13661 how \linewidth is a dangerous thing to use in LaTeX as it is not predictable in table cell contexts. And #13662 shows how \linewidth decreases in blockquotes or lists. There is absolutely no question in my mind that usage of \linewidth is a legacy bug, which is explained by ignorance of the facts I have been pointing out. Any one of \columnwidth, \textwidth, and \paperwidth is better. [^1]

About vh, vw, vmin, vmax: this PR chose to follow the Docutils LaTeX writer's choice and uses \paperwidth, \paperheight. So if the user changes the margins via a preamble addition, there will be an impact on the image sizes relative to the surrounding text (recall again that the only notion in real-life LaTeX of "surrounder" is the text area), whereas using \textwidth, \textheight could be more stable. But my opinion is not strong. The only thing I want to stress is that LaTeX predates CSS and there is nothing obvious in such decisions. It was very easy with the "classical" units, because they happen to be supported by TeX natively, with the slight problem of "pt" having another meaning. But even "px" does not exist natively in all LaTeX engines and is a relatively recent addition to PDFTeX, which, if I recall correctly, actually had a bug (perhaps a mismatch with its official documentation) that I reported upstream, though I have forgotten the details. But px is definitely also something whose translation into LaTeX is completely non-obvious, the upstream HTML/CSS "definition" being very far from a mathematically rigorous one to start with.

[^1]: edit I am much less assertive now, having pondered more upon the matter, and corrected a misconception I had about \linewidth in table cells.

jfbu avatar Jun 14 '25 07:06 jfbu

In HTML/CSS the viewport can be fully occupied, not really in LaTeX

... My simple test with Docutils results in just the opposite:

This test does not convince me because it applies only to images. Further, in document classes where the left margin width differs from the right margin width, the rescaled image will overflow the page limits on one side. You cannot apply this interface to more realistic structures because the makebox will force horizontal mode, which is not compatible with many LaTeX structures.

In both, HTML and LaTeX, there are many places where 100vw is not applicable, because the available space is less than the viewport.

Could we agree on the statement: the vw and vh units are of limited use in a reStructuredText document?

The "vw" and "vh" units are supported nevertheless, because

  • they are supported "out of the box" in the main export format HTML/CSS,
  • there was a feature request, so someone found a use case,
  • it is simpler to say "the rST parser supports CSS3 units" than "the rST parser supports CSS3 units with the exception ...".

Still, I don't think re-purposing these units in the LaTeX writer is right. In contrast to percentage values, the rST specification is very clear about their meaning.

I don't think Docutils has any interface to put the image anywhere out of the text area, does it?

Not out of the box. However, in both, HTML and (Docutils) LaTeX, this can be done with a class argument and definitions in a custom stylesheet.

gmilde avatar Jun 14 '25 15:06 gmilde

The correct translation of % to LaTeX is a related but different problem. In HTML, % is percentage of the containing block. Regarding bugs #13661 and #13662 -- With Docutils, I get comparable results for the samples in HTML and PDF. I would call this the expected result and not a bug.

I did not understand if "comparable result" means "same as with Sphinx" or "HTML and PDF look about the same".

The latter. In Docutils, HTML and PDF look about the same. (The main difference is that Docutils LaTeX tables default to :widths: grid (cf. Table options).)

I moved the detailed discussion to the respective issues.

gmilde avatar Jun 14 '25 15:06 gmilde

But px is definitely also something whose translation into LaTeX is completely non-obvious, the upstream HTML/CSS "definition" being very far from a mathematically rigorous one to start with.

The CSS definition changed from the idea to make this about perceived size (and hence dependent on the viewing distance) to a fixed value. In CSS3, 1 px = 3/4 pt = 1/96 in.
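As a worked check of that equality, here is a short Python sketch using exact fractions. One thing to keep in mind: the CSS pt is 1/72 in, which TeX calls bp, whereas TeX's own pt is 1/72.27 in. The constant names are made up for this example.

```python
from fractions import Fraction

CSS_PX_PER_INCH = 96                   # CSS3: 1px = 1/96 in
BP_PER_INCH = 72                       # CSS pt == TeX bp == 1/72 in
TEX_PT_PER_INCH = Fraction(7227, 100)  # TeX pt == 1/72.27 in

px_in_bp = Fraction(BP_PER_INCH, CSS_PX_PER_INCH)
px_in_tex_pt = TEX_PT_PER_INCH / CSS_PX_PER_INCH

print("1px =", px_in_bp, "bp")                  # 3/4 of a CSS point
print("1px =", float(px_in_tex_pt), "TeX pt")   # slightly more than 0.75
```

So a LaTeX length of `0.75bp` matches the CSS3 pixel exactly, while `0.75pt` would be about 0.4% too small.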

Unfortunately, this differs from the legacy definition in LaTeX. However, in LaTeX, the size of the pixel unit can be configured.

gmilde avatar Jun 14 '25 16:06 gmilde

But px is definitely also something whose translation into LaTeX is completely non-obvious, the upstream HTML/CSS "definition" being very far from a mathematically rigorous one to start with.

The CSS definition changed from the idea to make this about perceived size (and hence dependent on the viewing distance) to a fixed value. In CSS3, 1 px = 3/4 pt = 1/96 in.

This matches the high wisdom of the Sphinx maintainers: https://github.com/sphinx-doc/sphinx/blob/e1bd9cb3863cd1dfeaec9729dc6c842ef0f7a1f7/sphinx/builders/latex/constants.py#L77. The CSS guys did a good job ;-)

Unfortunately, this differs from the legacy definition in LaTeX. However, in LaTeX, the size of the pixel unit can be configured.

It can be configured in Sphinx too, via the 'pxunit' element whose default is quoted above. I quote at end of this the corresponding part of our LaTeX template.
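For readers who have not used it, overriding that element in a project's conf.py looks roughly like this. The `1bp` value is only an example (it makes 72 px equal one inch); it is not the default and is not a recommendation.

```python
# conf.py (fragment)
latex_elements = {
    # Length that one "px" stands for in image width/height options.
    # Example value only: 1bp means 72 px per inch.
    "pxunit": "1bp",
}
```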

Just to be completely precise, let me state that px is not a dimension provided by TeX/LaTeX itself; it is provided by the pdfTeX and LuaTeX engines. It is inherited by LaTeX when compiled with these engines. But, for example, it still does not exist in XeLaTeX:

$ rlwrap xelatex
This is XeTeX, Version 3.141592653-2.6-0.999997 (TeX Live 2025) (preloaded format=xelatex)
 restricted \write18 enabled.
**\relax
entering extended mode
LaTeX2e <2025-06-01>
L3 programming layer <2025-05-26>

*\message{\the\dimexpr 1px}
! Illegal unit of measure (pt inserted).
<to be read again> 
                   p
<*> \message{\the\dimexpr 1px
                             }
? X
No pages of output.
Transcript written on texput.log.

It exists with luatex engine:

$ rlwrap lualatex
This is LuaHBTeX, Version 1.22.0 (TeX Live 2025) 
 restricted system commands enabled.
**\relax
LaTeX2e <2025-06-01>
L3 programming layer <2025-05-26>

*\message{\the\dimexpr 1px}
1.00374pt
*\message{\number\dimexpr 1px}% size in "scaled points"
65781
*\stop

and pdflatex:

$ rlwrap pdflatex
This is pdfTeX, Version 3.141592653-2.6-1.40.28 (TeX Live 2025) (preloaded format=pdflatex)
 restricted \write18 enabled.
**\relax
entering extended mode
LaTeX2e <2025-06-01>
L3 programming layer <2025-05-26>

*\message{\the\dimexpr 1px}
1.00375pt
*\message{\number\dimexpr 1px}
65782
*\stop
No pages of output.
Transcript written on texput.log.

You can see it does not have the exact same value as in LuaLaTeX! One can do the following amusing experiment:

$ rlwrap pdflatex
This is pdfTeX, Version 3.141592653-2.6-1.40.28 (TeX Live 2025) (preloaded format=pdflatex)
 restricted \write18 enabled.
**\relax
entering extended mode
LaTeX2e <2025-06-01>
L3 programming layer <2025-05-26>

*\message{\the\dimexpr 1px-1bp}
0.00002pt
*\stop
No pages of output.
Transcript written on texput.log.

In fact the PDFTeX manual now says "The default value of \pdfpxdimen is 1.00001bp (for historical reasons) [...] to get precisely the same behavior in pdfTEX and LuaTEX, set \pdfpxdimen=1bp".
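The numbers in the transcripts above can be checked with a little arithmetic. The sketch below assumes that the bp to scaled-point conversion truncates, which reproduces the 65781 reported by LuaLaTeX; the pdfTeX value is simply taken from the log as one scaled point more. Variable names are made up for the example.

```python
from fractions import Fraction

SP_PER_PT = 65536                 # scaled points per TeX point
PT_PER_BP = Fraction(7227, 7200)  # 1bp = 72.27/72 pt

exact_sp = PT_PER_BP * SP_PER_PT  # 1bp in scaled points, before truncation
luatex_px = int(exact_sp)         # truncated (assumption, matches the log)
pdftex_px = luatex_px + 1         # value printed by pdflatex above

for name, sp in (("luatex", luatex_px), ("pdftex", pdftex_px)):
    print(name, sp, f"{sp / SP_PER_PT:.5f}pt")
print("difference:", f"{(pdftex_px - luatex_px) / SP_PER_PT:.5f}pt")
```

The one scaled point of difference, rendered with five decimals, is the 0.00002pt seen in the `1px-1bp` experiment.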

Apart from xelatex also uplatex does not have a native understanding of the px dimension.

At Sphinx we have been doing this in the latex template to account for all possibilities and to set the default to match CSS3: https://github.com/sphinx-doc/sphinx/blob/e1bd9cb3863cd1dfeaec9729dc6c842ef0f7a1f7/sphinx/templates/latex/latex.tex.jinja#L13-L15

jfbu avatar Jun 14 '25 16:06 jfbu

Now that Docutils 0.22, which supports CSS3 length units, has been released, I think we should incorporate this into our 8.3.0 release, so I changed the milestone. I contributed a few off-topic, unneeded details to this thread, so it is now long to reread, but I vaguely remember that some points needed a final decision. When I get time I will make a final review, change what is needed, and merge in time for 8.3.0.

jfbu avatar Aug 04 '25 17:08 jfbu

I have improved some comments and will now merge after testing completes. Thanks a lot @gmilde for your previous comments. I have kept the initial choices where \linewidth is used, or \paperwidth/\paperheight as per your approval. Maybe at a later stage we can try to let the conversion from float to fixed point understood by LaTeX use the same Python formatter as in DocUtils, for now I keep the legacy Sphinx way of doing this.

jfbu avatar Aug 06 '25 09:08 jfbu

MEMO: a collateral of this patch is that LaTeX tests are done now using DocUtils HEAD.

CC @AA-Turner should I revert this usage of Docutils HEAD for LaTeX testing? I prepared a branch to do this revert, but at the last minute hesitated to merge it. I figured the rationale for reverting, which is to avoid duplicate testing, would be better achieved by removing the specific "Docutils HEAD" test (which skips LaTeX builds), and I was hesitant to do that.

jfbu avatar Aug 06 '25 10:08 jfbu