sphinx icon indicating copy to clipboard operation
sphinx copied to clipboard

Support block-specific style overrides

Open hmedina opened this issue 9 months ago • 20 comments

Purpose

Allow local overrides of the Pygments style, on a per-code-block basis.

This work extends the directives for code-block, sourcecode, literalinclude and code, by introducing two new options, style-light and style-dark. As the app parses the document and discovers these, the builders collects the information. This allows Pygments to produce CSS or STY files (for HTML & LaTeX) for the relevant classes. These are moved from the initializing phase of the builder, to the finalizing phase.

Special attention was paid to light & dark modes. For most builders (e.g. LaTeX), this distinction is ignored, with only the light style having any effect; this mirrors how currently, the configuration value pygments_dark_style has no effect on such builders. For the HTML family builders, if the theme supports a dark style, then the pygments_dark.css file gains the appropriate selectors, independently of what happens to the pygments.css file for light styles. Since in the general sense, a user need not track the overrides for light style in the same fashion as for the dark style, the selectors are generated by hashing the docutils node. In a CPython implementation, this returns a pointer; as the docutils hierarchy is a tree, a pointer to the code-block appears sufficient for these purposes, as collisions should not happen; this said, I'm open to suggestions. The single-style builders, like LaTeX, use a simpler tracking for their command prefixes (used similarly as CSS selectors in the STY files).

The files have been linted with Ruff.

The new options are tested in the test suite. I'm unsure how to incorporate more advanced testing for other parts, so I welcome feedback (should I add tests to the builders?).

References

  • Discussion (with examples!) : https://github.com/orgs/sphinx-doc/discussions/13524
  • https://github.com/sphinx-doc/sphinx/issues/9105#issuecomment-821844520

hmedina avatar Jun 02 '25 23:06 hmedina

The issue with the pypy test concerns the usage in sphinx\highlighting.py l.236, where I pass a list of strings to the Pygments function.

This is valid use of said function html.py#L516, so I am unsure how to tell mypy about this; it appears to be confused and checking against the base class formatter.py, which uses arg='', instead of the HTML formatter, which uses arg=None

hmedina avatar Jun 03 '25 03:06 hmedina

All the tests are passing, the documentation is updated, and there's no visible issues left, so I don't have any more work planned on this. If there's something missing, or there's concerns, I'm happy to work / address them; any feedback would be welcome. As I mention in the Discussion linked in the first post, I have a project that requires this functionality

hmedina avatar Jun 11 '25 22:06 hmedina

Thanks for the feedback!

Sanitize LaTeX-hostile names

I had not considered that Python names can be very invalid LaTeX macro elements. For testing, I installed locally a LaTeX-hostile style called _-_, which SetupTools & Pygments handle just fine. This led me to realize that there's no reason why the name property in a pygments.style class must match whatever was declared in the entry point... So I refactored to use the user-written value internally / on the Python side, and only use a "sanitized" version when producing LaTeX code, for .sty or .tex files. That way errors (e.g. key errors) yield something the user would know about, as it appears in their .rst files.

For sanitization, since LaTeX characters can be moved around categories, a general "is this character safe" determination can apparently only be done at compile time... so I sanitize things into a-zA-Z range, replacing other characters with a Z. The _-_ style thus becomes ZZZ.

The namedefs got: \@namedef{PYGZZZ@tok@w}{\def\PYGZZZ@tc##1{\textcolor[rgb]{0.73,0.73,0.73}{##1}}} while the other defs got: \def\PYGZZZZbs{\text\textbackslash}

This at least seems to behave (and not break) the LaTeX compilation and yielded a valid PDF in my testing.

How to deal with line-breaking?

However, the issue with the line-breaks remains. I'm not familiar enough with TeX macros, but it seems to me that what's missing is the functionality declared for the vanilla PYG macros, in https://github.com/sphinx-doc/sphinx/blob/e1bd9cb3863cd1dfeaec9729dc6c842ef0f7a1f7/sphinx/texinputs/sphinxlatexliterals.sty#L561-L650

Could we regain the functionality with some macro aliasing? Unless I'm misunderstanding, the TeX let could allow something like \PYGZZZZti behave like \PYGZti. Are the break-friendly characters those defined in highlighting.py/_LATEX_ADD_STYLES ?

hmedina avatar Jun 14 '25 09:06 hmedina

The namedefs got: \@namedef{PYGZZZ@tok@w}{\def\PYGZZZ@tc##1{\textcolor[rgb]{0.73,0.73,0.73}{##1}}} while the other defs got: \def\PYGZZZZbs{\text\textbackslash}

This at least seems to behave (and not break) the LaTeX compilation and yielded a valid PDF in my testing.

Yes it will work. I did not check the commit per se but indeed all you need to do is to pass a sanitized command prefix to Pygments, like using the Z's as you do.

For the info I have right now succeeded to accomodate passing to Pygments `f'PYG;{style};' as command prefix. This needs some non-trivial TeX macro layer but is related to your next point.

How to deal with line-breaking?

However, the issue with the line-breaks remains. I'm not familiar enough with TeX macros, but it seems to me that what's missing is the functionality declared for the vanilla PYG macros, in

Yes this is thorny TeX coding there. \spx@verb@wrapPYG@i does an \ifx check t osee if next is \PYG. The test could still be positive if say it is \PYGautumn as long as \PYGautumn has "same meaning" as \PYG. The simple approach will not have that because we do want the final PDF output to be style specific. This is why I wanted a new mark-up which would not have the style name embedded in the macro. I have now completed a solution based upon \PYG;stylename; mark-up, all is fine via suitable catcode changes in the style sheets and exploiting that ; is "catcode active" in Verbatim, but I need to update now \spx@verb@wrapPYG@i and akin obscure TeX code in sphinxlatexliterals.sty. Perhaps once I will have re-examined closely I will see that finally I can also solve this sticking to the Z approach. I am starting thing of things but the issue here is that Pygments uses the command prefix also for some special mark-up (say \PYGZbs). Ok, I think I see a way which would make my one-hour work on getting a workable approach with ; go to trash. But then you should use W not Z as replacement letter. no but I need to think a bit more.

jfbu avatar Jun 14 '25 10:06 jfbu

How to deal with line-breaking?

However, the issue with the line-breaks remains. I'm not familiar enough with TeX macros, but it seems to me that what's missing is the functionality declared for the vanilla PYG macros, in

https://github.com/sphinx-doc/sphinx/blob/e1bd9cb3863cd1dfeaec9729dc6c842ef0f7a1f7/sphinx/texinputs/sphinxlatexliterals.sty#L561-L650

Could we regain the functionality with some macro aliasing? Unless I'm misunderstanding, the TeX let could allow something like \PYGZZZZti behave like \PYGZti. Are the break-friendly characters those defined in highlighting.py/_LATEX_ADD_STYLES ?

The above quoted TeX definition is not related to \PYGZti and alike escapes (for the latter see https://github.com/sphinx-doc/sphinx/pull/13611#discussion_r2146819239). By the way, the break-friendly characters indeed includes them but also non-escaped ones (\sphinxbreaksatactive). As per the quoted code above its aim to replace mark up such as \PYG{k}{def} by, in effect, \PYG{k}{d}\PYG{k}{e}\PYG{k}{f}. The argument def can itself contain in reality some nested \PYG so it is not obvious.

We will be able to solve the problem of compatibility with verbatimforcewraps in the custom blokcs in the following manner: we will arrange for the sphinxVerbatim environment to access the implemented Pygments style, for example by inserting \def\sphinxpygmentsstyle{<whatever>} at suitable location. In the data ending up in sphinxhighlight.sty in the LaTeX build directory all included styles will have the same top definition \protected\def\PYGautumn{\spx@PYG{\sphinxpygmentsstyle}}, \protected\def\PYGcolorful{\spx@PYG{\sphinxpygmentsstyle}}, etc, for each encountered one, including the default \protected\def\PYG{...}. This way the \ifx\spx@nexttoken\PYG test in the current \spx@verb@wrapPYG@i will be positive. In this manner the problem will be solved with no adapatation whatsoever to sphinxlatexliterals.sty but we need a small extra in visit_literal_block to inject the \def\sphinxpygmentsstyle{<whatever>} or perhaps inject it inside the option to Verbatim. Anyway, this is easier to do in a second stage. We have to be careful about some extensions or tools (nbconvert comes to mind) that it will be transparent to them.

Things would be easier if we could control more what Pygments outputs in the LaTeX stylesheet, and the output of its latex formatter, but we will manage at Sphinx level.

jfbu avatar Jun 14 '25 12:06 jfbu

@hmedina I have implemented my remarks to achieve full LaTeX support at https://github.com/jfbu/sphinx/tree/multistyle.

I think at some places your docstrings are a bit long and should be hardwrapped for shorter line lengths. I am not too happy about using twice and not only once md5, but it works. I checked the wrapping of long codelines inclusive of verbatimforcewraps=true inside 'sphinxsetup' value as key of latex_elements and all works perfectly. As indicated in a TODO comment perhaps at a later stage some slight change will be made. I may also consider passing the sanitized stylename rather as option to \fvset as the way hllines is handled. Maybe I will refactor a bit the way the sanitized stylename is passed over to the sphinxVerbatim. But it does work at this stage. By the way I observed some breakage in PDF builds for the parent commit (current HEAD of this branch) with longer code-blocks than I had been using so far, so we have to be careful to test this with long code blocks. Which I did at my locale.

Anyway, do you mind if I push that commit to this branch?

jfbu avatar Jun 15 '25 13:06 jfbu

@jfbu hey there! I do not mind at all if you push your work to this branch; I'm very happy the full feature set is kept.

Once you merge into this PR, I'll take a look at the docstrings; I kept them at the Ruff setting (95 characters?), but if a shorter length is preferable I'll shorten them, just let me know to what length.

As for the MD5 hashing, it might be appropriate to replace the two calls and create a utility function just for providing "sanitized" LaTeX macro names; there's some functions in sphinx/util/texescape.py, a file that's already imported on both the sphinx/writers/latex.py and sphinx/highlighting.py; if that sounds like a good idea, I can do that tweak.

For the PDF breaking, what kind of break are you seeing? I just tried with the ~270 lines of highlighting.py and I got no errors, nor visible issues with the PDF

Edit: a couple make clean later, I'm seeing an issue with long code blocks; they're missing in the .tex file

hmedina avatar Jun 16 '25 20:06 hmedina

@jfbu hey there! I do not mind at all if you push your work to this branch; I'm very happy the full feature set is kept.

This is done. I will need to investigate why the RTD build test fails. (Maybe this will be a bit later as I have tasks ; the same commit on my clone passed all tests).

Once you merge into this PR, I'll take a look at the docstrings; I kept them at the Ruff setting (95 characters?), but if a shorter length is preferable I'll shorten them, just let me know to what length.

I personnally have a preference for 80 or even 78 charactes for line lengths in docstrings, but maybe we don't enforce it here. Maybe wait for a more global review of this PR. Things may need some attention like referring to builders from the LaTeX writer.

As for the MD5 hashing, it might be appropriate to replace the two calls and create a utility function just for providing "sanitized" LaTeX macro names; there's some functions in sphinx/util/texescape.py, a file that's already imported on both the sphinx/writers/latex.py and sphinx/highlighting.py; if that sounds like a good idea, I can do that tweak.

Yes another approach which I had considered avoids hashing altogether. It simply enumerates all the encountered specialized block styles, and output them as a string acceptable to LaTeX in command names i.e. only a-zA-Z with the numeric index 0, 1, ... converted to a suitable ascii-letter representation. The problem is that we have to make sure the prefix chosen will have no clash with existing style names.

In my Python virtual environment I have about 70 Pygments styles and I checked the md5 truncated to 6 hexadecimals were unique. Using md5 solves our problem and keeps the produced sanitized tex names to a controlled length.

For the PDF breaking, what kind of break are you seeing? I just tried with the ~270 lines of highlighting.py and I got no errors, nor visible issues with the PDF

I see

! Missing \endcsname inserted.
<to be read again> 
                   \futurelet 
l.113 ...a}\PYGcolorful{l+s+s2}{\PYGcolorfulZdq{}}

with HEAD at 7fab81e97741e26a015478b1d34f015fdfab5639 and my test index.rst which is a bit long, and anyhow all is fine with the new HEAD of this PR at 4b09baa72ef2bd63e0c3cefadc0374f22a9d7eff.

Edit: a couple make clean later, I'm seeing an issue with long code blocks; they're missing in the .tex file

Can you test with the new HEAD at 4b09baa72ef2bd63e0c3cefadc0374f22a9d7eff?

jfbu avatar Jun 17 '25 07:06 jfbu

Testing failed apparentely because CI could not fetch HEAD after my push. No idea why and can't investigate for now.

jfbu avatar Jun 17 '25 07:06 jfbu

@hmedina I have merged master with was without conflicts (done by Ort strategy I did not look too closely if merge was sane). I was eager to see if CI testing would succeed and it seems to turn out good (attow only CI/Windows has not yet completed). Some glitch happened earlier for unknown reason at the commit I contributed prior.

jfbu avatar Jun 17 '25 11:06 jfbu

Yes another approach which I had considered avoids hashing altogether. It simply enumerates all the encountered specialized block styles, and output them as a string acceptable to LaTeX in command names i.e. only a-zA-Z with the numeric index 0, 1, ... converted to a suitable ascii-letter representation. The problem is that we have to make sure the prefix chosen will have no clash with existing style names.

Now that we let Pygments use only 'PYG' prefix in the actual LaTeX, we can also use @ for the command prefix which will be specific to the stylessheets. The stylesheets all end up in .sty file which Sphinx loads by \RequirePackage so we can use @ as letter in LaTeX commands; but Pygments somewhat strangely inserts \makeatletter/makeatother (hence in _LATEX_ADD_STYLES we also have too), because it is very old code with very old legacy choices, in that case to allow using \input in place of \usepackage/\RequirePackage, which makes sense if other TeX formats than LaTeX are targeted, except that \makeatletter is a LaTeX only command so its insertion is self-defeating.

Anyhow what matters is that we can use @ now that such command prefix is not used in the produced .tex file but only in the accessory stylesheets.

So a possibility avoiding the md5 method would to enumerate all specialized styles and assign same sanitized names of the stype 'foo@' for the first one, 'foo@@' for the second, 'foo@@', foo@@@, etc... (where foo is anything for which we are sure \PYGfoo@ macro exists nowhere already, aaa should be fine) No Pygments style I know of uses @ character anyhow in its name so we are certain there will be no collision. Besides, when LaTeX will encounter \def\sphinxpygmentsstylename{foo@...@} it will assign to these @'s catcode other, but that does not matter because the \sphinxpygmentsstylename is expanded inside \csname...\endcsname (see _LATEX_ADD_STYLES in highlighting.py) and there @ can be either "letter" or "other", both work.

The difference though with this is that we have to pass the information to visit_literal_block of what is the index of that block stylename, so that the number of @ (at least 1) to use is known.

jfbu avatar Jun 17 '25 12:06 jfbu

Sorry there, my 4b09baa72ef2bd63e0c3cefadc0374f22a9d7eff had a botched revert of changes in _LATEX_ADD_STYLES, the removal of {override} forgot to account for the fact that {{ and }} all have to revert to { and } now that this string is not destined to be used with .format. So this left some extra curly braces and in particular the lines

\def\PYGZob{{\{{}}
\def\PYGZcb{{\}}}}

were very wrong with the result that \PYGZob was defined with strange meaning to TeX {\{{}} \def \PYGZcb {{\}}}... oops... (somewhat miraculously the lines did not cause a PDF crash as \PYGZcb already had a defintion, only some extra space character ending up in PDF...).

I have corrected this oversight at the third commit I pushed here (https://github.com/sphinx-doc/sphinx/pull/13611/commits/27b5239189ebcb41b704b89ae85cf07375b27f5d). I know those commits might disappear as we will need at one point to rebase on master for the merge to be feasible.

jfbu avatar Jun 17 '25 13:06 jfbu

Hey there! Seems to be working mostly, however I noticed the background colors, for LaTeX builds, only appear for the main/default style; none of the local overrides have it (try with a default style of python_docs_theme and a block of :style-light: nord for an immediately visible issue). Looking at the .sty file, I only see the \colorbox being set for the very first PYG@tok@cs, none of the subsequent ones have that (they only get a \textcolor). It seems this may be an issue in Pygments itself (see pygments/pygments#1054 ), as the command-line pygmentize also doesn't return background color values...

The HTML builder does not suffer from this, apparently because everything got a class selector with either .highlight or the .c[hash] element.

Left: HTML build. Right: LaTeX build.

left: HTML build, right: LaTeX build

As for using @ in style names, SetupTools will not refuse such a name (when quoted), so styles with that character in them definitely can be installed and used. This said, I'm not sure there's any character that would be a safe delimiter, so if there's no general solution, I agree this usage seems good enough as it covers all the stock Pygments themes.

I'm going to try a couple work-around for the background color issue, and try to do some code cleanup. Anything else while I'm at it?

hmedina avatar Jun 17 '25 23:06 hmedina

The issue about background colors is a Pygments issue. Their output makes zero provision for setting a background color. The explanation is probably than until very recently the target LaTeX package fancyvrb had no interface for that (it is the one providing the Verbatim environment one sees in Pygments output, but we use sphinxVerbatim which is a sophisticated layer on top of it adding features), or it had something with faulty output. I think there is now better support (no time nor interest to check now details). But we at Sphinx would not want to use that, we have all that is needed to set a background color correctly. The only issue here is to get somehow the information of what is the background color expected for the block by the style. Once we have it we can enforce it very easily. Here is a temporary user level work around where I see an arbitrarily chosen background color:

.. raw:: latex

   \sphinxsetup{VerbatimColor=black!80}

.. code-block::
   :style-light: nord

   def get_stylesheet(self, selectors: list[int] | str | None = None) -> str:
       """Return a string with the specification for the tokens yielded by the language lexer, appropriate for the output formatter, using the style defined at initialization. In an HTML context, `selectors` is a list of CSS class selectors. In a LaTeX context, it modifies the command prefix used for macro definitions; see also LaTeXBuilder.add_block_style()
       """
       formatter = self.get_formatter()
       if isinstance(formatter, HtmlFormatter):
           if selectors:
               return formatter.get_style_defs(['.c{}'.format(s) for s in selectors])  # type: ignore [no-untyped-call]
           else:
               return formatter.get_style_defs('.highlight')  # type: ignore [no-untyped-call]
       else:
           if selectors:
               if isinstance(selectors, str):
                   _tex_name = md5(selectors.encode()).hexdigest()[:6]
                   for d, l in [('0', 'G'), ('1', 'H'), ('2', 'I'), ('3', 'J'),
                                ('4', 'K'), ('5', 'L'), ('6', 'M'), ('7', 'N'),
                                ('8', 'O')]:
                       _tex_name = _tex_name.replace(d, l)

renders to PDF (using :style-light: nord) as: Capture d’écran 2025-06-18 à 08 58 20

(note that above code-block for example the docstring is modified from original to test specific things; besides it was missing ('9', 'P') and ruff forced me into another syntax later on).

The background color you see in non especially styled blocks is not the one from the pygments_style, it is a default of Sphinx set via the VerbatimColor key of the 'sphinxsetup' key of the latex_elements conf value. See https://www.sphinx-doc.org/en/master/latex.html#the-sphinxsetup-configuration-setting.

VerbatimColor

The background color for code-blocks.

Default: {RGB}{242,242,242} (same as {gray}{0.95}).

Changed in version 6.0.0: Formerly, it was {rgb}{1,1,1} (white).

About

.. raw:: latex

   \sphinxsetup{VerbatimColor=black!80}

one has to reuse it after the code-block to reset another color. Of course, if we implement this at Sphinx level, we would prefer locate the change inside the LaTeX environment. Which however is not trivial due to actual structure of visit_literal_block(), for example for setting the local style in this PR I opted temporarily to passing the information from outside the sphinxVerbatim and canceling it afterwards. At some point in future this may need refactoring to make it in a nicer way.

I'm going to try a couple work-around for the background color issue, and try to do some code cleanup. Anything else while I'm at it?

Sorry for late reply but I recommend you don't devote energy to this as part of this PR, you can do it in a separate PR if you have the motivation. I don't see anything else regarding LaTeX/PDF until other maintainers check the structure of how information is exchanged between builder and writer.

jfbu avatar Jun 18 '25 07:06 jfbu

copied pasted from earlier now hidden review

As per the handling of dark-mode, it could be possible to do it, I have actually done it for a project using Sphinx, this requires modifying the Verbatim background color which can be done by \sphinxsetup anywhere inside the document, something such as (with the choices I made for the colors)

\sphinxsetup{%
    pre_background-TeXcolor={RGB}{40,42,54},% #282a36 aka VerbatimColor
    pre_TeXcolor={RGB}{248,248,242},% #f8f8f2
    VerbatimHighlightColor=VerbatimColor!75,
}

I also needed to add pygments_style = pygments_dark_style in the conf.py.

It is needed to set the default color for text adequately, as the above choices do, so that un-highlighted tokens show on dark background.

The issue for us is only to get from the Pygments style the info about what it wants to set as background color (and ideally default text color) for the framed block.

\textcolor LaTeX command will serve nothing to us in such context.

jfbu avatar Jun 18 '25 07:06 jfbu

The issue for us is only to get from the Pygments style the info about what it wants to set as background color (and ideally default text color) for the framed block.

The Pygments.Style class has a background_color attribute: https://github.com/pygments/pygments/blob/94dda77d69a6d6c47c33f06ce2425e7f306154a2/pygments/style.py#L170-L174

As for the default text color, assuming there's one, that would inherit from the Token key in the styles attribute (in Pygments, all tokens inherit from this class). An annotation in that string like bg:#fff would signify a background color for those characters.

Not sure how to translate the values into viable latex, but for example, modifying the final part of LaTeXBuilder.write_stylesheet()

if self.specialized_highlighters:
    specialized_styles = []
    for style_name, pyg_bridge in self.specialized_highlighters.items():
        specialized_style = '\n% Stylesheet for style {}'.format(style_name)
        specialized_style += pyg_bridge.get_stylesheet(style_name)
        print('For style {}:'.format(style_name))
        if pyg_bridge.get_style(style_name).background_color is not None:
            bc = pyg_bridge.get_style(style_name).background_color.lstrip('#')
            bc_rgb = '{:.2f},{:.2f},{:.2f}'.format(*[int(bc[i:i+2], 16)/255 for i in (0, 2, 4)])
            print('\tgeneral background color was: {} -> {}'.format(bc, bc_rgb))
            from pygments.token import Token
            base_style = pyg_bridge.get_style(style_name).styles[Token]
            if base_style:  # could look like 'italic #000 bg:#ffffff'
                match = re.match(r'#([0-9a-fA-F]{3,6})(?:\s+bg:#([0-9a-fA-F]{3,6}))?', base_style)
                text_color_rgb = '{:.2f},{:.2f},{:.2f}'.format(*[int(match.group(1)[i:i+2], 16)/255 for i in (0, 2, 4)])
                print('\tdefault text color was: {} -> {}'.format(match.group(1), text_color_rgb))
                if match.group(2):
                    text_back_rgb = '{:.2f},{:.2f},{:.2f}'.format(*[int(match.group(2)[i:i+2], 16)/255 for i in (0, 2, 4)])
                    print('\tdefault text background color was: {} -> {}'.format(match.group(2), text_back_rgb))
        specialized_styles.append(specialized_style)
    f.write('\n'.join(specialized_styles))

My local testing gets this in output:

processing sphinxmultistylesample.tex: done
writing... done
For style lovelace:
        general background color was: ffffff -> 1.00,1.00,1.00
For style xcode:
        general background color was: ffffff -> 1.00,1.00,1.00
For style solarized-light:
        general background color was: fdf6e3 -> 0.99,0.96,0.89
        default text color was: 657b83 -> 0.40,0.48,0.51
For style nord:
        general background color was: 2E3440 -> 0.18,0.20,0.25
        default text color was: d8dee9 -> 0.85,0.87,0.91
Writing evaluated template result to /home/hmedina/sphinx_multistyle_sample_project/_build/latex/sphinxmessages.sty
build succeeded.

What would be the right way of translating this into background colors? I tried something with the pyg@toc@tc line's use of \colorbox but didn't get much...

NB: I guess both text-color and background-color could be optional, so the if match.group should be more rigorous and check both separately.

hmedina avatar Jun 19 '25 06:06 hmedina

(I will be offline for some time so quick reply here)

What would be the right way of translating this into background colors? I tried something with the @.***@tc line's use of \colorbox but didn't get much...

You can't set it from any of the mark-up set in the stylesheet. The way to proceed is for the Sphinx latex.py writer to inject a suitable \sphinxsetup setting VerbatimColor. Check my recent comments. A second \sphinxsetup will need to reset the original. VerbatimColor, I can take care of that later on as this may need some small add-on to sphinx.sty.

By the way using RGB not rgb you can use integer values from 0 to 255 for each component and this is better than rgb with decimals. And you can simply use {HTML}{6 hexadecimal digits}, which may be the simplest.

jfbu avatar Jun 19 '25 06:06 jfbu

I have implemented following on your comment LaTeX/PDF support for background_color and default color for non-highlighted tokens. I will push to your branch once tests pass, feel free of course to revert or improve. And test. Which I don't have much time for, these days.

jfbu avatar Jun 20 '25 13:06 jfbu

@hmedina I have rebased on top of current master (ultimately if merged, it will have to be rebased anyhow). I resolved conflicts as I could, and checked that the final result tree was exactly identical with what one gets from merging current master (at c384ab96c328c3a) into this branch (this merge has no conflicts), apart from an ordering of items in CHANGES.rst. You could fetch the rebased branch from my jfbu/multistyle_rebased branch and then force push it here. This way commits are sort of preserved (I simply removed a pair consisting of a commit immediately followed by its revert). The alternative is to checkout the current branch after having merged current master, then reset the git HEAD to master, then make a single commit which will thus have the exact correct tree but this way one would lose an indication of the fact that we both contributed... (not to mention loss of all commit messages)

jfbu avatar Jun 20 '25 20:06 jfbu

Ok, I'll rebase and push here as you suggest. I'm currently traveling, so I'll get back to it next week.

On Fri, Jun 20, 2025, 16:06 Jean-François B. @.***> wrote:

jfbu left a comment (sphinx-doc/sphinx#13611) https://github.com/sphinx-doc/sphinx/pull/13611#issuecomment-2992667794

@hmedina https://github.com/hmedina I have rebased on top of current master (ultimately if merged, it will have to be rebased anyhow). I resolved conflicts as I could, and checked that the final result tree was exactly identical with what one gets from merging current master (at c384ab9 https://github.com/sphinx-doc/sphinx/commit/c384ab96c328c3abc91b39c9e3e527863df30af1) into this branch (this merge has no conflicts), apart from an ordering of items in CHANGES.rst. You could fetch the rebased branch from my jfbu/multistyle_rebased branch https://github.com/jfbu/sphinx/tree/multistyle_rebased and the force push it here. This way commits are sort of preserved (I simply removed a pair consisting of a commit immediately followed by its revert). The alternative is to checkout the current branch after having merged current master, then reset the git HEAD to master, then make a single commit which will thus have the exact correct tree but this way one would lose an indication of the fact that we both contributed...

— Reply to this email directly, view it on GitHub https://github.com/sphinx-doc/sphinx/pull/13611#issuecomment-2992667794, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMD2SP4WJXU25MAMJRTLXL3ERSWHAVCNFSM6AAAAAB6OHDEZGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDSOJSGY3DONZZGQ . You are receiving this because you were mentioned.Message ID: @.***>

hmedina avatar Jun 21 '25 04:06 hmedina

Hey there, I'm a bit confused; I did not understand your suggestion. Since both your branch and this PR's have commits in common, specifically the head commit, cherry-picking from your head's (e5f0ba1a5) to this PR's (37fabbe) yields an empty / no-changes commit; so I must be misunderstanding your suggestion. I'm not that familiar with using Git in such a collaborative space, so a more verbose explanation would be appreciated.

In any case, both jfbu/sphinx/multistyle_rebased and this PR's master branch are currently out-of-date and require updating with the latest commits from the sphinx/sphinx-doc/master branch; I'm fine rebasing this PR's branch to keep the history more linear, and that should preserve our respective commits (and their commit messages)

hmedina avatar Jun 24 '25 01:06 hmedina

Process note: please don't rebase. Instead, cherry-pick or merge.

General note: I apologise for not having time to review, but I am not entirely convinced of the need for this feature. As far as I can tell, it hasn't been requested before, and the use-case here is for both a custom/private language and a custom (non-Pygments) lexer. This introduces a fair amount of complexity whilst we're undertaking a large refactoring project to reduce inter-linking in the internals of Sphinx.

Please respond to the general feedback in the discussion issue so that we can keep the PR to just discussing the implementation, the rationale & wider design proposal can stay in the issue.

A

AA-Turner avatar Jun 24 '25 02:06 AA-Turner

About some of the ToDos in sphinx/highlighting.py from @jfbu

Background color but no text color

Two styles that define a background color but do not assign a text color to the root Token are manni and gruvbox-light. Neither has invisible text; both appear to behave as expected, with black text: image

Pros & Cons of forcing white backgrounds to gray

While I agree the light-gray background looks better, I would prefer to respect the theme's choice of background. Some text colors may have been chosen with that white background in mind, and swapping it to mild-gray will reduce the contrast, rendering the text possibly difficult to read. Some people may even use styles with white backgrounds to avoid ink waste while printing

Token & its background color

The root token from Pygments can have a bg:#123 color specified. Both it and Style.background_color appear to alter the same core behavior, but the former overrides the latter. This said, none of the stock Pygments styles use a background color for the root Token, and since its effects appear better set via Style.background_color, I believe we wont have to deal with it; i.e. there's no need to do something with the second capturing group from https://github.com/hmedina/sphinx/blob/37fabbeb38f6f977db688003a273ece44d6c42d5/sphinx/highlighting.py#L313

hmedina avatar Jun 24 '25 06:06 hmedina

Hey there, I'm a bit confused; I did not understand your suggestion [...] so a more verbose explanation would be appreciated.

Sorry about that. At that time the GitHub interface indicated that this branch could not be rebased before merge without conflicts being resolved first. Hence, as previous reviews were by now obsolete, I thought about rebasing the whole thing, which would have preserved commits attributes. Apart from the usual conflicts in CHANGES.rst, there were a few other conflicts which could not be resolved automatically. I practiced the exercice at jfbu/sphinx/multistyle_rebased, and an option for you was simply to hard reset your branch to that branch and force push here. Anyway as per @AA-Turner advice we can simply forget that for now.

  • I obviouly need to freshen up on what is our PR merge habits, I was under the impression we basically squashed and rebased everything when merging to master, to keep a purely linear history.
  • Following up on @AA-Turner's "Please respond to the general feedback in the discussion issue" I since identified the associated discussion which I was not aware of. To the extent replying to your latest comment appear to belong there, this is where I will continue this.

jfbu avatar Jun 24 '25 07:06 jfbu

so that we can keep the PR to just discussing the implementation

It would be interesting at any rate for it to become possible for an extension to provide the feature. Regarding the PDF via LaTeX output, let me recapitulate briefly (... I try but rarely succeed) the specifics which needed attention:

  • Pygments provides a "LaTeX stylesheet" for each style, which tells its mark-up how to behave (say use boldface or italics, use a given color, etc...). For multiple styles to co-exist (and excluding the option to re-execute the stylesheet at each code-block), we need the LaTeX command names to embed suitably the stylename, or alternatively a unique integer (which will be converted to a-zA-Z letters). The stylename often contains a dash or an hyphen which are not allowed. So the option was taken to use md5 and keep the first 6 hexadecimal digits.
  • When LaTeX will render a given code-block it will have to know the stylename and the recipe used for converting it into a prefix for LaTeX commands. With the md5 approach, it is only a matter to compute it locally.
  • The above issue about LaTeX command prefixes being solved, to fully support the feature we only needed to find a way to make two pieces of information available to the final LaTeX code:
  1. the background color,
  2. the default color for non-especially highlighted color. This latter information is not part of the standard Pygmentize output for LaTeX, it is needed especially for dark styles, if some tokens are not given a specific color, because default black in LaTeX would be sort of invisible then.[^1]
  • The solution for this transfer of information was to extend the stylesheets with the data, encapsulated in LaTeX commands having the md5 determined command prefix. The LaTeX code rendering the code-block, which knows the md5, will now query whether the stylesheets did the corresponding definition and if yes it will use it.

The actual Pygments mark-up using command \PYG (contents of the "sphinxVerbatim" environement) remains unaltered. This was important to not break the internal mechanism for the automatic softwrapping or hardwrapping of long codelines and to have it work out of the box with no change whatsoever in the existing LaTeX support sty file (sphinlatexliterals.sty).

Some testing remains to be done to check things are ok with captions, and with the continuation hints at each pagebreak, in case of long code-blocks.

In some project using Sphinx were I had to enrich the code-blocks rendering with extra output in PDF, I used a class inheriting from LaTeXTranslator and defined there

    def visit_literal_block(self, node):
        try:
            super().visit_literal_block(node)
        except nodes.SkipNode:
            self.append_our_own_extrastuff(node)
            raise nodes.SkipNode

But this worked because the feature only needed to append some stuff immediately after the code-block default rendering. In the case of this PR, we need to both prepend and append, the latter to revert local changes to background colors. It would be possible to make this easier and have only to prepend, if Sphinx LaTeX environment (here sphinxVerbatim only) had some kinds of hooks where commands can be added and executed within the environment. LaTeX since some years has this built-in (here that would be \AddToHook{env/begin/sphinxVerbatim}), but we have to wait some more years before Linux distros all have recent enough LaTeX.

[^1]: I have observed in the past such invisible characters when trying out dark styles for PDF output, but I have not had the time to devote to recover a reproducer to convince there was indeed some things to be done here.

jfbu avatar Jun 24 '25 20:06 jfbu

Le 24 juin 2025 à 22:17, Jean-François B. @.***> a écrit :

So the option was taken to use md5 and keep the first 6 hexadecimal digits.

and convert 0-9 to letters G-P as digits are not allowed either there.

jfbu avatar Jun 25 '25 06:06 jfbu