
Option to skip adding `<pre><code>` to highlighted code

Open andersk opened this issue 1 year ago • 5 comments

Context

When using the highlight option to provide a custom syntax highlighter, markdown-it-py wraps the HTML output of the highlighter in <pre><code> unless it already starts with <pre:

https://github.com/executablebooks/markdown-it-py/blob/73a01479212bfe2aea0b995b4d13c8ddca2e4265/markdown_it/renderer.py#L270

But that heuristic fails for pygments.highlight, whose output does not begin with <pre:

>>> import pygments, pygments.lexers, pygments.formatters
>>> pygments.highlight('print("hello")', pygments.lexers.get_lexer_by_name("python"), pygments.formatters.HtmlFormatter())
'<div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><span class="s2">&quot;hello&quot;</span><span class="p">)</span>\n</pre></div>\n'

So markdown-it-py turns this into <pre><code><div class="highlight"><pre>…</pre></div></code></pre>, and existing CSS themes for Pygments need to be rewritten to account for the unnecessarily duplicated <pre>.
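To make the failure mode concrete, here is a minimal sketch of the check as described above. It is a simplified stand-in, not the library's code: `wrap_like_fence` is a hypothetical name, and the real renderer also adds attributes such as the language class to `<code>`, which this sketch omits.

```python
def wrap_like_fence(highlighted: str) -> str:
    """Simplified model of the prefix check (hypothetical helper, not library code)."""
    if highlighted.startswith("<pre"):
        # Highlighter output that already starts with <pre is passed through.
        return highlighted + "\n"
    # Anything else gets wrapped, even if it contains its own <pre> further in.
    return f"<pre><code>{highlighted}</code></pre>\n"


# Shortened stand-ins for highlighter output (not real Pygments output).
already_pre = "<pre>print</pre>"
pygments_like = '<div class="highlight"><pre><span></span>print</pre></div>\n'

print(wrap_like_fence(already_pre))    # passed through unchanged
print(wrap_like_fence(pygments_like))  # wrapped a second time
```

The second call shows the nested `<pre>` from the example above: the output begins with `<div`, so the prefix check does not recognize it as already wrapped.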

Proposal

Can we have an option to skip adding <pre><code> that’s not subject to the heuristic?

Tasks and updates

No response

andersk commented on Mar 18 '23 08:03

Thanks for opening your first issue here! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).
Welcome to the EBP community! :tada:

welcome[bot] commented on Mar 18 '23 08:03

An alternative could be to subclass RendererHTML and override fence():

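from collections.abc import Sequence

from markdown_it.common.utils import escapeHtml, unescapeAll
from markdown_it.renderer import RendererHTML
from markdown_it.token import Token
from markdown_it.utils import EnvType, OptionsDict


class CustomRendererHTML(RendererHTML):
    def fence(self, tokens: Sequence[Token], idx: int, options: OptionsDict, env: EnvType) -> str:
        token = tokens[idx]
        # The first word of the fence's info string is the language name.
        info = unescapeAll(token.info).strip() if token.info else ''
        langName = info.split(maxsplit=1)[0] if info else ''

        if options.highlight:
            # Return the highlighter's HTML as-is, without adding <pre><code>.
            return options.highlight(
                token.content, langName, ''
            ) or escapeHtml(token.content)

        return escapeHtml(token.content)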

Then this class can be selected to get the desired behavior: markdown_it.MarkdownIt(renderer_cls=CustomRendererHTML)
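For anyone who wants to try this end to end, here is a sketch that pairs the override with Pygments. It assumes markdown-it-py and Pygments are installed; `pygments_highlight` is a hypothetical helper name, and wrapping the fallback in `<pre><code>` is my own addition so that fences without a usable lexer still render as code blocks.

```python
from collections.abc import Sequence

from markdown_it import MarkdownIt
from markdown_it.common.utils import escapeHtml, unescapeAll
from markdown_it.renderer import RendererHTML
from markdown_it.token import Token
from markdown_it.utils import EnvType, OptionsDict
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name
from pygments.util import ClassNotFound


class CustomRendererHTML(RendererHTML):
    """Same idea as the fence() override above, plus a wrapped fallback."""

    def fence(self, tokens: Sequence[Token], idx: int, options: OptionsDict, env: EnvType) -> str:
        token = tokens[idx]
        info = unescapeAll(token.info).strip() if token.info else ""
        lang_name = info.split(maxsplit=1)[0] if info else ""

        highlighted = None
        if options.highlight:
            highlighted = options.highlight(token.content, lang_name, "")
        if highlighted:
            # Trust the highlighter's own wrappers; add nothing around them.
            return highlighted
        # Assumption of this sketch: wrap only when nothing was highlighted.
        return f"<pre><code>{escapeHtml(token.content)}</code></pre>\n"


def pygments_highlight(code: str, lang: str, attrs: str):
    """Hypothetical highlight callback; returns None when no lexer matches."""
    if not lang:
        return None
    try:
        lexer = get_lexer_by_name(lang)
    except ClassNotFound:
        return None
    return highlight(code, lexer, HtmlFormatter())


md = MarkdownIt("commonmark", {"highlight": pygments_highlight}, renderer_cls=CustomRendererHTML)
html = md.render('```python\nprint("hello")\n```\n')
plain = md.render("```\nx < 1\n```\n")
print(html)
print(plain)
```

With this setup the highlighted block keeps the Pygments `<div class="highlight">` wrapper and a single `<pre>`, while a fence with no language falls back to an escaped `<pre><code>` block.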

pydsigner commented on Jul 25 '23 06:07

Upvoting this because I ran into the same problem. Also, the "highlight" option is almost undocumented. Maybe it needs more attention.

ZeroAurora commented on Nov 24 '23 18:11

I just implemented this, and it seems OK:

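from markdown_it import MarkdownIt
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name
from pygments.styles import get_style_by_name
from pygments.util import ClassNotFound

pygments_style = get_style_by_name('catppuccin-mocha')


def highlight_func(code: str, lang: str, _) -> str | None:
    """Highlight function using pygments."""
    if not lang:
        return None

    try:
        lexer = get_lexer_by_name(lang)
    except ClassNotFound:
        # Unknown language: let markdown-it-py fall back to escaped text.
        return None
    # nowrap=True: emit only the highlighted spans, with no <div> or <pre>.
    formatter = HtmlFormatter(style=pygments_style, noclasses=True, nowrap=True)
    return highlight(code, lexer, formatter)


md = MarkdownIt('js-default', {'highlight': highlight_func}).enable('table')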

nowrap=True tells Pygments not to add any <div> or <pre> wrappers, so it lets markdown-it-py add them instead.
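A quick way to see the difference, as a sketch assuming Pygments is installed (the variable names are just for illustration):

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name

lexer = get_lexer_by_name("python")
source = 'print("hello")'

# Default formatter: output is wrapped in <div class="highlight"><pre>...</pre></div>.
wrapped = highlight(source, lexer, HtmlFormatter())

# nowrap=True: only the token <span> elements are emitted.
bare = highlight(source, lexer, HtmlFormatter(nowrap=True))

print(wrapped)
print(bare)
```

The bare output contains no `<pre>` at all, so markdown-it-py's prefix check wraps it in `<pre><code>` exactly once.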

dimitrilarue commented on Jul 22 '24 21:07

@dimitrilarue No, that’s the opposite of what I need. The extra classes from Pygments such as <div class="highlight"> are important for styling, so I need to be able to tell markdown-it-py not to generate its own wrappers.

andersk commented on Jul 22 '24 21:07