
add Markdown.postwalk and prewalk

Open stevengj opened this issue 1 year ago • 6 comments

This PR adds Markdown.postwalk and prewalk functions for Markdown.MD expressions based on the corresponding functions in MacroTools.jl. This lets you do simple transformations of Markdown expressions without knowing much about the internals of the Markdown module.

For example, with

julia> txt = md"*Italic* and *bold* with `const code` and [link text](url)."

which produces:

Italic and bold with const code and link text.

You can transform txt to uppercase with:

julia> postwalk(x -> x isa String ? uppercase(x) : x, txt)

which produces

ITALIC AND BOLD WITH const code AND LINK TEXT.

Note that only "plain text" appears as String arguments to the postwalk function, not code or URLs or other text with a technical meaning. You can also see how the postwalk function traverses the expression with, for example:

julia> postwalk(x -> (@show x; x), txt);
x = "Italic"
x = Markdown.Italic(Any["Italic"])
x = " and "
x = "bold"
x = Markdown.Italic(Any["bold"])
x = " with "
x = Markdown.Code("", "const code")
x = " and "
x = "link text"
x = Markdown.Link(Any["link text"], "url")
x = "."
x = Markdown.Paragraph(Any[Markdown.Italic(Any["Italic"]), " and ", Markdown.Italic(Any["bold"]), " with ", Markdown.Code("", "const code"), " and ", Markdown.Link(Any["link text"], "url"), "."])
x = *Italic* and *bold* with `const code` and [link text](url).
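To make the traversal order above concrete, here is a minimal sketch of a post-order walk over a handful of Markdown node types. It is not the implementation in this PR: `walk_post` is a hypothetical name, it covers only the node types that occur in `txt`, and it assumes the field names seen in the output above (`content`, `text`, `url`) plus the `meta` field of `Markdown.MD`.

```julia
using Markdown

# Hypothetical helper, NOT the PR's code: rebuild each node from its
# already-walked children, then apply `f` to the rebuilt node (post-order).
walk_post(f, x) = f(x)  # leaves: plain strings, Markdown.Code, and any unhandled node
walk_post(f, v::AbstractVector) = Any[walk_post(f, c) for c in v]
walk_post(f, md::Markdown.MD) = f(Markdown.MD(walk_post(f, md.content), md.meta))
walk_post(f, p::Markdown.Paragraph) = f(Markdown.Paragraph(walk_post(f, p.content)))
walk_post(f, i::Markdown.Italic) = f(Markdown.Italic(walk_post(f, i.text)))
# The link label is walked, but the URL is passed through untouched,
# so `f` never sees it as a String.
walk_post(f, l::Markdown.Link) = f(Markdown.Link(walk_post(f, l.text), l.url))

txt = md"*Italic* and *bold* with `const code` and [link text](url)."
upper = walk_post(x -> x isa String ? uppercase(x) : x, txt)
```

Because `Markdown.Code` is treated as a leaf node rather than as a `String`, the uppercase transformation leaves `const code` alone, which matches the "plain text only" behavior described above.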

My motivation here is that I really wanted to lower the "heading level" when displaying docstrings in IJulia (https://github.com/JuliaLang/IJulia.jl/issues/766 … see also #22870). That transformation is impossible to implement without a postwalk-like function, and writing one requires lots of knowledge of the internals of the Markdown module, so it seems to me that this belongs in the Markdown package itself.
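As a rough illustration of the heading-level transformation, the sketch below demotes headers by hand. `lower_headings` is a hypothetical helper, not part of the Markdown stdlib or of this PR, and it only looks at top-level blocks and nested `Markdown.MD` objects; headers inside lists, block quotes, or admonitions are left alone. It also shows the kind of internals (the `Header{level}` type parameter, the `content` and `meta` fields) that such code has to rely on.

```julia
using Markdown

# Hypothetical helper: demote every header by `by` levels, capped at level 6.
lower_headings(x; by=1) = x  # any other block is returned unchanged
function lower_headings(h::Markdown.Header{l}; by=1) where {l}
    return Markdown.Header{min(l + by, 6)}(h.text)
end
function lower_headings(md::Markdown.MD; by=1)
    # Recurse into nested MD objects; other containers are not handled here.
    return Markdown.MD(Any[lower_headings(c; by=by) for c in md.content], md.meta)
end

doc = Markdown.parse("# Title\n\nSome text\n\n## Sub")
demoted = lower_headings(doc)
```

A general-purpose walk would remove the need to enumerate container types by hand, which is the gap this PR is aimed at.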

Currently, prewalk and postwalk are not exported (using them effectively still might require some knowledge of internals, e.g. of the specific node type you want to transform).

Needs tests and NEWS, but I wanted to check what people think first.

stevengj • Apr 24 '24 00:04

> postwalk and prewalk functions for Markdown.MD expressions based on the corresponding functions in MacroTools.jl

Imagine if we had totally general "walks" in Julia that don't require manual reimplementation for any specific case – be it Markdown or Expr...

Oh wait!

We actually have! [image]

:)

Of course, here we are talking about Base, and suggesting packages is off-topic, but I still find it nice.

aplavin • Apr 24 '24 15:04

> [image]

This is a different result than the one I showed: you are recursively modifying all strings, even if they are code, LaTeX formulas, URLs, or other things with a technical meaning. To process only the user's "plain text", you need a tree walk that knows more about the semantics of the Markdown data structures.

stevengj • Apr 24 '24 16:04

Totally, I'm not saying that generic walks are an immediate solution to all problems. The cleanest way here would be to have a PlainText type in Markdown trees, or something like that. Otherwise, whether you use generic walks or the MD-specific ones from this PR, you cannot directly say "uppercase all text outside of link labels". While technically possible, this would require a fancy pre-walk with placeholders.

Anyway, I personally don't have anything against these specialized methods here. No way around that if they are needed by something in Base.

aplavin • Apr 24 '24 17:04

Regardless, I think we should have postwalk/prewalk functionality in the Markdown package itself.

stevengj • Apr 24 '24 17:04

See also MarkdownAST.jl, which could be an alternative for IJulia.

fredrikekre • Apr 24 '24 20:04

Yes, I would also recommend converting Markdown -> MarkdownAST and then using AbstractTrees to walk. I don't think we want to add any more features to the Markdown stdlib. We should really consider it deprecated.
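A minimal sketch of that route, assuming the MarkdownAST.jl and AbstractTrees.jl packages are installed. The calls used here (`convert(MarkdownAST.Node, md)`, the `element` property of a node, the `Heading` and `Text` element types, and `AbstractTrees.PostOrderDFS`) reflect my understanding of those packages' documented interfaces and should be checked against the installed versions.

```julia
using Markdown
import MarkdownAST, AbstractTrees

md = Markdown.parse("# Title\n\nSome *plain* text with `code`.")

# Convert the stdlib representation into a MarkdownAST tree.
ast = convert(MarkdownAST.Node, md)

# Collect the nodes first so the traversal is not affected by the edits below.
for node in collect(AbstractTrees.PostOrderDFS(ast))
    el = node.element
    if el isa MarkdownAST.Heading
        # Demote headings by one level, capped at level 6.
        node.element = MarkdownAST.Heading(min(el.level + 1, 6))
    elseif el isa MarkdownAST.Text
        # Only plain-text elements are touched; inline code is a separate element type.
        node.element = MarkdownAST.Text(uppercase(el.text))
    end
end
```

Here the distinction between plain text and code comes from the element types of the tree rather than from a Markdown-specific walk function.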

mortenpi • Apr 24 '24 23:04

> We should really consider it deprecated.

That's not going to happen in the foreseeable future (i.e. pre-Julia 2.0), because it's used by default for docstrings, and in general is used extensively throughout the whole Julia ecosystem. As such, it would be a huge mistake IMO to treat the Markdown stdlib as if it were abandonware.

stevengj • May 27 '24 13:05

> That's not going to happen in the foreseeable future (i.e. pre-Julia 2.0), because it's used by default for docstrings

That's why we can't remove it pre-2.0. But we can deprecate it, -ish. Anything new can be built on top of MarkdownAST, which can losslessly convert from the Markdown stdlib representation, and already provides treewalking etc. via AbstractTrees, and is in general easier to contribute to and extend, since it's not tied to Julia releases etc. And any existing users of the Markdown stdlib should transition away from directly using it. (My two cents.)

mortenpi • May 27 '24 22:05