RFC - Extension API
Allow users to inject functions into the pipeline to transform Markdown and HTML.
Suppose you want to transform all headings into H1 and add a class topic to each of them. You could transform the Markdown to inject HTML, or transform the generated HTML to rewrite the h tags, i.e. do all the transformations in either the Markdown phase or the HTML phase. That is not ideal because each phase has its own rules and semantics. So we want to provide a unified API where you can inject each transformation function into the phase where it makes the most sense. In this example the API would look like:
markdown = """
# Get Started
## Install
"""
update_headings_to_level_1 = fn pipeline ->
  tree =
    MDEx.find_and_update(pipeline.tree, "heading", fn
      {"heading", [{"level", _}]} ->
        {"heading", [{"level", 1}]}

      other ->
        other
    end)

  %{pipeline | tree: tree}
end
set_topic_class_h1 = fn pipeline ->
  tree =
    MDEx.find_and_update(pipeline.tree, "h1", fn
      {"h1", _} ->
        {"h1", [{"class", "topic"}]}

      other ->
        other
    end)

  %{pipeline | tree: tree}
end
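# Both steps are anonymous functions bound to variables, so they are passed
# directly rather than captured with &name/1.
pipeline =
  MDEx.new(markdown: markdown)
  |> MDEx.append_md_steps(update_headings_to_level_1: update_headings_to_level_1)
  |> MDEx.append_html_steps(set_topic_class_h1: set_topic_class_h1)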
Executing this pipeline results in:
MDEx.run(pipeline)
# <h1 class="topic">Get Started</h1>
# <h1 class="topic">Install</h1>
If you're familiar with Floki, Req, and MDX you'll feel at home.
In MDX you can add plugins to the Markdown phase (remark plugins) or to the HTML phase (rehype plugins). The idea here is the same, but it uses the Req style of manipulating the pipeline with functions, and the AST format comes from Floki so we can have a unified API for both Markdown and HTML.
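To make the Floki-style node shape a bit more concrete, here is a minimal sketch of a traversal over `{tag, attrs, children}` tuples. `AstSketch` and `update_nodes/3` are hypothetical names used only for illustration; they are not part of MDEx or Floki, and the three-element tuple is an assumption borrowed from Floki's HTML tree.

```elixir
defmodule AstSketch do
  # Hypothetical helper, not an MDEx or Floki function.
  # Walks a list of Floki-style nodes and applies `fun` to every node
  # whose tag equals `tag`; all other nodes are kept as they are.
  def update_nodes(nodes, tag, fun) when is_list(nodes) do
    Enum.map(nodes, &update_node(&1, tag, fun))
  end

  # Element node: recurse into the children first, then maybe transform.
  defp update_node({name, attrs, children}, tag, fun) do
    rebuilt = {name, attrs, update_nodes(children, tag, fun)}
    if name == tag, do: fun.(rebuilt), else: rebuilt
  end

  # Text leaves (binaries) pass through untouched.
  defp update_node(text, _tag, _fun) when is_binary(text), do: text
end

tree = [
  {"heading", [{"level", 1}], ["Get Started"]},
  {"heading", [{"level", 2}], ["Install"]}
]

# Force every heading to level 1, keeping its children.
updated =
  AstSketch.update_nodes(tree, "heading", fn {"heading", _attrs, children} ->
    {"heading", [{"level", 1}], children}
  end)
```

Because the same tuple shape could describe both the Markdown tree and the HTML tree, one helper like this would serve both phases, which is the point of borrowing Floki's format.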
Names are subject to change.
I'd suggest an implementation akin to how Plug and other composable pipeline "things" in Elixir work, where you provide either a function that accepts and returns a struct, or a module that implements a behaviour with a callback or two.
A first stab at a loose behaviour could be something akin to
defmodule MDEx.Extension do
  @doc "Transforms that occur on the markdown phase"
  @callback pre(pipeline :: term()) :: {:ok, term()} | {:error, term()}

  @doc "Transforms that occur on the HTML phase"
  @callback post(pipeline :: term()) :: {:ok, term()} | {:error, term()}

  defmacro __using__(_opts) do
    quote location: :keep do
      @behaviour MDEx.Extension

      @impl MDEx.Extension
      def pre(pipeline), do: {:ok, pipeline}

      @impl MDEx.Extension
      def post(pipeline), do: {:ok, pipeline}

      defoverridable MDEx.Extension
    end
  end
end
You'd then use it like so
defmodule MyExtension do
  use MDEx.Extension

  @impl MDEx.Extension
  def pre(%{tree: tree} = pipeline) do
    tree
    |> MDEx.find_and_update("heading", fn
      {"heading", [{"level", _}]} ->
        {"heading", [{"level", 1}]}

      other ->
        other
    end)
    |> then(&{:ok, %{pipeline | tree: &1}})
  end

  @impl MDEx.Extension
  def post(%{tree: tree} = pipeline) do
    tree
    |> MDEx.find_and_update("h1", fn
      {"h1", _} ->
        {"h1", [{"class", "topic"}]}

      other ->
        other
    end)
    |> then(&{:ok, %{pipeline | tree: &1}})
  end
end
And register it with MDEx in a manner akin to
MDEx.new(markdown: markdown)
|> MDEx.append_extension(MyExtension)
Note that if you pass a function instead of a module, it would just call the function. Since a single function has no pre or post callback, there would need to be a way to tag which step it applies to.
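One way the function tagging could work, sketched under my own assumptions: a bare function is registered as `{:pre, fun}` or `{:post, fun}`, while a module is called through its callbacks. `ExtensionSketch`, `apply_extension/3`, and `run_phase/3` are hypothetical names, and the pipeline is a plain map here just to keep the example self-contained.

```elixir
defmodule ExtensionSketch do
  # Hypothetical dispatcher; the phase names :pre and :post follow the
  # callbacks proposed above, everything else is an assumption.

  # A module is expected to export pre/1 and post/1.
  def apply_extension(pipeline, phase, module)
      when is_atom(module) and phase in [:pre, :post] do
    apply(module, phase, [pipeline])
  end

  # A bare function carries its phase as a tag and only runs when the
  # tag matches the phase being executed.
  def apply_extension(pipeline, phase, {phase, fun}) when is_function(fun, 1) do
    fun.(pipeline)
  end

  # Tagged for another phase: leave the pipeline unchanged.
  def apply_extension(pipeline, _phase, {_other, fun}) when is_function(fun, 1) do
    {:ok, pipeline}
  end

  # Runs every extension for one phase, stopping at the first error.
  def run_phase(pipeline, phase, extensions) do
    Enum.reduce_while(extensions, {:ok, pipeline}, fn ext, {:ok, acc} ->
      case apply_extension(acc, phase, ext) do
        {:ok, next} -> {:cont, {:ok, next}}
        {:error, _reason} = error -> {:halt, error}
      end
    end)
  end
end

defmodule UpcaseTitle do
  def pre(pipeline), do: {:ok, Map.update!(pipeline, :title, &String.upcase/1)}
  def post(pipeline), do: {:ok, pipeline}
end

add_flag = fn pipeline -> {:ok, Map.put(pipeline, :flagged, true)} end
extensions = [UpcaseTitle, {:post, add_flag}]

{:ok, after_pre} = ExtensionSketch.run_phase(%{title: "install"}, :pre, extensions)
{:ok, after_post} = ExtensionSketch.run_phase(after_pre, :post, extensions)
```

The tagged tuple keeps a single registration list for both kinds of extension, at the cost of a slightly noisier call site for plain functions.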
I'll be keeping an eye on this, as I'll probably want to make a clone of the functionality in my djot repo
Hey @paradox460, thanks for sharing your thoughts! I've been thinking about this API, and what I currently have in mind is similar to your proposal but uses plain functions instead of a module. The design is very close to Req.Request. I won't say it's identical because there are a few fundamental differences, the main one being that we have parse and format steps as opposed to request and response steps. Parse steps receive the Markdown AST and must return a transformed AST at the end of the pipeline, while format steps output that AST in a friendly format such as HTML, XML, LiveView, and others. That means we can't assume the output is always HTML.
So it's definitely based on Req's API, which has the huge benefit of being an API people are already used to working with, so the barrier to writing and using plugins in that format is lower. I talked to Wojtek about reusing his code, and he was kind enough to allow it and was supportive of the idea.
Rendering Markdown to HTML with Mermaid graphs would look like:
html =
  MDEx.new(
    markdown: """
    graph TD;
    A-->B;
    """,
    extension: [autolink: true]
    # may pass other options from https://hexdocs.pm/mdex/0.1.18/MDEx.html#to_html/2-options
    # probably need to register those options too
  )
  |> MDEx.Mermaid.attach() # from package :mdex_mermaid to be created yet
  |> MDEx.HTML.attach()

IO.puts(html)
And the plugins:
defmodule MDEx.Mermaid do
  @moduledoc """
  Injects Mermaid JS and renders mermaid code blocks
  """

  @required_opts [
    render: [unsafe_: true],
    features: [sanitize: false]
  ]

  def attach(%MDEx.Pipe{} = pipe, opts \\ []) do
    pipe
    |> MDEx.Pipe.register_options([:mermaid_version])
    |> MDEx.Pipe.merge_options(opts)
    |> MDEx.Pipe.merge_options(@required_opts) # still not sure the best approach to handle required opts
    |> MDEx.Pipe.append_parse_steps(load_mermaid: &load_mermaid/1)
  end

  defp load_mermaid(parse) do
    # pretty much the same code as https://github.com/leandrocp/mdex/blob/5987418685e87f7ef85babd945416306b56c6536/examples/mermaid.exs#L40-L60 but with a couple changes:
    # Use `options[:mermaid_version]` to load specific version or defaults to latest
    # Only transform `code_blocks` where literal == "mermaid"
    # ... return transformed AST
  end
end
defmodule MDEx.HTML do
  @moduledoc """
  Render Markdown AST as HTML
  """

  def attach(%MDEx.Pipe{} = pipe, _opts \\ []) do
    pipe
    |> MDEx.Pipe.append_format_steps(to_html: &to_html/1)
  end

  defp to_html(pipe) do
    Map.put(pipe, :output, MDEx.to_html(pipe.parse, pipe.options))
  end
end
And the struct holding everything together:
defmodule MDEx.Pipe do
  defstruct [:options, :parse_steps, :format_steps, :output, :halted, :private]
end
I'm not so sure about the name %MDEx.Pipe{} though, it feels too generic.
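For what it's worth, here is a rough sketch of how running such a struct could work: parse steps first, then format steps, with `halted` short-circuiting the rest. `PipeSketch` is a hypothetical stand-in for the struct above, and the `ast` field, `new/1`, and `run/1` are assumptions of mine, not part of the proposal.

```elixir
defmodule PipeSketch do
  # Stand-in for the proposed struct; `ast` is an assumed extra field
  # that holds the document while the steps run.
  defstruct ast: nil, options: [], parse_steps: [], format_steps: [], output: nil, halted: false

  def new(ast), do: %__MODULE__{ast: ast}

  # Steps are keyword lists of name: fun, appended in registration order.
  def append_parse_steps(pipe, steps), do: %{pipe | parse_steps: pipe.parse_steps ++ steps}
  def append_format_steps(pipe, steps), do: %{pipe | format_steps: pipe.format_steps ++ steps}

  # Runs parse steps first, then format steps. Each step receives and
  # returns the struct; setting `halted: true` skips the remaining steps.
  def run(pipe) do
    Enum.reduce_while(pipe.parse_steps ++ pipe.format_steps, pipe, fn {_name, step}, acc ->
      case step.(acc) do
        %__MODULE__{halted: true} = halted -> {:halt, halted}
        %__MODULE__{} = next -> {:cont, next}
      end
    end)
  end
end

# A list of strings stands in for the real Markdown AST.
upcase = fn pipe -> %{pipe | ast: Enum.map(pipe.ast, &String.upcase/1)} end
join = fn pipe -> %{pipe | output: Enum.join(pipe.ast, "\n")} end

result =
  PipeSketch.new(["get started", "install"])
  |> PipeSketch.append_parse_steps(upcase: upcase)
  |> PipeSketch.append_format_steps(join: join)
  |> PipeSketch.run()
```

Keeping the format step as just another function is what lets the output be something other than HTML: swapping `join` for a different formatter changes the output without touching the parse steps.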
I'm currently playing with this, and I want to modify the approach in mermaid.exs to inject the JavaScript only on pages where Mermaid is present. This isn't possible with the current implementation of MDEx.traverse_and_update/2, but it would be useful if the extension API and/or traverse_and_update/2 could update the options or some other state (private?) so that decisions like this could be made based on what is actually present rather than on what is assumed might be present.
The above is correct for MDEx.traverse_and_update/2, but MDEx.Traversal.traverse_and_update/3 will do this, although it is not documented and not exposed from the top level.
@halostatue I'm almost done with some improvements on the API that will expose traverse_and_update with an accumulator (acc argument) and also an implementation of Access and Enumerable protocols to let you query/search the AST more easily.
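To illustrate the accumulator idea, a minimal sketch of a traversal that rewrites nodes while recording whether a mermaid block was seen. This is not the MDEx implementation: `TraverseSketch`, the Floki-style `{tag, attrs, children}` nodes, the `"info"` attribute, and the `"mermaid.min.js"` name are all assumptions made for the example.

```elixir
defmodule TraverseSketch do
  # Hypothetical traversal with an accumulator. `fun` receives a node and
  # the accumulator and returns {updated_node, updated_acc}.
  def traverse_and_update(nodes, acc, fun) when is_list(nodes) do
    Enum.map_reduce(nodes, acc, &walk(&1, &2, fun))
  end

  # Element node: visit the children first, then the node itself.
  defp walk({tag, attrs, children}, acc, fun) do
    {children, acc} = traverse_and_update(children, acc, fun)
    fun.({tag, attrs, children}, acc)
  end

  # Text leaves pass through without touching the accumulator.
  defp walk(text, acc, _fun) when is_binary(text), do: {text, acc}
end

tree = [
  {"heading", [{"level", 1}], ["Diagram"]},
  {"code_block", [{"info", "mermaid"}], ["graph TD; A-->B;"]}
]

{tree, state} =
  TraverseSketch.traverse_and_update(tree, %{mermaid: false}, fn
    {"code_block", [{"info", "mermaid"}], children}, acc ->
      {{"pre", [{"class", "mermaid"}], children}, %{acc | mermaid: true}}

    other, acc ->
      {other, acc}
  end)

# Only inject the script when a mermaid block was actually seen.
scripts = if state.mermaid, do: ["mermaid.min.js"], else: []
```

With the state returned alongside the tree, a later step can decide whether to add the script instead of always adding it.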
This seems like something I could use as well.