Markdown output

Open jkrumbiegel opened this issue 2 years ago • 6 comments

I've thought about using Pollen to rewrite a markdown source into another markdown source, for use with tools such as Docusaurus or MkDocs. Currently, markdown output is not implemented, but I think the shortest path to get it would be to make a CommonMark.jl AST out of the tree structure again to use its printing functionality.

My idea is to handle Makie's plot generation / code execution with Pollen.jl and then let other tools handle the static site generation. It's just too much work for most projects to exhaustively handle all that website stuff as I've seen working on the Makie docs, so I think it would be good to attempt that kind of split.

jkrumbiegel avatar Sep 17 '22 15:09 jkrumbiegel

Markdown output is definitely something that should be in the library :+1:

However, I think implementing a render method for MarkdownFormat that dispatches on the markdown tags may be easier than converting to a CommonMark.jl AST, and easier to customize, unless you're already familiar with how to construct those ASTs. This also gives more flexibility in how to render rich data like images, e.g. making it easy to dump the image data into a file and render a markdown link to it.
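A minimal sketch of what "dispatching on the markdown tags" could look like. Everything here is a toy stand-in and an assumption for illustration: the `Tree`, `Leaf` and `Node` types, the `mknode` helper and the `render` function are hypothetical and are not Pollen.jl's actual types or its `render!` API. The idea is only that each tag gets its own small method, so adding or customizing a tag means adding one method.

```julia
# Toy tree types for illustration only; NOT Pollen.jl's Node/Leaf API.
abstract type Tree end

struct Leaf <: Tree
    value::String
end

struct Node <: Tree
    tag::Symbol
    children::Vector{Tree}
    attributes::Dict{Symbol,String}
end

# Hypothetical convenience constructor: children as positional args, attributes as keywords.
mknode(tag::Symbol, children...; attrs...) =
    Node(tag, Tree[children...], Dict{Symbol,String}(k => string(v) for (k, v) in attrs))

# Leaves render as their text; nodes dispatch on their tag via Val.
render(leaf::Leaf) = leaf.value
render(n::Node) = render(Val(n.tag), n)

# Render all children and concatenate the results.
inner(n::Node) = join(render(child) for child in n.children)

# One method per tag.
render(::Val{:h1}, n) = "# " * inner(n) * "\n\n"
render(::Val{:p}, n) = inner(n) * "\n\n"
render(::Val{:strong}, n) = "**" * inner(n) * "**"
render(::Val{:img}, n) = "![" * get(n.attributes, :alt, "") * "](" * n.attributes[:src] * ")"

# Fallback for unknown tags: render the children only.
render(::Val, n) = inner(n)

doc = mknode(:md,
    mknode(:h1, Leaf("Plots")),
    mknode(:p, Leaf("A "), mknode(:strong, Leaf("scatter")), Leaf(" plot: "),
           mknode(:img; src = "/static/scatter.jpg", alt = "scatter")))

print(render(doc))
```

The fallback method keeps unknown tags from erroring, which is one reason this style can be easier to extend than building a second AST.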

Feeding Pollen.jl projects into different frontends is very much in the spirit of the library, by the way, I just haven't gotten to any others yet :)

lorenzoh avatar Sep 17 '22 19:09 lorenzoh

Cool, OK. Maybe it's not that difficult to write out the markdown manually. I thought we could save some effort by using CommonMark, but its AST is not a normal tree, so it would be a bit annoying to set up.

> This also gives more flexibility on how to render rich data like images, e.g. making it easy to dump the image data in a file and render a markdown link to it.

Regarding that, I didn't quite understand when the different steps happen. There's rewritesources! but also build. I would have thought that executing code and replacing with results is a rewriting step but it seems to be a build thing?

jkrumbiegel avatar Sep 19 '22 08:09 jkrumbiegel

Regarding the different stages:

  • rewritedoc is a pure function that is applied to each document individually, returning a new document. Prefer it where possible, since it can be parallelized and lets the project rebuild incrementally when a single document changes
    • ExecuteCode uses this to replace code blocks with code blocks + outputs
  • rewriteoutputs! is applied after multiple documents have been rewritten and can take all of them into account. That is sometimes necessary, for example when building a graph of links between documents.
  • build is then run whenever rewriteoutputs! is run, but is specific to the output format. The default here is FileBuilder, which writes a file for every document in a project. It should contain only functionality that is output-format-specific; all rewriting of documents should be done in the earlier stages.
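The split between the stages can be sketched with a toy model. This assumes a "document" is just a String keyed by an id; the names `rewrite_one`, `link_graph` and `build_files` are hypothetical and are not Pollen.jl functions. It only illustrates why the per-document step can stay pure while the cross-document step and the format-specific build run afterwards.

```julia
# Toy model, not Pollen.jl's API: documents are plain strings keyed by id.

# Stage 1: pure per-document rewrite. It sees one document and returns a new one,
# so it could run in parallel or be redone for a single changed document.
rewrite_one(text::String) = replace(text, "{{version}}" => "1.0")

# Stage 2: cross-document pass. It needs all rewritten documents at once,
# here to build a graph of [[links]] between documents that exist.
function link_graph(docs::Dict{String,String})
    graph = Dict{String,Vector{String}}()
    for (id, text) in docs
        targets = String[String(m.captures[1]) for m in eachmatch(r"\[\[(\w+)\]\]", text)
                         if haskey(docs, m.captures[1])]
        graph[id] = sort(targets)
    end
    return graph
end

# Stage 3: output-format-specific build. Only here is the target directory known.
function build_files(docs::Dict{String,String}, dir::AbstractString)
    mkpath(dir)
    for (id, text) in docs
        write(joinpath(dir, id * ".md"), text)
    end
    return dir
end

sources = Dict(
    "index" => "Version {{version}}, see [[guide]].",
    "guide" => "Back to [[index]].",
)

outputs = Dict(id => rewrite_one(text) for (id, text) in sources)
graph = link_graph(outputs)
dir = build_files(outputs, mktempdir())
```

Nothing in the first stage touches the file system, which is what keeps it safe to parallelize.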

lorenzoh avatar Sep 23 '22 16:09 lorenzoh

So in which of those would saving to image files (from code execution) happen, and which one would be good for a compression pass and assembling a gallery page?

jkrumbiegel avatar Sep 23 '22 16:09 jkrumbiegel

Saving to image files should be done in the build step, since only then do you know the directory you are building to. But you could look for all :coderesults with an image MIME available during the rewritedoc stage, store them in a dict and replace the nodes with Markdown media links, then use the postbuild hook to actually write the image files to the build directory.

lorenzoh avatar Sep 23 '22 17:09 lorenzoh

Something along these lines:


using Pollen, FileIO
using UUIDs: uuid4

# Collects image values found in code results until they are written out
struct MarkdownImages <: Rewriter
    images::Dict
end

function Pollen.rewritedoc(rewriter::MarkdownImages, docid, doc)
    cata(doc, SelectTag(:coderesult)) do node
        val = only(children(node))[]
        if showable(MIME("image/jpeg"), val)
            # Random unique id used as the file name
            id = uuid4()
            # Store the value
            rewriter.images[id] = val

            # Insert link
            return withchildren(node, [Node(:img, src = "/static/$id.jpg")])
        end
        # Leave results without a JPEG representation untouched
        return node
    end
end

function Pollen.postbuild(rewriter::MarkdownImages, _, builder::FileBuilder)
    # Make sure the target directory exists before saving
    mkpath(joinpath(builder.dir, "static"))
    for (id, val) in collect(rewriter.images)
        FileIO.save(joinpath(builder.dir, "static/$id.jpg"), val)
        # only need to save these once. if they change they will be readded
        delete!(rewriter.images, id)
    end
end

The only thing missing now is the render! implementation for MarkdownFormat, which needs to handle these :img nodes.

Also, this rewriter will of course have to be inserted after the ExecuteCode rewriter, otherwise it will not find any :coderesult tags and not do anything.

lorenzoh avatar Sep 23 '22 17:09 lorenzoh