Support for pandoc integration

Open josevalim opened this issue 3 years ago • 7 comments

To support pandoc integration, and eventual conversion to LaTeX and other document formats, we need to:

  1. [ ] Do not discard front matter from Markdown
  2. [ ] Allow metadata on Kino.render/2. This will be stored as part of the HTML comments
  3. [ ] Persist tables as part of the output

Then someone can write a tool that preprocesses the .livemd into a Markdown document with tables, images, etc. and passes that to pandoc.
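A minimal sketch of what such a preprocessing tool could look like, written in Python purely for illustration (an Elixir version would have the same shape). It rests on assumptions not stated in this issue: that front matter is a leading block delimited by `---` lines, and that Livebook's annotations are HTML comments beginning with `<!-- livebook:`. The function names are hypothetical, and the regex is naive in that it would also strip a matching comment that appears inside a code block.

```python
import re
import subprocess

# Assumption: Livebook annotations look like <!-- livebook:{...} --> on their own line.
LIVEBOOK_COMMENT = re.compile(
    r"^[ \t]*<!--\s*livebook:.*?-->[ \t]*\n?", re.MULTILINE | re.DOTALL
)


def split_front_matter(text):
    """Return (front_matter, body); front_matter is '' when none is present."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "".join(lines[: i + 1]), "".join(lines[i + 1 :])
    return "", text  # unterminated block: treat everything as body


def livemd_to_markdown(text):
    """Keep the front matter and drop Livebook-specific HTML comments from the body."""
    front_matter, body = split_front_matter(text)
    return front_matter + LIVEBOOK_COMMENT.sub("", body)


def run_pandoc(markdown, output_path, to="latex"):
    """Pipe the cleaned Markdown to pandoc on stdin (requires pandoc on PATH)."""
    subprocess.run(
        ["pandoc", "--from", "markdown", "--to", to, "--output", output_path],
        input=markdown.encode("utf-8"),
        check=True,
    )
```

Calling `run_pandoc(livemd_to_markdown(source), "paper.tex")` would then produce LaTeX. Persisted tables and render metadata are not handled here, since their stored form is not yet defined.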

We don’t plan to tackle this at the moment but if someone plans to use Livebook to write papers, they are welcome to explore these ideas and tools!

josevalim avatar May 02 '22 11:05 josevalim

> Then someone can write a tool that preprocesses the .livemd into a Markdown document with tables, images, etc. and passes that to pandoc.

In my opinion, the best course of action is actually to implement a pandoc reader upstream. Here's the markdown one for comparison: https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Readers/Markdown.hs

This has the following benefits:

  • It's natively supported by pandoc (so no installing external tools or pandoc filters is required)
  • Information will not be lost in the livemd to md transpilation (no examples come to mind, however)
  • It will probably be more performant than any Elixir equivalent.
  • It will probably be better maintained and of better quality than any community tool.
  • You get to write Haskell code :)

But has the following disadvantages:

  • Less freedom to change the format
  • We may need to work with the pandoc community to make changes
  • A PR for support may not even be accepted, with the reason being that Livebook is not popular enough.
  • You have to write Haskell code :)

If we do implement a reader, we should also implement a writer.

Jupyter Notebook has gone this route, with a reader and a writer.

Benjamin-Philip avatar Jul 10 '22 05:07 Benjamin-Philip

That's good to know. I think for now the ability to iterate is more important and once the format is stable enough we can reach out to pandoc.

josevalim avatar Jul 10 '22 07:07 josevalim

> I think for now the ability to iterate is more important and once the format is stable enough we can reach out to pandoc.

I agree. I think the first course of action is to start standardising and writing specifications for the format.

Benjamin-Philip avatar Jul 10 '22 08:07 Benjamin-Philip

Is there anything defined yet, or is someone working on these specifications?

viniciusalonso avatar Aug 26 '23 20:08 viniciusalonso

There are no plans for now. But the Livebook format is a subset of Markdown, so there is not much to specify. We do add HTML comments, but I don't think they would matter for the Pandoc integration. We also use triple-backtick fences with info strings such as `output` and `mermaid`, but that's regular Markdown behaviour; GitHub, for example, supports its own fence types (including mermaid).
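To check which fence kinds a given notebook actually uses before deciding how a converter should treat them, the info strings can be listed with a few lines of code. This is only a sketch in Python: `fence_info_strings` is a hypothetical helper, and it assumes fences are exactly three backticks starting at column zero and that only a bare fence closes a block.

```python
import re

FENCE = "`" * 3  # built programmatically so the example can itself sit inside a fence
FENCE_LINE = re.compile(r"^" + FENCE + r"\s*([\w.+-]*)\s*$")


def fence_info_strings(text):
    """Return the info string of every fenced block, in document order."""
    infos = []
    inside = False
    for line in text.splitlines():
        match = FENCE_LINE.match(line)
        if match is None:
            continue
        info = match.group(1)
        if inside:
            # Only a bare fence closes the current block.
            if info == "":
                inside = False
        else:
            infos.append(info)
            inside = True
    return infos
```

A converter could then pass ordinary language fences through unchanged and apply special handling only to the kinds it cares about, such as `output`.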

Anyway, I disagree that the first course of action is "to start standardizing and writing specifications", because that assumes the best course of action is adding a reader/writer to Pandoc. We already have all of the tooling written in Elixir. I would rather write the .livemd to .md layer in Elixir first and play with ideas than assume the starting point is rewriting the existing, still-in-flux stack in another language.

josevalim avatar Aug 27 '23 07:08 josevalim

Hello there!

For educational purposes, I had to share the work I've done in Livebook as a static HTML file that works offline.

Because I had little success getting Live Markdown to export outputs with embedded VegaLite charts (keeping them interactive) and data tables, I ended up working the other way around, with an escript that drives a whole serverless Livebook OTP app instance to evaluate the cells. Alongside it, a silly dep added to the livemd overrides some Kinos (those meant to be interactive, JS-live, or server-driven) so that they generate static HTML outputs.

While this does the job I needed, I keep in mind that the roadmap presented in this issue is the way to go (and it was, to some degree, my initial intention). I will try to investigate that route further with the knowledge I gained while reaching my own little goal. (That journey was awesome, but I'm pretty sure that nothing I've done for my lib has value for achieving that roadmap.)

I may first try to write a paper with some flowcharts on how Livebook, notebooks, Kinos, evaluation, and the client JS work together, to my (pretty limited) understanding.

That said, the silly lib/tool is here, for reference: ripmd

Cheers!

clm-a avatar Sep 25 '23 09:09 clm-a

Oops, I should have taken a look at this earlier (and understood better what is going on upstream): https://github.com/livebook-dev/kino/pull/321 :)

clm-a avatar Sep 26 '23 00:09 clm-a