Notebooks as markdown

Open baggiponte opened this issue 1 year ago • 9 comments

Description

One of marimo's strengths is that the notebook is stored as a pure Python file rather than as cumbersome JSON. However, the Python file can be complex to edit too, because every cell is stored as a Python function. If notebooks could be stored as markdown files, it would be of great help!

Suggested solution

You could adhere to the MyST markdown spec: https://mystmd.org/guide

This would imply that the markdown is parsed into a DAG, where each code block (delimited by ```) becomes a cell and everything else is just wrapped into mo.md.
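As a rough illustration of that parsing step, the sketch below splits a markdown string into code cells and prose cells, then renders each prose cell as a mo.md(...) call. The names markdown_to_cells and cell_source are hypothetical, fence handling is deliberately naive (no nested or indented fences), and this is not a description of how marimo itself does it.

```python
FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block


def markdown_to_cells(text):
    """Split markdown into ("code", src) and ("markdown", src) cells."""
    cells = []
    prose = []
    code = None  # None while outside a fence, a list of lines while inside one
    for line in text.splitlines():
        if line.strip().startswith(FENCE):
            if code is None:
                # Opening fence: flush any prose collected so far.
                if any(part.strip() for part in prose):
                    cells.append(("markdown", "\n".join(prose).strip()))
                prose = []
                code = []
            else:
                # Closing fence: the collected lines form one code cell.
                cells.append(("code", "\n".join(code)))
                code = None
        elif code is not None:
            code.append(line)
        else:
            prose.append(line)
    if code is not None:
        # Unclosed fence: keep what was collected rather than dropping it.
        cells.append(("code", "\n".join(code)))
    if any(part.strip() for part in prose):
        cells.append(("markdown", "\n".join(prose).strip()))
    return cells


def cell_source(kind, src):
    """Render a cell as Python source; prose becomes a mo.md(...) call."""
    return src if kind == "code" else f"mo.md({src!r})"
```

Everything outside a fence ends up in a markdown cell, so a document with one code block between two paragraphs yields three cells in document order.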

The cool feature about myst (and previously RMarkdown files) is that you can write something like:

The result of the operation is {eval}`a`

And at runtime a will be swapped with the __repr__ of the variable. This could be achieved with marimo easily.
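A minimal sketch of that substitution, assuming the MyST inline role form in which the variable name is wrapped in backticks after {eval}. The helper render_inline and the plain-dict namespace are hypothetical. Only bare variable names are handled, and unknown names are left untouched.

```python
import re

# Matches the inline role followed by a backtick-quoted identifier.
EVAL_ROLE = re.compile(r"\{eval\}`([A-Za-z_][A-Za-z0-9_]*)`")


def render_inline(text, namespace):
    """Replace each eval role with the repr of the named variable."""

    def swap(match):
        name = match.group(1)
        if name not in namespace:
            return match.group(0)  # leave unknown names as written
        return repr(namespace[name])

    return EVAL_ROLE.sub(swap, text)
```

In a reactive notebook, the prose cell would also need to be registered as depending on the variable, so that it re-renders when the value changes. This sketch does not cover that part.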

Alternative

Maybe you could adhere to Quarto's spec. I am not completely aware of the compatibility/interop/advantage of one over the other. See something here.

Additional context

No response

baggiponte avatar Apr 06 '24 16:04 baggiponte

Quarto allows for filters that could (I think) easily be applied to marimo: https://github.com/quarto-dev/quarto-cli/discussions/8049

I would really like this and have mentioned it before +1


That thread is good context, but the quarto-pyodide project (https://github.com/coatless-quarto/pyodide) is probably the most relevant to marimo.

dmadisetti avatar Apr 06 '24 18:04 dmadisetti

Wow, 11 thumbs ups in 2 days! Message received :)

We will look into this. Neither of us are very familiar with Myst or Quarto, but we'll look into both, and we'll share what we're thinking here. We'd definitely appreciate input and help.

akshayka avatar Apr 08 '24 17:04 akshayka

I think WASM-only might be as easy as copying the pyodide or holoviz (https://github.com/awesome-panel/holoviz-quarto) extensions.

For full reactive python output, it's a little trickier since the whole document has to be processed first, but you could have a hack with 2 filters:

  1. The first filter gets the block, sends it to a marimo background process, and puts in an HTML stub with a cell id.
  2. The second filter finds the stub, gets the cell output based on the id, and replaces the stub.
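The two passes above can be sketched in plain Python. This is not a real Quarto or pandoc filter: the stub markup, the function names, and the use of exec with captured stdout as a stand-in for the marimo background process are all assumptions made for illustration.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block
BLOCK = re.compile(FENCE + r"\{marimo\}\n(.*?)\n" + FENCE, re.DOTALL)
STUB = re.compile(r'<div data-cell-id="(cell-\d+)"></div>')


def stub_pass(doc):
    """First pass: swap each code block for an HTML stub and collect the code."""
    cells = {}

    def to_stub(match):
        cell_id = f"cell-{len(cells)}"
        cells[cell_id] = match.group(1)
        return f'<div data-cell-id="{cell_id}"></div>'

    return BLOCK.sub(to_stub, doc), cells


def run_cells(cells):
    """Stand-in for the background process: run cells in order, capture stdout."""
    namespace = {}
    outputs = {}
    for cell_id, code in cells.items():
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        outputs[cell_id] = buffer.getvalue().strip()
    return outputs


def fill_pass(doc, outputs):
    """Second pass: replace each stub with the output recorded for its id."""
    return STUB.sub(lambda m: outputs.get(m.group(1), m.group(0)), doc)
```

Splitting the work this way means every cell is known before any output is needed, which is why the whole document has to be processed first.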

Proper kernel integration would probably be better, or something deeper than just filters. Managing a background process is definitely a workaround.

dmadisetti avatar Apr 08 '24 17:04 dmadisetti

> We will look into this. Neither of us are very familiar with Myst or Quarto, but we'll look into both, and we'll share what we're thinking here. We'd definitely appreciate input and help.

That's just a reference implementation - ofc you can choose whatever you prefer. Quarto is written in TypeScript anyway. I think the parsing from markdown to an "internal representation" should not be super hard (though I don't know where it should happen: at the Python or TS level?). The challenge might be in the code design: how is the DAG building process implemented?
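One way to picture the DAG question is to link each cell to the cells that define the names it reads. The sketch below uses Python's ast module for this. The function names are hypothetical, and it is a simplification rather than marimo's actual implementation: it ignores imports, function and class definitions, scoping, and names defined in more than one cell.

```python
import ast


def defs_and_refs(source):
    """Return (names assigned, names read but not assigned) anywhere in a cell."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            else:
                refs.add(node.id)
    return defs, refs - defs


def build_dag(cells):
    """Map each cell index to the set of cell indices it depends on."""
    analysed = [defs_and_refs(src) for src in cells]
    definer = {}
    for index, (defs, _) in enumerate(analysed):
        for name in defs:
            definer[name] = index  # last definition wins in this sketch
    return {
        index: {
            definer[name]
            for name in refs
            if name in definer and definer[name] != index
        }
        for index, (_, refs) in enumerate(analysed)
    }
```

Because edges come from names rather than position, the result does not depend on the order in which the cells appear in the file.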

baggiponte avatar Apr 09 '24 08:04 baggiponte

Just to confirm: you all prefer markdown over a flat Python representation?

For example, I know VS Code lets you use a special comment (# %%) to chunk a Python script into cells: https://code.visualstudio.com/docs/python/jupyter-support-py#_jupyter-code-cells.
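For comparison, chunking a flat script on that marker takes only a few lines. This is a minimal sketch with a hypothetical split_cells helper. It treats any line starting with the marker as a cell boundary and discards whatever title follows the marker.

```python
import re

# A cell boundary is a line that starts with "# %%" (or "#%%"), optionally titled.
CELL_MARKER = re.compile(r"^#\s*%%.*$", re.MULTILINE)


def split_cells(script):
    """Split a script into cell sources, dropping the marker lines themselves."""
    chunks = CELL_MARKER.split(script)
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

With this format, prose can only live in comments or strings, which is the main trade-off against a markdown file.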

Would you prefer markdown over this?

akshayka avatar Apr 10 '24 17:04 akshayka

I can only speak for myself, but I store my notes as markdown, and have typically exported my polished notebooks to markdown as a readable reference.

So yes. Markdown is nice because it's content-forward rather than code-forward.

dmadisetti avatar Apr 10 '24 17:04 dmadisetti

+1 on markdown. If it's a compliant spec, you can use pandoc to turn your notebook into pdf, latex, word documents...

baggiponte avatar Apr 10 '24 21:04 baggiponte

Here's a proof of concept with Quarto:

https://github.com/dmadisetti/quarto-marimo

rendered site here: https://dmadisetti.github.io/quarto-marimo/intro.html

dmadisetti avatar Apr 13 '24 17:04 dmadisetti

It's non-WASM (so the UI is dead), and I had to fool React to get the elements to render. ~Doesn't support raw outputs (e.g. lists and what not)~ (it does)

It would be cool to have a hybrid approach where {marimo} runs server side and {marimo-wasm} can access dumped outputs and make the rest of the execution tree reactive.

Running a server in the background is definitely a hack, so I posted to the Quarto discussions for ideas: https://github.com/quarto-dev/quarto-cli/discussions/9362

dmadisetti avatar Apr 13 '24 17:04 dmadisetti

this is now complete thanks to @dmadisetti via https://github.com/marimo-team/marimo/pull/1332

let us know if there are any further enhancements to be made

mscolnick avatar May 17 '24 15:05 mscolnick