
Support the `#export` tag of `nbdev`

Open mwouts opened this issue 3 years ago • 3 comments

This came out of a chat with @hamelsmu on how to improve the readability of pull requests on notebooks in the context of nbdev, a library that builds Python libraries out of Jupyter notebooks.

We would like to test the following idea: could we replace the nbdev export step with paired notebooks (.ipynb and .py in the percent format)?

Of course this would come with limitations (currently nbdev can export code from multiple notebooks to multiple scripts, while paired notebooks are limited to one script per notebook).

The expected improvements are:

  • Version control is clearer (the inputs for the notebooks can be reviewed as plain text in the .py file)
  • The export of the notebook to the .py file occurs every time the notebook is saved
  • The .py file becomes editable (but sync with the .ipynb file needs to be guaranteed, e.g. with a pre-commit hook, or at least with a CI check)

For this we basically need to support the #export keyword in nbdev notebooks, and comment out all code cells that don't have it (a tentative implementation is at #948).
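To make the idea concrete, here is a minimal sketch of the "comment out every cell without #export" step. It is not the implementation at #948: the function names are made up for illustration, and it assumes that the tag is a bare `#export` on the first line of the cell and that the cells are given as plain source strings.

```python
from typing import List

EXPORT_TAG = "#export"  # assumption: only this exact tag marks an exported cell


def cell_to_percent(source: str) -> str:
    """Render one code cell in the percent format.

    Cells whose first line is the export tag are kept as code;
    every other cell is commented out line by line.
    """
    lines = source.splitlines()
    exported = bool(lines) and lines[0].strip() == EXPORT_TAG
    if not exported:
        # Prefix each line so the cell stays readable but is never executed
        lines = ["# " + line if line else "#" for line in lines]
    return "\n".join(["# %%"] + lines)


def notebook_to_script(cells: List[str]) -> str:
    """Join the rendered cells into a single script (hypothetical helper)."""
    return "\n\n".join(cell_to_percent(cell) for cell in cells) + "\n"


cells = ["#export\ndef f():\n    return 1", "f()"]
script = notebook_to_script(cells)
```

With the two cells above, the exported function stays importable from the .py file, while the exploratory call `f()` ends up as `# f()`.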

The impact on a sample nbdev project is shown in this PR.

mwouts, Apr 20 '22 06:04

Hi @hamelsmu, would you like to go on with the project? If so, I have two easy questions for you at #948 (should we only support #export or more complex patterns, and how do you want the option to be called). Let me know when you have time to comment on them.

mwouts, May 05 '22 21:05

Hello @mwouts, sorry about this. I got behind because of some work things that came up, but I hope to get back to this soon. I should be able to look at it in more detail in the next 2 weeks. Sorry for the delay.

hamelsmu, May 06 '22 03:05

Hello @mwouts I took a look at this (thanks again for the reminder). We are using directives in the style #|export because we are leveraging quarto for nbdoc. Do you think that style of comment can be supported instead?

We have other directives beyond #|export, so if additional #| directives could be captured in the metadata somehow, I think that would be useful for doing a roundtrip.
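As a rough illustration of what "captured for a roundtrip" could mean, the sketch below splits the leading `#|` lines off a code cell and puts them back. The helper names are hypothetical and this is neither nbdev nor Jupytext code. It assumes that directives only appear as a contiguous block at the top of the cell.

```python
from typing import List, Tuple


def split_directives(source: str) -> Tuple[List[str], str]:
    """Separate the leading `#|` directive lines from the rest of a code cell."""
    lines = source.splitlines()
    directives = []
    i = 0
    while i < len(lines) and lines[i].lstrip().startswith("#|"):
        # Store the directive text without the `#|` prefix
        directives.append(lines[i].lstrip()[2:].strip())
        i += 1
    return directives, "\n".join(lines[i:])


def join_directives(directives: List[str], body: str) -> str:
    """Rebuild the cell source from the stored directives and the body."""
    header = ["#|" + directive for directive in directives]
    return "\n".join(header + ([body] if body else []))


source = "#|export\n#|hide\nx = 1"
directives, body = split_directives(source)
restored = join_directives(directives, body)
```

The directives list could then be stored in the cell metadata. Note that this sketch normalizes whitespace, so `#| export` would come back as `#|export`.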

One really useful thing we do when going from notebook to script is add `__all__` at the top of the module, which greatly helps when someone does a star import (`from module import *`). Here is an example: https://github.com/fastai/nbprocess/blob/master/nbprocess/maker.py#L4
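For readers who have not seen this pattern, here is a minimal sketch of how an `__all__` line could be derived from the exported code. It is only an approximation under stated assumptions, not the logic used in nbprocess: it collects top-level functions, classes and simple assignments, and skips names that start with an underscore. The function names are hypothetical.

```python
import ast
from typing import List


def public_names(code: str) -> List[str]:
    """Collect top-level public names (functions, classes, simple assignments)."""
    names: List[str] = []
    for node in ast.parse(code).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            candidates = [node.name]
        elif isinstance(node, ast.Assign):
            candidates = [t.id for t in node.targets if isinstance(t, ast.Name)]
        else:
            candidates = []
        for name in candidates:
            # Assumption: a leading underscore marks a private name
            if not name.startswith("_") and name not in names:
                names.append(name)
    return names


def with_all(code: str) -> str:
    """Prepend an `__all__` definition to the exported code."""
    return f"__all__ = {public_names(code)!r}\n\n{code}"


code = "def foo(): pass\n_x = 1\nclass Bar: pass\ny = 2\n"
module_source = with_all(code)
```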

The other thing is that we have cell headings that let you run lines as cells and that also capture metadata, such as the source notebook and the cell number, for reverse syncing. The source notebook is perhaps the most interesting part, since you already have a mechanism for the two-way sync. They look like this:

# %% ../nbs/02_maker.ipynb 31
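Reading such a heading back is straightforward. The sketch below is only an illustration based on the single example above: it assumes the heading is always `# %%`, then a path ending in `.ipynb` without spaces, then the cell number. The names `CELL_HEADER` and `parse_cell_header` are made up.

```python
import re
from typing import Optional, Tuple

# Assumed layout: "# %% <path to notebook>.ipynb <cell number>"
CELL_HEADER = re.compile(r"^# %% (?P<notebook>\S+\.ipynb) (?P<cell>\d+)\s*$")


def parse_cell_header(line: str) -> Optional[Tuple[str, int]]:
    """Return (source notebook, cell number), or None for any other line."""
    match = CELL_HEADER.match(line)
    if match is None:
        return None
    return match.group("notebook"), int(match.group("cell"))


parsed = parse_cell_header("# %% ../nbs/02_maker.ipynb 31")
```

A plain `# %%` marker without the extra fields returns `None`, so ordinary percent-format cells are left alone.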

We also do some post-processing on notebooks, but we are primarily concerned with going from nb -> Markdown, whereas I believe the scope of Jupytext is more about preserving the full context of a notebook, so that you have a human-readable representation that is also a Python script? Some examples of this post-processing:

  • We remove Markdown headings that end with a dash (-). We use these when we want headings for navigation but do not want them to end up in the docs.
  • We add front matter to the Markdown representation of the notebook.

Let me know your thoughts on these.

hamelsmu, May 06 '22 13:05