DOC: Add interactive notebooks to pages in the "Usage Examples" section
Description
[!TIP] Those interested can try it at https://pywavelets--741.org.readthedocs.build/en/741/regression/index.html
This PR makes the Usage Examples section interactive, following #728 and as requested in https://github.com/PyWavelets/pywt/pull/737#discussion_r1589709771. It does so with jupyterlite-sphinx and Markdown-based notebooks that are executed by MyST-NB. A more granular overview of these details follows:
Which issue does this PR solve/reference?
Addresses a part of #706
Key changes made
- All `.rst`-based files under `doc/source/regression` converted to Markdown and reformatted as notebooks
- Jupytext frontmatter added to all of the notebooks so that they can be executed at the time of building the documentation
- Code cells that raise an error marked with cell-based tags
- MyST-NB is configured in `conf.py`
- `NotebookLite` directive used from `jupyterlite-sphinx` to run the notebooks under WASM in the same tab (and something like https://github.com/jupyterlite/jupyterlite-sphinx/pull/165 can be added for this directive as well)
- A custom, minimal Sphinx extension added to execute the Jupytext CLI before the sources and toctrees are read, and to selectively ignore the `.ipynb`-based notebooks that get generated during the process in order to keep the documentation build warning-free. Generating them at build time is for the `NotebookLite` directive to be able to access them and load them, since the `NotebookLite` directive currently does not load `.md` files or notebook files in other formats.
  i. I imagine this will be helpful for https://github.com/scipy/scipy/pull/20303 as well where it is required to load the notebooks in an interactive manner under a specific folder. The notebooks are not executed by Jupytext at the time of conversion, and therefore, based on past experiments, it takes ~10 seconds to convert 30 or so notebooks.
Additional context
This is just a pilot run of how notebook-based examples can be configured for Sphinx-based documentation websites, so there are a few corner cases that I have noticed so far:
- The styling of the `NotebookLite` directive can be improved because it takes way too much of the screen's space, perhaps through a user-provided option in `conf.py` just like how the `TryExamples` directive can be configured.
- References to the API for a package (in this case, public classes and methods for PyWavelets) are not linked to when running the notebook in its inline frame, because the notebook is not running under Sphinx and cannot access Sphinx's generated HTML pages. This can be worked around by writing the notebooks better and exploring other ways to reference pages and sections from the API (say, by adding Markdown-based headings or links, etc.).
A brief to-do list
Besides the points mentioned above, smaller tasks can be looked into:
- [x] Hide reST-based syntax (at least the irrelevant top parts of each page) using MyST
- [x] Improve the styling in other ways
- [x] Provide a button to download notebooks (both `.md` and `.ipynb`)
Quite strange. The documentation is building without any problems locally...
Edit: I can reproduce locally if I delete a few of the previously generated files – this is coming from jupyterlite-sphinx. I think that Read the Docs is using a cached version of the documentation and is not purging the files properly, or it isn't allowing the subprocess to execute at the config-inited builder hook.
Edit, again: they were failing because the jupytext command was failing silently and was not converting any notebooks. Removing the hardcoded path fixed it.
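For context, a command run through `subprocess.run` fails silently whenever its exit code is not checked. The sketch below illustrates the difference with a stand-in child process instead of the real `jupytext` command; the helper name `run_checked` is hypothetical.

```python
import subprocess
import sys


def run_checked(command):
    """Run a command and raise if it exits with a non-zero status."""
    # Without check=True, a failing command only sets `returncode`
    # and the build carries on as if the conversion had succeeded.
    return subprocess.run(command, check=True, capture_output=True, text=True)


# Stand-in for a failing conversion: a child process that exits with status 3.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

try:
    run_checked(failing)
except subprocess.CalledProcessError as error:
    print(f"conversion failed with exit code {error.returncode}")
```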
I added a basic extension to clean up the generated IPython notebooks in 6bc9c601d73b1116058db3a44e2437c85fc9c02f using nbformat, works quite well. I am now exploring a method to add the directives based on https://github.com/PyWavelets/pywt/pull/741#discussion_r1619280967 automatically – it requires some configuration but I feel that it can be achieved at the source-read event before the Markdown files are processed.
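A rough sketch of that kind of clean-up, written against the plain notebook JSON with the standard `json` module so that it stays dependency-free (the commit itself uses nbformat). The function names and the label pattern are assumptions for illustration, not the code from the commit.

```python
import json
import re

# MyST target labels such as "(reg-dwt-idwt)=" occupy a line of their own.
LABEL_PATTERN = re.compile(r"^\([\w-]+\)=\s*$")


def strip_labels(notebook):
    """Remove MyST target labels from the Markdown cells of a notebook dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell["source"]
        # The notebook format stores `source` as a string or a list of lines.
        lines = source.splitlines(keepends=True) if isinstance(source, str) else source
        kept = [line for line in lines if not LABEL_PATTERN.match(line.strip())]
        cell["source"] = "".join(kept).lstrip("\n")
    return notebook


def clean_notebook_file(path):
    """Rewrite the .ipynb file at `path` in place without the labels."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(strip_labels(notebook), handle, indent=1)
```

Code cells are left untouched, so only the prose that Sphinx would normally consume is altered.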
Processing Markdown programmatically doesn't bear good fruit – it would have been easier if we used IPyNB, and we don't plan to. I would suggest that we should merge this for now, @rgommers.
The issue with Markdown was that there isn't a clean way to add both the Jupytext frontmatter and the notebook directives and that requires some obtrusive, prone-to-failure file manipulation by processing the contents as a string (we can keep the notebook directives at the bottom and the frontmatter intact, but it's easier to provide the download buttons at the top of the page rather than at the bottom).
Weird, the local docs build and RTD both produce cleaner downloadable notebooks, but the NotebookLite directive on RTD doesn't have the cleaner notebooks (they still have the Sphinx directives). This does not happen locally. I think RTD has some caching troubles, because I removed the redundant module headings from the notebooks (such as (reg-dwt-idwt)= and so on), but they are still there on RTD.
Local build
Read the Docs
Edit: some notebooks are as expected, and some aren't.
Edit 2: this is most likely the case of a cached build or some updates that don't trickle down to RTD; the logs say that all the notebooks are converted. I can't reproduce locally, and I feel that the issue will go away automatically on a fresh build.
I had a revelation just now and I think 07a09267193c3e0e7ba58ec4d2d1e679a9e65160 should fix the synchronisation issue for good – the RTD build is just buggy for some reason or the other, and the local documentation builds where the issue doesn't arise can be trusted. Ready for review and further proceedings.
Tagging @melissawm and @steppi for a review as discussed during the 14/06/2024 interactive docs meeting – thanks, both!
This seems to work really well. I love the buttons and dropdowns.
How flexible can we make this to share with other projects? Ideally, we could have an extension (or as a part of jupyterlite-sphinx) that adds the preprocess notebooks event, common settings and looks without having to turn them on manually for each project?
I had a discussion with @agriyakhetarpal offline where we discussed how this preprocessing could be added to jupyterlite-sphinx. I think he's working on it.
Don't forget to set `jupyterlite_silence = True` again now that we're done debugging. After that I think this is good to merge.
Looks like that last action was done, and everything is green. Are you happy with this PR as is @agriyakhetarpal?
Yes, I'm happy with this, @rgommers. I tested all notebooks, and everything seems to be working with jupyterlite-sphinx v0.16.2. I think this can be squash-merged, but I can rewrite history and ping you again if you wouldn't prefer that because of the number of files that have changed (and their contents).
Styling-wise (please keep in mind I'm no frontend developer 😅 ) would it look better/make it more visually distinctive if we had labels, such as the following screenshot?
This is just some CSS so would not be too much of a burden and we could apply it to SciPy as well.
As far as the gallery question, my take is that the main goal here is to have live notebooks that can be edited and run from the browser immediately without having to fire up a server locally.
Thanks for the detailed review, @rgommers!
The styling of the NotebookLite directive can be improved because it takes way too much of the screen's space, perhaps through a user-provided option in conf.py just like how the TryExamples directive can be configured.
Yes, this is a pretty significant regression I think. The output is too verbose, and also harder to understand because the only difference between input and output is a bit of green coloring on the left side of the input cells (rather than >>>). Also for accessibility that is not great. It looks fixable to me, since it's only a rendering choice.
I think @steppi would have more to comment about the NotebookLite directive's button's sizing. Now that I've revisited this after some time from when I made that comment: my understanding of the problem is that the clickable element, i.e., the yellow button with the "Open notebook" text resides in jupyterlite_sphinx_iframe_container, which has a minimum height set in conf.py (in this case, this is 600px, a size that seems to fit a viewable/scrollable section of the notebooks reasonably). If we reduce this, we'll need to figure out a way to expand the div/container based on whether it was clicked or loaded, similar to how TryExamples directives create a container upon interaction.
The SciPy changes (https://github.com/scipy/scipy/pull/20303) seem to have the same problem - I haven't been able to follow that work/review, so let me ping @melissawm here for thoughts.
Yes, that work involved @melissawm expanding the JupyterLite directive to add a :new_tab: option and then using it to load the notebooks in a new tab – while I could have implemented the same thing here, the reason why I didn't do so was that the length of the notebooks in https://github.com/scipy/scipy/pull/20303 was large, while notebooks here are considerably shorter and therefore not a lot of scrolling is needed (perhaps with the exception of "Wavelet Packets", which is lengthier than the rest).
The traceback rendering is also a lot worse. That's 14 extra lines of unwanted output that make the example harder to understand.
I agree. I'm not sure how to improve this, but I'll try exploring the MyST-NB configuration to see if they have something to control the stack level for the traceback.
Wouldn't it be better to drop most/all of the links? Something like {func}`dwt` isn't too clear in the downloaded or in-browser notebooks. It'd be okay to have fewer links from long-form docs to API docs I think.
Yes, sure – I'll do that in a subsequent commit and drop them all. Keeping API terms in regular backticks without any links seems fine to me. I had thought of doing this as a last resort, but I do get that the "API reference" section is always one click (and then a few) away via the navbar, so users have easy access to classes and methods.
Jupytext frontmatter added to all of the notebooks so that they can be executed at the time of building the documentation
Not a showstopper by itself, but it's a lot of boilerplate that I imagine either a Sphinx extension or a bit of codegen can remove.
I would suggest keeping the notebooks in this manner. I've posted a feature request for jupyterlite-sphinx to accept Jupytext notebooks in https://github.com/jupyterlite/jupyterlite-sphinx/issues/191 (which I would be glad to do the work for, myself) and my suggestion is that having some codegen probably isn't worth the complexity, considering that the Markdown files contain syntax such as "```{code-cell}```" which is only relevant for notebooks, and generating the frontmatter at runtime means there is some missing context for those who will try to access these files and note that they contain the necessities for Jupytext syntax, but don't contain the frontmatter (for developers, this would mean that code editors' extensions won't recognise them as MyST-flavoured notebooks). I did try this idea with a Sphinx extension, however—on a branch with commits that I did not get to push—I faced issues with connecting to an appropriate Sphinx event that could modify the files before Sphinx read them at least once. It could be handled with a new Makefile target that runs before the docs build, though.
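For reference, the codegen idea under discussion could be as small as the sketch below, which prepends a frontmatter block to a Markdown file that lacks one. The block follows the general shape that Jupytext writes for MyST notebooks, but the version numbers are placeholders and the function name `ensure_frontmatter` is hypothetical.

```python
# Frontmatter in the shape Jupytext writes for MyST notebooks. The version
# numbers below are placeholders, not the ones used in this PR.
FRONTMATTER = """\
---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.16.1
kernelspec:
  display_name: Python 3 (ipykernel)
  language: python
  name: python3
---
"""


def ensure_frontmatter(markdown_text):
    """Prepend the frontmatter unless the text already starts with a YAML block."""
    if markdown_text.startswith("---\n"):
        return markdown_text
    return FRONTMATTER + "\n" + markdown_text
```

Generating the block this way keeps the source files short, at the cost described above: editors and Jupytext itself no longer see the files on disk as notebooks.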
My impression is that we lost the individual interactive buttons from the jupyterlite-sphinx `.. try-examples::` directive, and we end up with something that's harder to maintain and visually not as nice as what `sphinx-gallery` does? I'm probably missing something important here, but I don't quite see it right now.
I can certainly improve this by undertaking the task of writing a separate Sphinx extension for this or exploring Jinja-based templating. We do have the building blocks in the PR I linked above (https://github.com/sphinx-gallery/sphinx-gallery/pull/1312/). I am open to suggestions if we have to modify the approach entirely – rendering this PR a playground or a canary of sorts as a precursor to larger projects where similar changes in this manner will be carried out helps us in the way that we won't need to go back to the sounding board to improve the approach.
Styling-wise (please keep in mind I'm no frontend developer 😅 ) would it look better/make it more visually distinctive if we had labels, such as the following screenshot?
I think so, yes. In/Out labels will be familiar to many users, and even for those who have never used a notebook it's easy to understand. I'd say it's (modulo the vertical space taken) about the same as >>>.
As far as the gallery question, my take is that the main goal here is to have live notebooks that can be edited and run from the browser immediately without having to fire up a server locally.
Isn't that the same as what the "launch lite" button in the right sidebar of sphinx-gallery examples like https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html does?
Isn't that the same as what the "launch lite" button in the right sidebar of sphinx-gallery examples like scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html does?
Yes, but that would only work for gallery examples. I'm not sure you could use sphinx-gallery to have that in other documents. Alternatively you could migrate all of the narrative documentation you want to be interactive to sphinx-gallery, but I'm not sure how that would impact the documentation organization.
Wouldn't it be better to drop most/all of the links? Something like {func}`dwt` isn't too clear in the downloaded or in-browser notebooks. It'd be okay to have fewer links from long-form docs to API docs I think.
Yes, sure – I'll do that in a subsequent commit and drop them all. Keeping API terms in regular backticks without any links seems fine to me. I had thought of doing this as a last resort, but I do get that the "API reference" section is always one click (and then a few) away via the navbar, so users have easy access to classes and methods.
I did this in 1dc3a0f9c03fe5cd640be791a379919bc4db20f2 and replaced all of the links with just text that gets highlighted as code text. Though, now that I've looked at the rendered docs locally, I do think that this is a step back from having the links. The links are non-existent/don't work just in the interactive notebooks that JupyterLite brings forth, so removing them from the main documentation pages so that the interactive notebook gets to be cleaner doesn't seem to be the best way forward. @rgommers, please let me know if this was what you had in mind or whether there should be a way for jupyterlite-sphinx to clean up the links in the rendered notebook or something?
whether there should be a way for jupyterlite-sphinx to clean up the links in the rendered notebook
This would definitely be an improvement, short of having the actual links (this is on the roadmap for MyST-nb/jupytext as far as I understand). But I'm not sure this should be done now, maybe it's a future feature request for jupyterlite-sphinx?
Alternatively you could migrate all of the narrative documentation you want to be interactive to sphinx-gallery, but I'm not sure how that would impact the documentation organization.
True, that doesn't work for everything. Neither does a single notebook though, since that loses the .. try-examples:: directive that actually adds inline interactivity in .rst files. Or is there a way to still insert that?
so removing them from the main documentation pages so that the interactive notebook gets to be cleaner doesn't seem to be the best way forward.
I think I agree. Perhaps we should ignore it for now, and indeed consider whether that syntax could be auto-cleaned by jupyterlite-sphinx. It can be done later I think, it's not that critical.
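To make the auto-cleaning idea concrete, here is a small sketch of how role syntax could be reduced to plain code text before a notebook is shown in JupyterLite. This is not an existing jupyterlite-sphinx feature; the pattern and the names `ROLE_PATTERN` and `strip_roles` are assumptions for illustration.

```python
import re

# Matches MyST roles such as {func}`dwt` or {class}`~pywt.Wavelet`.
ROLE_PATTERN = re.compile(r"\{[\w:.+-]+\}`([^`]+)`")


def _plain_name(match):
    """Turn the content of one role into plain inline code."""
    target = match.group(1).strip()
    # "title <target>" form: keep the human-readable title.
    titled = re.fullmatch(r"(.+?)\s*<[^>]+>", target)
    if titled:
        target = titled.group(1)
    # A leading "~" asks Sphinx to display only the last component.
    if target.startswith("~"):
        target = target[1:].rsplit(".", 1)[-1]
    return f"`{target}`"


def strip_roles(text):
    """Replace every role in `text` with plain inline code."""
    return ROLE_PATTERN.sub(_plain_name, text)


print(strip_roles("Use {func}`dwt` with a {class}`~pywt.Wavelet` object."))
```

Applied only to the generated `.ipynb` files, a step like this would leave the links in the rendered documentation pages intact.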
We do have the building blocks in the PR I linked above (sphinx-gallery/sphinx-gallery#1312). I am open to suggestions if we have to modify the approach entirely – rendering this PR a playground or a canary of sorts as a precursor to larger projects where similar changes in this manner will be carried out helps us in the way that we won't need to go back to the sounding board to improve the approach.
It would be useful to spend some time exploring this. I don't have concrete suggestions because I don't know the various code bases of the packages involved well enough. I'm worried about doing double work here though, so some coherent picture of what is common with sphinx-gallery and what packages/components to use for which task would be useful.
My understanding is that the try_examples directive would be mostly for API examples, and that notebooklite and similar directives would be more for narrative pages so both approaches would be available via jupyterlite-sphinx - maybe @agriyakhetarpal can clarify?
Would it be better to have a button like for the try_examples directive which swaps the rendered notebook in place with an executable one that occupies the same amount of screen real estate? I don't think there's any reason in principle why we couldn't do that; it just hasn't been done yet.
Would it be better to have a button like for the `try_examples` directive which swaps the rendered notebook in place with an executable one that occupies the same amount of screen real estate?
I don't think so, that seems pretty much always worse than a new tab for larger narrative docs.
We just had a chat about this with @agriyakhetarpal, @melissawm, @steppi and @Carreau. Outcomes:
- `sphinx-gallery` isn't really reusable here, worries around it being too inflexible and too slow to build. Instead, some of the CSS may be useful; if so then just copy the relevant code.
- We have three types of interactive docs: docstrings (in good shape), `.rst` files where individual pieces of code get `.. try-examples::` directives (in good shape too), and "take this larger page of narrative docs and render as notebook with JupyterLite in a new tab" (needs work). For that last item:
  - Only Markdown supported (`jupytext` notebooks), not `.rst`
  - We want two buttons at the top of the page (easier to do than in the sidebar), one to open as notebook in a new tab, one to download the `.ipynb`
  - No manual boilerplate in each `.md` file, this should at most be a single `.. notebooklite::` directive, or a single setting in `conf.py` to opt into generating these buttons for all narrative docs pages
  - Next step: try implementing these buttons in `jupyterlite-sphinx`. It's okay to require `sphinx-design` for buttons if needed.
The traceback rendering is also a lot worse. We go from this:
There is `%xmode minimal` (`--InteractiveShell.xmode=Minimal`), though that may be too minimal.

```
In [5]: %xmode Minimal
Exception reporting mode: Minimal

In [6]: def f():
   ...:     1/0
   ...:

In [7]: f()
ZeroDivisionError: division by zero
```
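Outside IPython, roughly the same one-line report can be produced with the standard library, which may help when judging whether that level of detail is enough for the docs. This is only an analogue of the Minimal mode, not how IPython or MyST-NB implement it, and `minimal_report` is a hypothetical helper name.

```python
import traceback


def f():
    1 / 0


def minimal_report(exc):
    """Format an exception as its type and message only, without the stack."""
    return "".join(traceback.format_exception_only(type(exc), exc)).strip()


try:
    f()
except ZeroDivisionError as exc:
    print(minimal_report(exc))  # ZeroDivisionError: division by zero
```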
Rebased on top of main based on @gabalafou's request; apologies for the force-push.
In the last few comments, I bumped to the latest https://github.com/jupyterlite/pyodide-kernel/releases/tag/v0.5.1, which brings https://github.com/pyodide/pyodide/releases/tag/0.27.1.
Now that all tests are passing, this PR should be ready – thanks to @gabalafou's help with the styling 🙈
cc: @melissawm and @Carreau; tagging you here as I can't request a review from GitHub directly.
Thanks @melissawm for this great feedback!
I will take it upon myself to address them:
- I ran into this too, I will open a PR to add a readme to the doc folder and link to it from the contributing doc.
- Oh interesting! I definitely intended to exclude regressions/README from the docs, as that section of the docs is aimed at end users rather than contributors. I wonder whether I could copy the first two paragraphs of the readme to the index file and then add a link in the index file directly to the GitHub readme, along the lines of "if you're curious to learn more about the machinery behind dual doc/notebook pages".
- Hmm... opening the notebook works for me locally, I wonder if this is the result of no documentation on how to actually build the docs locally. Would you be open to trying again once I get that PR up with the instructions?
Here's the PR on how to build the docs (#786)
I created another pull request on top of this one to add some of the info from the regression folder readme into the index page: https://github.com/agriyakhetarpal/pywt/pull/5
This is looking quite good now! Seems about ready once the last changes from @gabalafou have been folded in.
One question: is it expected that on the RTD preview, there is no caching of wheels? If I try multiple interactive examples on different pages, I'm always waiting 15-20 seconds for import pywt to complete, on a machine with a 300 Mbps connection.
One question: is it expected that on the RTD preview, there is no caching of wheels?
I think so. IIRC, only downloads from PyPI are cached by piplite, and not those from jsDelivr, so we'll need to rely on the caching available from the CDN and download them every time.