zarr-python
Integrating tutorial.ipynb into readthedocs
I was able to get started on integrating the tutorial.ipynb file into the readthedocs build using nbsphinx. I still have issues with the tutorial.ipynb file, though; it needs some re-editing.
Also, I have not been able to resolve the numpydoc import problem when building the files for the tutorial page.
Thanks, @GbotemiB. I've triggered the build here, but note that in the rendered output at https://zarr--1163.org.readthedocs.build/en/1163/tutorial_nb.html some of the RST markup is not being properly rendered (since notebooks use markdown):
This has been fixed in the most recent commit.
@GbotemiB, does the notebook still run to completion on a fresh checkout if all of the data files that you added are not present? If so, can I suggest removing them from this PR?
If the data files are not present, it won't affect the notebook. The notebook is set not to rerun, so that the saved output is not affected. The reason I didn't set the notebook to rerun is that the rechunking part of the tutorial takes a lot of memory and a lot of time to complete.
There are two ways I could go about the output of the notebook. I could run the notebook locally and set it not to rerun; this way the output will be saved. The second way is to clear the output and then set the notebook to run. The notebook will run on checkout, but the rechunking part of the tutorial will take a long while. Let me know your thoughts on this.
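For reference, the two options above map onto nbsphinx's `nbsphinx_execute` setting in the Sphinx `conf.py`. The fragment below is only a minimal sketch of what such a configuration could look like; it is an assumption for illustration, not the configuration committed in this PR.

```python
# docs/conf.py (illustrative fragment, not the file from this PR)

# Enable nbsphinx so that .ipynb files are rendered as documentation pages.
extensions = [
    "nbsphinx",
]

# Keep Sphinx from picking up build output and Jupyter checkpoint copies.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]

# Option 1: never execute notebooks during the docs build; the outputs
# saved in the .ipynb file (from a local run) are rendered as-is.
nbsphinx_execute = "never"

# Option 2 (alternative): clear the outputs and let the build run the
# notebook, at the cost of the slow rechunking section running every time.
# nbsphinx_execute = "always"
```

With `"never"`, the build only renders what is stored in the notebook, so the rechunking section costs nothing on readthedocs. nbsphinx's default, `"auto"`, executes only notebooks that have no stored outputs.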
The reason I didn't set the notebook to rerun is that the rechunking part of the tutorial takes a lot of memory and a lot of time to complete.
Understood. Thanks for the caution!
The notebook will run on checkout
On readthedocs, then, right? Knowing the limited number of processes we have there, that wouldn't be great. Two other options that come to mind are: caching the outputs in a GHA or reducing the size of the rechunking section.
Thoughts (from anyone) welcome.
Thanks for working through this Emmanuel! 🙏
Some other things we might want to look at (in addition to nbsphinx) mentioned in issue ( https://github.com/zarr-developers/zarr-python/issues/514 ) are:
* [sphinx-gallery](https://sphinx-gallery.github.io/stable/index.html)
* [`ipython` directive (for Sphinx)](https://ipython.readthedocs.io/en/stable/sphinxext.html)
* [jupyter-sphinx](https://jupyter-sphinx.readthedocs.io/en/latest/)
* [myst-nb](https://myst-nb.readthedocs.io/en/latest/)
No strong feelings about any of these options (including nbsphinx). Though it may be helpful to have a way to start with .rst and generate notebooks and other content from that.
Another thing we might consider (especially as we have a few more people trying to get started with Zarr) is some sort of Binder integration, which would allow users to spin up the tutorial in the cloud and play around with it (without having to figure out how to install locally; at least not initially).
Hi @jakirkham, I went through the other options you mentioned. Here are my thoughts:
- I like the idea of using the `ipython` directive (for Sphinx). It leaves room to keep using the existing rst files for the docs tutorial. The same goes for jupyter-sphinx. This is a better alternative to integrating the whole Jupyter notebook into the docs. I think I should hold off on the notebook integration and try to implement the `ipython` directive (for Sphinx) instead. What do you think, @jakirkham?
- Regarding myst-nb, I think this might require changing the files, but I'm not sure yet. I am checking this out.
- I checked out Binder. It really will be helpful for beginners. There is already a tutorial as a Jupyter notebook. By the way, since we Outreachy interns need issues to work on, we could work on creating more notebook tutorials that can be run on Binder. I don't know if this is a good idea, @joshmoore and @MSanKeys963. I can also take up the task.
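To make the `ipython` directive option concrete, here is a rough sketch of how it could be wired up. The extension names are the ones shipped with IPython. The `RST_EXAMPLE` string is a hypothetical tutorial.rst snippet, held in a Python variable only for illustration; none of this is taken from an actual commit.

```python
# docs/conf.py (illustrative fragment; an assumption, not taken from the repo)

# Sphinx extensions shipped with IPython that provide the ``ipython`` directive
# and matching syntax highlighting for IPython console sessions.
extensions = [
    "IPython.sphinxext.ipython_directive",
    "IPython.sphinxext.ipython_console_highlighting",
]

# Hypothetical example of how a block in tutorial.rst could look.
# The code inside the directive is executed at build time and its output
# is inserted into the rendered page, so the tutorial stays in rst.
RST_EXAMPLE = """\
.. ipython:: python

    import zarr
    z = zarr.zeros((10000, 10000), chunks=(1000, 1000), dtype='i4')
    z
"""
```

The appeal of this route is that the tutorial remains a plain .rst file, while the displayed outputs are regenerated on each docs build instead of being pasted in by hand.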
Codecov Report
Merging #1163 (5a166b8) into main (eb6d143) will not change coverage. The diff coverage is n/a.
@@ Coverage Diff @@
## main #1163 +/- ##
=======================================
Coverage 99.95% 99.95%
=======================================
Files 36 36
Lines 14141 14142 +1
=======================================
+ Hits 14134 14135 +1
Misses 7 7
| Impacted Files | Coverage Δ | |
|---|---|---|
| zarr/util.py | 100.00% <0.00%> (ø) |
Hi @joshmoore @jakirkham I would like to work on creating more tutorials that can be implemented on Binder. I would really appreciate pointers on how to get started. Thank you.
@zeelyha: that would be great! Have you seen the tutorials repo? https://github.com/zarr-developers/tutorials
Starting from the notebook here or any of those over there, I'd make a copy and see if you can start modifying the example to do something slightly different. That could be something from https://zarr.readthedocs.io/en/stable/tutorial.html or something you find online (e.g. from Twitter).
You can find datasets online either at https://pangeo-forge.org/catalog or https://idr.github.io/ome-ngff-samples/
@joshmoore Yes, I have gone through the repo and I just finished watching the interactive tutorial video. I'll be waiting for a copy of the notebook so I can start modifying the examples. Thank you.
Also do we need to talk with Jason about this?
I'm not sure what you mean. Let's perhaps move this conversation back to https://gitter.im/zarr-developers/outreachy-contributors-dec-2022 to not steal @GbotemiB's PR.
@GbotemiB: what's the status here and how does it relate to #1229?
Integrating a Jupyter notebook into the readthedocs to serve as the tutorial will require removing the tutorial.rst file, which is what is being done here. @jakirkham gave a few suggestions regarding other options.
I decided to try using the ipython directive (for Sphinx) in #1229.
So far, I have tested it; it is actually preferable to using a Jupyter notebook, and the file structure will still be maintained.
We could decide to go with either of the two.