zarr-python

Integrating tutorial.ipynb into readthedocs

Open GbotemiB opened this issue 3 years ago • 15 comments

I was able to get started with integrating the tutorial.ipynb file into the readthedocs using nbsphinx, though I still have issues with the tutorial.ipynb file: it needs some re-editing.

Also, I am not able to resolve importing numpydoc when building the files for the tutorial page.

GbotemiB avatar Oct 10 '22 20:10 GbotemiB

Thanks, @GbotemiB. I've triggered the build here, but note that in the rendered output at https://zarr--1163.org.readthedocs.build/en/1163/tutorial_nb.html some of the RST markup is not being rendered properly (since notebooks use Markdown):

Screen Shot 2022-10-11 at 08 58 14

joshmoore avatar Oct 11 '22 06:10 joshmoore

This has been fixed in the most recent commit.

GbotemiB avatar Oct 11 '22 13:10 GbotemiB

@GbotemiB, does the notebook still run to completion on a fresh checkout if all of the data files that you added are not present? If so, can I suggest removing them from this PR?

joshmoore avatar Oct 11 '22 14:10 joshmoore

If the data files are not present, it won't affect the notebook. The notebook is set not to rerun, so the saved output is left unchanged. The reason I didn't set the notebook to rerun is that the rechunking part of the tutorial takes a lot of memory and a lot of time to complete.

GbotemiB avatar Oct 11 '22 15:10 GbotemiB

There are two ways I could go about the output of the notebook. The first is to run the notebook locally and set it not to rerun, so that the output is saved. The second is to clear the output and then set the notebook to run. The notebook will run on checkout, but the rechunking part of the tutorial will take a long while. Let me know your thoughts.

GbotemiB avatar Oct 11 '22 15:10 GbotemiB
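For readers following along, the first option (keep the saved outputs and do not re-execute on build) could look roughly like the fragment below. This is a minimal sketch of a Sphinx `conf.py`, not the configuration used in this PR: the extension list and the exclude patterns are assumptions, while `nbsphinx_execute` is the nbsphinx setting that controls execution (`"always"`, `"never"`, or `"auto"`).

```python
# docs/conf.py (fragment) -- a sketch, not the PR's actual configuration.

# Assumed extension list: nbsphinx renders .ipynb files,
# numpydoc is the docstring extension mentioned in this thread.
extensions = [
    "nbsphinx",
    "numpydoc",
]

# Never execute notebooks during the build; the outputs already
# stored in tutorial.ipynb are rendered as they are.
nbsphinx_execute = "never"

# Keep Jupyter checkpoint copies out of the build.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

Setting `nbsphinx_execute = "auto"` instead would execute only notebooks that have no stored outputs, which corresponds to the second option described above.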

> The reason I didn't set the notebook to rerun is that the rechunking part of the tutorial takes a lot of memory and a lot of time to complete.

Understood. Thanks for the caution!

> The notebook will run on checkout

On readthedocs, then, right? Knowing the limited number of processes we have there, that wouldn't be great. Two other options that come to mind are: caching the outputs in a GHA or reducing the size of the rechunking section.

Thoughts (from anyone) welcome.

joshmoore avatar Oct 11 '22 17:10 joshmoore
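One way the "caching the outputs" idea could be approached is to key the cache on the notebook's code cells, so that the expensive execution is repeated only when the code actually changes. The sketch below is an illustration only: `notebook_source_digest` and `needs_rerun` are hypothetical helper names, and how the digest would be wired into a GHA cache key is left open.

```python
import hashlib
import json


def notebook_source_digest(notebook: dict) -> str:
    """Hash only the code-cell sources, ignoring outputs and execution counts."""
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # The notebook JSON stores source as a string or as a list of lines.
        if isinstance(src, list):
            src = "".join(src)
        sources.append(src)
    payload = json.dumps(sources).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_rerun(notebook: dict, cached_digest: str) -> bool:
    """Return True when the code cells differ from the cached run."""
    return notebook_source_digest(notebook) != cached_digest
```

A CI job could load `tutorial.ipynb` with `json.load`, compare the digest with the one stored next to the cached outputs, and skip execution when they match.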

Thanks for working through this, Emmanuel! 🙏

Some other things we might want to look at (in addition to nbsphinx) mentioned in issue #514 ( https://github.com/zarr-developers/zarr-python/issues/514 ) are:

* [sphinx-gallery](https://sphinx-gallery.github.io/stable/index.html)
* [`ipython` directive (for Sphinx)](https://ipython.readthedocs.io/en/stable/sphinxext.html)
* [jupyter-sphinx](https://jupyter-sphinx.readthedocs.io/en/latest/)
* [myst-nb](https://myst-nb.readthedocs.io/en/latest/)

No strong feelings about any of these options (including nbsphinx). Though it may be helpful to have a way to start with .rst and generate notebooks and other content from that.

Another thing we might consider (especially as we have a few more people trying to get started with Zarr) is some sort of Binder integration, which would allow users to spin up the tutorial in the cloud and play around with it (without having to figure out how to install locally; at least not initially).

jakirkham avatar Oct 12 '22 21:10 jakirkham
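To make the Binder idea concrete: a Binder launch link for a GitHub repository is an ordinary URL on mybinder.org, so a docs page or README can link straight to a notebook. The helper below is a sketch under assumptions: `binder_url` is a hypothetical function name, the notebook path in the usage line is made up, and the repository must also ship an environment file (for example `requirements.txt`) for Binder to build it.

```python
from urllib.parse import quote


def binder_url(org: str, repo: str, ref: str, notebook_path: str) -> str:
    """Build a mybinder.org launch link that opens one notebook in JupyterLab."""
    return (
        "https://mybinder.org/v2/gh/"
        f"{quote(org, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
        # The notebook path is percent-encoded, including its slashes.
        f"?labpath={quote(notebook_path, safe='')}"
    )


# Hypothetical usage: the notebook path is an assumption, not a real file.
print(binder_url("zarr-developers", "tutorials", "main", "notebooks/example.ipynb"))
```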

Hi @jakirkham, I went through the other options you mentioned. Here are my thoughts:

  • I like the idea of using the ipython directive (for Sphinx). It leaves room to keep using the existing .rst files for the docs tutorial. The same goes for jupyter-sphinx. This is a better alternative to integrating the whole Jupyter notebook into the docs. I think I should hold off on the notebook integration and try to implement the ipython directive (for Sphinx) instead. What do you think, @jakirkham?
  • Regarding myst-nb: I think this might require changing the files, but I'm not sure. I am checking this out.
  • I checked out Binder. It will really be helpful for beginners. There is already a tutorial as a Jupyter notebook. By the way, since we Outreachy interns need issues to work on, we could work on creating more tutorials as Jupyter notebooks that can be launched on Binder. I don't know if this is a good idea, @joshmoore and @MSanKeys963. I can also take up the task.

GbotemiB avatar Oct 12 '22 23:10 GbotemiB
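As a rough illustration of the ipython-directive route: the directive ships with IPython and is enabled from the Sphinx `conf.py`, after which code in the existing .rst tutorial is executed at build time and its output is rendered inline. The fragment below is a sketch, not the configuration from #1229; `RST_EXAMPLE` is a hypothetical name that only holds a sample of what a tutorial.rst passage might contain.

```python
# docs/conf.py (fragment) -- a sketch, assuming IPython is installed
# in the docs build environment.
extensions = [
    "IPython.sphinxext.ipython_directive",
    "IPython.sphinxext.ipython_console_highlighting",
]

# Illustrative only: the kind of block that could appear in tutorial.rst.
# Sphinx runs the indented code and inserts the resulting session.
RST_EXAMPLE = """\
.. ipython:: python

   import zarr
   z = zarr.zeros((10000, 10000), chunks=(1000, 1000), dtype="i4")
   z
"""
```

Because the source stays in .rst, the existing file structure of the docs is kept, at the cost of running the tutorial code on every docs build.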

Codecov Report

Merging #1163 (5a166b8) into main (eb6d143) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1163   +/-   ##
=======================================
  Coverage   99.95%   99.95%           
=======================================
  Files          36       36           
  Lines       14141    14142    +1     
=======================================
+ Hits        14134    14135    +1     
  Misses          7        7           
| Impacted Files | Coverage Δ |
|---|---|
| zarr/util.py | 100.00% <0.00%> (ø) |

codecov[bot] avatar Oct 13 '22 16:10 codecov[bot]

Hi @joshmoore @jakirkham I would like to work on creating more tutorials that can be implemented on Binder. I would really appreciate pointers on how to get started. Thank you.

zeelyha avatar Oct 13 '22 18:10 zeelyha

@zeelyha: that would be great! Have you seen the tutorials repo? https://github.com/zarr-developers/tutorials

Starting from the notebook here or any of those over there, I'd make a copy and see if you can start modifying the example to do something slightly different. That could be something from https://zarr.readthedocs.io/en/stable/tutorial.html or something you find online (e.g. on Twitter).

You can find datasets online either at https://pangeo-forge.org/catalog or https://idr.github.io/ome-ngff-samples/

joshmoore avatar Oct 14 '22 11:10 joshmoore

@joshmoore Yes, I have gone through the repo and I just finished watching the interactive tutorial video. I'd be waiting for a copy of the notebook so I can start modifying the examples. Thank you.

zeelyha avatar Oct 14 '22 14:10 zeelyha

> Also do we need to talk with Jason about this?

I'm not sure what you mean. Let's perhaps move this conversation back to https://gitter.im/zarr-developers/outreachy-contributors-dec-2022 to not steal @GbotemiB's PR.

joshmoore avatar Oct 14 '22 16:10 joshmoore

@GbotemiB: what's the status here and how does it relate to #1229?

joshmoore avatar Nov 01 '22 15:11 joshmoore

Integrating a Jupyter notebook into the readthedocs to serve as the tutorial requires removing the tutorial.rst file, which is what is being done here. @jakirkham gave a few suggestions regarding other options, and I decided to try the ipython directive (for Sphinx) in #1229.

So far, I have tested it. It's actually preferable to using a Jupyter notebook, and the file structure is still maintained.

We could decide to go for either of the two.

GbotemiB avatar Nov 01 '22 16:11 GbotemiB