zarr-python
Consider moving the tutorial to a Jupyter notebook
Would others be interested in having the content in tutorial.rst moved to a Jupyter notebook? We could then use the nbsphinx Sphinx extension (https://nbsphinx.readthedocs.io) to run and render the notebook in the docs.
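As a rough illustration of what the nbsphinx setup could look like, here is a minimal `conf.py` excerpt. It is a sketch rather than the project's actual configuration: the file location and the choice to re-execute on every build are assumptions, while the option names are nbsphinx's own.

```python
# docs/conf.py (excerpt): a sketch, not the project's actual configuration

extensions = [
    "nbsphinx",  # renders .ipynb files as documentation pages
]

# Re-run every notebook on each docs build, even if it has stored outputs.
nbsphinx_execute = "always"

# Fail the build when a cell raises, so a broken tutorial is not published.
nbsphinx_allow_errors = False

# Keep Sphinx from picking up Jupyter's autosave copies.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

With this in place, the notebook would be listed in a toctree like any other page, and Jupyter (plus nbsphinx) becomes a docs build dependency, as noted in the original post.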
The main advantage I see to the .rst -> .ipynb move would be to easily run the tutorial interactively on Binder (we could include a binder link at the top of the tutorial). The main drawback that comes to mind is editing the tutorial would now be different than editing all the other *.rst files we currently have in the docs (and adds Jupyter as a dependency to build the docs).
Happy to open a PR for this, just wanted to see if there was any interest first.
For reference, here's an example of a rendered Jupyter notebook in a sphinx docs page with a binder link https://examples.dask.org/dataframe.html
FWIW I think this is a cool idea. Would it be possible to run the notebook as part of CI so we catch any errors or inconsistencies between docs and code? I'm keen that all docs with code examples are run as doctests or equivalent wherever possible.
The notebook would be run as part of the documentation build process, so the tutorial should automatically remain up to date with the latest code changes. That said, adding a docs build to the CI (xref #369) would be a nice complementary addition.
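To make the "run the notebook and fail on errors" idea concrete, here is a minimal standard-library sketch of such a check. It is not how nbsphinx executes notebooks (that goes through a Jupyter kernel); it only shows the principle that an .ipynb file is JSON whose code cells can be run in order, with any exception failing the job. The function name and the in-memory demo notebook are hypothetical, and IPython magics would not work with this approach.

```python
import json


def run_code_cells(notebook: dict) -> dict:
    """Execute each code cell in order in one shared namespace.

    Any exception propagates to the caller, which is what would make
    a CI job fail.
    """
    namespace: dict = {}
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # nbformat 4 stores the source as a string or a list of lines.
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        exec(compile(source, f"<cell {index}>", "exec"), namespace)
    return namespace


# A tiny in-memory stand-in for the tutorial notebook (hypothetical content).
demo_notebook = json.loads("""
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
    {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": null,
     "source": ["chunks = (1000, 1000)\\n", "n_items = chunks[0] * chunks[1]"]}
  ]
}
""")

result = run_code_cells(demo_notebook)
print(result["n_items"])
```

In a real CI job, the notebook would be loaded from disk with `json.load`, or the check would be delegated to the docs build itself.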
Adding a docs build to the CI would be very nice; there have been times when the RTFD build was broken and we didn't notice for a while.
Is this something you all are still interested in implementing? I'd be happy to do it
Yeah, that would be much appreciated : ) Thanks @andrewfulton9, feel free to ping me if you have any questions
IDK if it fits our use case, but it might be worth looking at sphinx-gallery.
I have a quick question about the examples in the current tutorial that are skipped over for doctests. It looks like most of them are because they are looking at cloud storage like aws S3 or Azure block storage. Should I make those runnable in the notebook, or skip over them as they are for doctests?
Good point, thanks for bringing that up @andrewfulton9. We should make sure to include those examples in the notebook, but skip their actual execution. If there's a way we could specify certain code cells shouldn't be executed, that would be ideal. Otherwise, including them as code blocks in a markdown cell would also work. Other suggestions are welcome as always
Sounds good. It might be best to put those cells into Markdown if we are going to make the notebooks executable in Binder to not cause confusion. I'll keep exploring options though.
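One way to get the "show but don't run" behaviour discussed above is to convert selected code cells into Markdown cells containing a code block. The sketch below does that on the notebook's JSON structure using only the standard library. The tag name `no-execute` and the function name are hypothetical choices for this illustration, not a convention of Jupyter or of any Sphinx extension.

```python
import copy

SKIP_TAG = "no-execute"  # hypothetical tag name, chosen for this sketch


def freeze_tagged_cells(notebook: dict, tag: str = SKIP_TAG) -> dict:
    """Return a copy in which code cells carrying `tag` become Markdown cells."""
    fence = "`" * 3  # Markdown code fence
    result = copy.deepcopy(notebook)
    for cell in result.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") != "code" or tag not in tags:
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        cell["cell_type"] = "markdown"
        cell["source"] = f"{fence}python\n" + source.rstrip("\n") + f"\n{fence}"
        # Markdown cells carry neither outputs nor an execution count.
        cell.pop("outputs", None)
        cell.pop("execution_count", None)
    return result


# Hypothetical two-cell notebook: one cloud example, one ordinary cell.
demo = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["no-execute"]},
         "source": ["store = open_cloud_store()  # placeholder, needs credentials"],
         "outputs": [], "execution_count": None},
        {"cell_type": "code", "metadata": {}, "source": ["x = 1"],
         "outputs": [], "execution_count": None},
    ]
}

frozen = freeze_tagged_cells(demo)
print([cell["cell_type"] for cell in frozen["cells"]])
```

The cloud storage examples stay visible to readers, while neither the docs build nor a Binder user running all cells would attempt to execute them.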
If you need/want to go halfway, you can use the ipython directive, which will execute the code and embed the results during the build while still keeping RST.
It cannot generate an .ipynb, though, since RST has far more features than Markdown, but it would at least ensure that the docs are up to date.
jupyter-sphinx is similar, but also supports widgets and should likely be merged with the ipython directive at some point.
Hi @joshmoore, I am from Outreachy. Is this documentation task still available? Can I take it?
@olusanwo, sure! I'm sure everyone would look forward to a suggestion for this.
Hi @joshmoore Sir, I moved Tutorial.rst to an .ipynb file. Please check it out:
https://colab.research.google.com/drive/1qqVY0KxVvEFyifPgWpbyG2P5RdQj9_TZ?usp=sharing
Here is a list of what I did:
- Used Google Colab rather than JupyterLab to build this notebook, as it eliminates the need to add Jupyter as a dependency.
- Capitalized the first letter of each word in the headings, except for articles and prepositions.
- Added the output for the Blosc defaults so the user can see them when switching the default compressor.
- Some tutorials are not interactive because of external dependency problems (for example, some tutorials under Distributed/Cloud Storage Tutorials are not interactive as they require the user to enter their Azure Account Name and Account Key).
I also noticed that under Changing chunk shapes (rechunking) in Tutorial.rst, in the 2nd code block, there's an apostrophe error.
a = zarr.zeros((10000,10000), chunks=(10000, 1), dtype='uint16, store='a.zarr')
This should be:
a = zarr.zeros((10000,10000), chunks=(10000, 1), dtype='uint16', store='a.zarr')
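A quick way to see the difference between the two lines, without having zarr installed, is to ask Python to compile each one. This is only a syntax check on the snippets quoted above; nothing is executed, and the helper name `parses` is made up for this sketch.

```python
broken = "a = zarr.zeros((10000,10000), chunks=(10000, 1), dtype='uint16, store='a.zarr')"
fixed = "a = zarr.zeros((10000,10000), chunks=(10000, 1), dtype='uint16', store='a.zarr')"


def parses(snippet: str) -> bool:
    """Return True if the snippet is syntactically valid Python (it is never run)."""
    try:
        compile(snippet, "<tutorial snippet>", "exec")
    except SyntaxError:
        return False
    return True


print(parses(broken), parses(fixed))
```

Running the tutorial's code blocks as doctests or as executed notebook cells would catch this kind of slip automatically.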
Some problems I faced:
- zarr.consolidate_metadata(store) fails with the error 'Access Denied.'
- root = zarr.open_consolidated(store) fails with 'KeyError: .zmetadata'
- I couldn't find proper support for ipytree for Google Colab.
Please let me know if there are any suggested changes to be made. Thank you!
Repeating here from a recent chat with @sudoyolo: looking forward to seeing the notebook as a PR. It will need some additional work to integrate it into the readthedocs output. :+1:
In order to integrate the notebook into the docs, I highly recommend myst-nb. We use it on all our documentation sites. It supports all of the fancy sphinx syntax in markdown via myst.
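For comparison, a myst-nb setup might look roughly like the following `conf.py` excerpt. This is a sketch under assumptions: the file location, execution mode, and timeout value are illustrative, and the option names are those used by recent myst-nb releases (older releases used different names).

```python
# docs/conf.py (excerpt): a sketch, not the project's actual configuration

extensions = [
    "myst_nb",  # parses .ipynb and MyST Markdown notebooks
]

# Execute every notebook on each build rather than trusting stored outputs.
nb_execution_mode = "force"

# Per-cell time limit in seconds (arbitrary value chosen for illustration).
nb_execution_timeout = 120
```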
Hi @joshmoore, I think this issue has been resolved in #996. You may as well close the issue.
That PR unfortunately appears to be closed (as opposed to merged). Contributions here would still be welcome 🙂
Hi @joshmoore and @MSanKeys963, I am an Outreachy applicant and I would like to work on this issue.
Hi Stephanie, thanks for offering to help! 🙏
Think Emmanuel already picked this up with PR ( https://github.com/zarr-developers/zarr-python/pull/1163 ), but please feel free to grab another issue.
Thanks again! 🙂