zarr-python Consider moving the tutorial to a Jupyter notebook

Would others be interested in having the content in tutorial.rst moved to a Jupyter notebook? We could then use the nbsphinx Sphinx extension (https://nbsphinx.readthedocs.io) to run and render the notebook in the docs.

The main advantage I see in the .rst -> .ipynb move is that the tutorial could easily be run interactively on Binder (we could include a Binder link at the top of the tutorial). The main drawback that comes to mind is that editing the tutorial would then differ from editing all the other *.rst files we currently have in the docs, and Jupyter would become a dependency for building the docs.
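For illustration, the Sphinx side of this could look roughly like the `conf.py` fragment below. It is a sketch under assumptions, not the project's actual configuration: the existing extension list is a placeholder, and the Binder URL assumes the notebook lives under `docs/` in the zarr-developers/zarr-python repository.

```python
# docs/conf.py (fragment): a sketch, not the project's real configuration.

# Keep whatever extensions the docs already use (placeholder entry here)
# and add nbsphinx so that .ipynb files are treated as source documents.
extensions = [
    "sphinx.ext.autodoc",  # placeholder for the existing extensions
    "nbsphinx",
]

# Re-run every notebook on each docs build instead of trusting stored output.
nbsphinx_execute = "always"

# Fail the build when a cell raises, rather than rendering the traceback.
nbsphinx_allow_errors = False

# Binder badge rendered above each notebook. The repository path and the
# "docs/" prefix are assumptions made for this example.
nbsphinx_prolog = """
.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/zarr-developers/zarr-python/HEAD?filepath=docs/{{ env.docname }}.ipynb
"""
```

With `nbsphinx_execute = "always"`, building the docs needs a working Jupyter kernel, which is the extra dependency mentioned above.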

Happy to open a PR for this, just wanted to see if there was any interest first.

Nov 15 '19 18:11 jrbourbeau

For reference, here's an example of a rendered Jupyter notebook in a sphinx docs page with a binder link https://examples.dask.org/dataframe.html

Nov 15 '19 18:11 jrbourbeau

FWIW I think this is a cool idea. Would it be possible to run the notebook as part of CI so we catch any errors or inconsistencies between docs and code? I'm keen that all docs with code examples are run as doctests or equivalent wherever possible.

Nov 15 '19 20:11 alimanfoo

The notebook would be run as part of the documentation build process, so the tutorial should automatically remain up to date with the latest code changes. That said, adding a docs build to the CI (xref #369) would be a nice complementary addition
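To make concrete what "running the notebook" checks, here is a minimal standard-library sketch that executes the code cells of a parsed .ipynb file in order and fails on the first error. It is not how nbsphinx executes notebooks (that goes through a real Jupyter kernel), it does not understand IPython magics, and the function names are made up for this example.

```python
import json


def run_notebook_cells(notebook: dict) -> dict:
    """Execute every code cell of a parsed .ipynb document in one shared namespace."""
    namespace = {"__name__": "__main__"}
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # nbformat v4 stores the source either as a string or as a list of lines.
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        try:
            exec(compile(source, f"<cell {index}>", "exec"), namespace)
        except Exception as error:
            # Syntax errors and runtime errors both end up here.
            raise RuntimeError(f"notebook cell {index} failed: {error!r}") from error
    return namespace


def run_notebook_file(path: str) -> dict:
    """Load an .ipynb file from disk and execute its code cells."""
    with open(path, encoding="utf-8") as handle:
        return run_notebook_cells(json.load(handle))
```

A CI job could call something like `run_notebook_file` on the tutorial notebook (the path would be up to the PR) and let the raised exception fail the job.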

Nov 15 '19 20:11 jrbourbeau

Adding a docs build to the CI would be very nice, there have been times when the RTFD build was broken and we didn't notice for a while.

Nov 15 '19 21:11 alimanfoo

Is this something you all are still interested in implementing? I'd be happy to do it

Mar 31 '20 21:03 andrewfulton9

Yeah, that would be much appreciated : ) Thanks @andrewfulton9, feel free to ping me if you have any questions

Mar 31 '20 21:03 jrbourbeau

IDK if it fits our use case, but it might be worth looking at sphinx-gallery.

Mar 31 '20 23:03 jakirkham

I have a quick question about the examples in the current tutorial that are skipped by the doctests. It looks like most of them are skipped because they access cloud storage such as AWS S3 or Azure Blob Storage. Should I make those runnable in the notebook, or skip them as the doctests do?

Apr 18 '20 19:04 andrewfulton9

Good point, thanks for bringing that up @andrewfulton9. We should make sure to include those examples in the notebook, but skip their actual execution. If there's a way we could specify certain code cells shouldn't be executed, that would be ideal. Otherwise, including them as code blocks in a markdown cell would also work. Other suggestions are welcome as always
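As a sketch of the markdown fallback, the snippet below rewrites a parsed .ipynb so that code cells carrying a marker tag become markdown cells containing a fenced code block. The tag name `skip-execution` and the function name are assumptions chosen for this example, not an existing convention in the zarr docs.

```python
import copy

SKIP_TAG = "skip-execution"  # hypothetical tag name chosen for this sketch


def demote_skipped_cells(notebook: dict, tag: str = SKIP_TAG) -> dict:
    """Return a copy of a parsed .ipynb where tagged code cells become markdown."""
    result = copy.deepcopy(notebook)
    fence = "`" * 3  # built at runtime to avoid a literal fence inside this example
    for cell in result.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") != "code" or tag not in tags:
            continue
        # nbformat v4 stores the source either as a string or as a list of lines.
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        cell["cell_type"] = "markdown"
        cell["source"] = f"{fence}python\n{source.rstrip()}\n{fence}"
        # Markdown cells must not carry code-only fields in nbformat v4.
        cell.pop("outputs", None)
        cell.pop("execution_count", None)
    return result
```

The cloud-storage examples would then still be visible in the rendered tutorial, but nothing would try to reach S3 or Azure during the docs build.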

Apr 21 '20 23:04 jrbourbeau

Sounds good. It might be best to put those cells into Markdown, so that they don't cause confusion if we make the notebooks executable in Binder. I'll keep exploring options though.

Apr 23 '20 03:04 andrewfulton9

If you need/want to go halfway, you can use the ipython directive, which executes the code and embeds the results during the build while still keeping RST.

It cannot generate an .ipynb though, as RST has way more features than Markdown, but it would at least ensure that the docs are up to date.

jupyter-sphinx is similar but also supports widgets, and should likely be merged with the ipython directive at some point.

May 06 '20 23:05 Carreau

Hi @joshmoore, I am from Outreachy. Is this documentation task still available? Can I take it?

Oct 18 '21 07:10 olusanwo

@olusanwo, sure! I'm sure everyone would look forward to a suggestion for this.

Oct 18 '21 09:10 joshmoore

Hi @joshmoore Sir, I moved Tutorial.rst to an .ipynb file. Please check it out:

https://colab.research.google.com/drive/1qqVY0KxVvEFyifPgWpbyG2P5RdQj9_TZ?usp=sharing

Here is a list of what I did:

- Used Google Colab rather than JupyterLab to build this notebook, as it eliminates the need to add Jupyter as a dependency.
- Capitalized the first letter of each word in the headings, except for articles and prepositions.
- Added output for the Blosc defaults so the user can see them when switching the default compressor.
- Some tutorials are not interactive because of external dependencies (for example, some tutorials under Distributed/Cloud Storage are not interactive, as they require the user to enter their Azure account name and account key).

I also noticed that under Changing chunk shapes (rechunking) in Tutorial.rst, the 2nd code block has an apostrophe error (a missing closing quote):

a = zarr.zeros((10000,10000), chunks=(10000, 1), dtype='uint16, store='a.zarr')

This should be:

a = zarr.zeros((10000,10000), chunks=(10000, 1), dtype='uint16', store='a.zarr')

Some problems I faced:

- zarr.consolidate_metadata(store) shows the error 'Access Denied.'
- root = zarr.open_consolidated(store) shows 'KeyError: .zmetadata'
- I couldn't find proper support for ipytree in Google Colab.

Please let me know if there are any changes I should make. Thank you!

Mar 24 '22 06:03 sudoyolo

Repeating here from a recent chat with @sudoyolo: looking forward to seeing the notebook as a PR. It will need some additional work to integrate it into the readthedocs output. :+1:

Mar 30 '22 13:03 joshmoore

In order to integrate the notebook into the docs, I highly recommend myst-nb. We use it on all our documentation sites. It supports all of the fancy sphinx syntax in markdown via myst.
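A minimal `conf.py` fragment for that route might look like the sketch below. Treat the option name as an assumption to check against the installed myst-nb version: recent releases use `nb_execution_mode`, while older ones used a differently named setting.

```python
# docs/conf.py (fragment): a sketch, not the project's real configuration.

extensions = [
    # myst_nb also enables MyST Markdown parsing, so myst_parser
    # should not be listed separately alongside it.
    "myst_nb",
]

# "force" re-executes every notebook at build time;
# "off" would render the outputs stored in the .ipynb files instead.
nb_execution_mode = "force"
```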

Mar 30 '22 14:03 rabernat

Hi @joshmoore, I think this issue has been resolved in #996. You may as well close the issue.

Oct 10 '22 01:10 GbotemiB

That PR unfortunately appears to be closed (as opposed to merged). Contributions here would still be welcome 🙂

Oct 10 '22 06:10 jakirkham

Hi @joshmoore and @MSanKeys963, I am an Outreachy applicant and I would like to work on this issue.

Oct 21 '22 05:10 steph237

Hi Stephanie, thanks for offering to help! 🙏

I think Emmanuel already picked this up with a PR ( https://github.com/zarr-developers/zarr-python/pull/1163 ), but please feel free to grab another issue.

Thanks again! 🙂

Oct 21 '22 06:10 jakirkham