Documentation plans

Open TomNicholas opened this issue 3 years ago • 2 comments

Datatree needs some documentation, even if it has to change in future.

I think most of the documentation would remain relevant even after some changes, as long as we keep the same basic data model (e.g. DataTree vs DataGroups, with no hierarchy).

I really like this breakdown for documentation, which theorizes that there are 4 types of documentation, along two axes, as shown in this diagram:

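[Image: diagram of the four types of documentation (tutorials, how-to guides, explanation, reference) arranged along two axes]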

Another thing to consider is how the documentation we write now might eventually be incorporated into xarray's documentation upstream. We don't need to duplicate anything, and we want things we write to neatly slot into sections in xarray's existing documentation.

Some ideas:

Tutorials

How-to Guides

(Some of these could possibly go in xarray's Gallery section)

  • [ ] How to define a function which maps an operation over a whole tree.
  • [ ] How to work with multi-resolution data.
  • [ ] How to convert unusual file structures to DataTrees and vice versa. This is where we could discuss tricky gotchas with Zarr files that can't be immediately represented as trees etc.
  • [ ] "How do I" but for various tree manipulation operations. Might need to split up xarray's "how do I" page for clarity. (This one might want to wait for the API to be more solidified)
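As a rough illustration of what the first how-to might cover, here is a minimal sketch of mapping a function over every node of a tree. It deliberately uses a hypothetical `Node` dataclass and a hypothetical `map_over_tree` helper rather than the real DataTree API, so it only shows the general idea: apply the function node by node and rebuild a tree with the same structure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a tree node; this is NOT the DataTree class."""

    values: List[float] = field(default_factory=list)
    children: Dict[str, "Node"] = field(default_factory=dict)


def map_over_tree(func: Callable[[List[float]], List[float]], node: Node) -> Node:
    """Return a new tree with the same structure, applying func to each node's data."""
    return Node(
        values=func(node.values),
        children={
            name: map_over_tree(func, child)
            for name, child in node.children.items()
        },
    )


# A small tree: a root with two children, one of which has its own child.
tree = Node(
    [1.0, 2.0],
    {
        "coarse": Node([10.0]),
        "fine": Node([3.0, 4.0], {"finest": Node([5.0])}),
    },
)

# Double every value in every node; the original tree is left untouched.
doubled = map_over_tree(lambda vals: [2 * v for v in vals], tree)
```

Returning a new tree instead of mutating in place keeps the operation easy to reason about, though the real how-to would need to say what happens for nodes that hold no data.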

Explanation

(A lot of this could be grouped under one page on "Working with hierarchical data".)

  • [x] The data model of DataTree (to go in xarray's Data Structures page) https://github.com/xarray-contrib/datatree/pull/103
  • [x] Reading and writing files to and from DataTrees, and how the datatree model compares to various file formats (could go under the "Groups" section of xarray's page on reading and writing data) https://github.com/xarray-contrib/datatree/pull/158.
  • [x] Organising a "family tree" (i.e. how to create and manipulate a tree structure from scratch node-by-node, with no data in it). #179
  • [ ] Tree manipulation - perhaps showing how to calculate things like "depth" and "breadth" #180
  • [ ] Mapping behaviour of a tree
  • [x] File-like access to nodes explained in detail (absolute/relative paths etc.) #179
  • [x] Terminology used for the tree (add to xarray's page) https://github.com/xarray-contrib/datatree/pull/174
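To make the "depth", "breadth" and file-like access items concrete, here is a small sketch on a tree with no data in it, represented as plain nested dicts. Nothing here is the DataTree API: the helper names are hypothetical, and the definitions are assumptions chosen for illustration (depth is the number of edges on the longest path from the root, breadth is the largest number of nodes on any single level). Only absolute paths are handled, since relative paths would need parent links.

```python
from typing import Dict

# Hypothetical representation: each key is a child's name, each value is that child's subtree.
Tree = Dict[str, "Tree"]


def depth(tree: Tree) -> int:
    """Number of edges on the longest path from the root down to a leaf."""
    if not tree:
        return 0
    return 1 + max(depth(child) for child in tree.values())


def breadth(tree: Tree) -> int:
    """Largest number of nodes found on any single level (the root is level 0)."""
    level = [tree]
    widest = 0
    while level:
        widest = max(widest, len(level))
        # Move down one level by collecting every child of every node on this level.
        level = [child for node in level for child in node.values()]
    return widest


def get_node(tree: Tree, path: str) -> Tree:
    """Look up a node by an absolute, slash-separated path such as '/a/a2'."""
    node = tree
    for part in path.strip("/").split("/"):
        if part:  # skip the empty part produced by the path "/"
            node = node[part]
    return node


family: Tree = {
    "a": {"a1": {}, "a2": {"a2x": {}}},
    "b": {"b1": {}},
}

tree_depth = depth(family)          # longest path is root -> a -> a2 -> a2x
tree_breadth = breadth(family)      # widest level holds a1, a2 and b1
subtree = get_node(family, "/a/a2")
```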

Reference

This should be pretty much covered by ensuring that the auto-generated API docs work properly. The hard bit will be copying / duplicating the large API of xarray.Dataset that DataTree inherits.

TomNicholas, Feb 17 '22 01:02

Just took a look at the docs. They are looking great and much more complete than I expected. Some comments on the current docs site:

  • [x] The index page would be better used as a light intro to the package, rather than a duplicate listing of the ToC. I would suggest adding a bit of context about the package and typical use cases. And I would mention the plan to migrate the package to the core xarray project. (Done in #182)
  • [x] Can we make intersphinx work? It would be nice if references to Xarray objects like Dataset and DataArray included a link to the Xarray docs. Also, it would be great if references to DataTree objects linked to the API Reference. (#183)
  • [ ] The Quick overview page is great. Depending on what you do with the Tutorial page, this could stay as is or expand slightly to cover a few more operations: dt.max(dim=...), dt.sel(time=...), and .to_netcdf(...) would be good examples to consider.
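On the intersphinx point above, a Sphinx `conf.py` fragment along the following lines is the usual starting point. This is a sketch rather than the project's actual configuration: the extension list and both URLs are assumptions, and it does not by itself address whatever is blocking #183.

```python
# Fragment of a Sphinx conf.py (illustrative; not the project's real configuration).

# Enable the extension that resolves cross-references into other projects' docs.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.intersphinx",
]

# Map a short project name to the base URL of its documentation.
# The second tuple element is None so that Sphinx fetches objects.inv from that URL.
# Both URLs below are assumptions about where the docs are hosted.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3/", None),
    "xarray": ("https://docs.xarray.dev/en/stable/", None),
}
```

With a mapping like this in place, a role that names an xarray object by its full path can resolve to a link into the xarray docs, provided the target exists in that project's inventory.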

Small nits:

  • [x] :py:class:DataTree in https://xarray-datatree.readthedocs.io/en/latest/data-structures.html#datatree
  • [x] :py:func::DataTree.from_dict and :py:func::~datatree.open_datatree in https://xarray-datatree.readthedocs.io/en/latest/data-structures.html#creating-a-datatree
  • [ ] # TODO update this example using .coords and .data_vars as setters in https://xarray-datatree.readthedocs.io/en/latest/data-structures.html#dictionary-like-methods
  • [x] :py:func::copy in https://xarray-datatree.readthedocs.io/en/latest/data-structures.html#dictionary-like-methods
  • [x] A bunch of py:class::... in https://xarray-datatree.readthedocs.io/en/latest/io.html
  • [x] Odd formatting of DataTree.set_close in https://xarray-datatree.readthedocs.io/en/latest/api.html
  • [x] A bunch of :pull:... in https://xarray-datatree.readthedocs.io/en/latest/whats-new.html

jhamman, Jan 04 '23 22:01

Thanks @jhamman for the feedback! I've fixed most of the quick things you mentioned, except for the intersphinx / code API links, which I'm a bit stuck on (see #183).

For the tutorial page I was imagining creating some kind of example tutorial dataset and doing a mock analysis of that (see #142, which I've neglected).

TomNicholas, Jan 05 '23 19:01