xarray icon indicating copy to clipboard operation
xarray copied to clipboard

DAS-2155 - Merge datatree documentation into main docs.

Open owenlittlejohns opened this issue 1 year ago • 11 comments

This PR migrates documentation for datatree from xarray/datatree_/docs into the main doc directory.

It seems a little iffy that a bunch of the references to the DataTree class, associated methods and other migrated functions all need to point to lower level locations (such as xarray.core.datatree.DataTree. When DataTree and friends are added to the public xarray API these references can be significantly simplified. This makes me wonder if we should couple the documentation change with that ability to access these classes from the main public API? (And if so, whether this PR is the time to do it?)

  • [x] Completes documentation migration step of #8572
  • ~~Tests added~~
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst

owenlittlejohns avatar May 18 '24 02:05 owenlittlejohns

I would be fine with adding DataTree and open_datatree to the top-level module in this PR: adding documentation for DataTree to me means that we're publicly exposing it.

The only alternative I can think of would be to do this in a separate PR and merge that, then sync this PR with main.

keewis avatar May 19 '24 11:05 keewis

I agree with @keewis

The only alternative I can think of would be to do this in a separate PR and merge that, then sync this PR with main.

This might a little neater, but the result is the same either way.

TomNicholas avatar May 20 '24 16:05 TomNicholas

@TomNicholas - So the alternative here is exposing things in the public API, then updating this PR to have the correct references and merging this afterwards? (Just checking I read your last comment correctly)

Maybe I'm being a bit keen, but I'm tempted to keep both pieces (adding to the main API and including DataTree in the documentation) in the same PR to keep the documentation and public API in-sync.

owenlittlejohns avatar May 21 '24 00:05 owenlittlejohns

I have added open_datatree, DataTree, register_datatree_accessor, map_over_subtree and assert_isomorphic to the public API. The DataTree public API has some other exceptions in it (TreeIsomorphismError, InvalidTreeError and NotFoundInTreeError), but I was going by the things listed in #8572.

owenlittlejohns avatar May 21 '24 04:05 owenlittlejohns

(TreeIsomorphismError, InvalidTreeError and NotFoundInTreeError)

These should probably be added to the api.rst page after MergeError.

TomNicholas avatar May 28 '24 14:05 TomNicholas

@TomNicholas - to clarify your previous comment about those exceptions: would you like those 3 exceptions added to the public xarray API (TreeIsomorphismError, InvalidTreeError, NotFoundInTreeError)? Or would you just want them added to api.rst with their full current paths (pointing to e.g., xarray.core.treenode.InvalidTreeError).

I can definitely add them to the public API, but hadn't as those exceptions were not listed in #8572.

owenlittlejohns avatar May 28 '24 19:05 owenlittlejohns

@owenlittlejohns however it's already done for xarray's MergeError (which I happen to remember is listed in the public api.rst page). How exactly we do this isn't particularly important (hence why I forgot to include it in #8572), we just want to be consistent with other error types that we already have exposed publicly.

TomNicholas avatar May 28 '24 19:05 TomNicholas

Thanks for the guidance. I just added those three exceptions to the public API, so they are now consistent with MergeError.

owenlittlejohns avatar May 28 '24 21:05 owenlittlejohns

I approved the changes here, but I don't know how to mark this as waiting for an event. (numpy 2)

flamingbear avatar Jun 06 '24 18:06 flamingbear

This is mergable, but unfortunately contingent on the outcome of https://github.com/pydata/xarray/issues/9077. Perhaps we should think about exactly how much of the docs wording would need to be changed if #9077 were accepted...

TomNicholas avatar Jun 11 '24 21:06 TomNicholas

Trying to track outstanding items:

  • [x] Once #9158 is merged, this PR could also update the imports for open_datatree and DataTree. (complete: b303d6255)
  • [x] Review documentation following #9063 merge to ensure coordinate inheritance is described.
  • [x] Consolidate documentation terminology terms.
  • [x] Add Documentation for open_groups #9243

owenlittlejohns avatar Jun 26 '24 15:06 owenlittlejohns

Adding another list of things to make sure get completed.

  • [ ] ~~remove DataTree.open_mfdatatree references. Seems vestigial from old datatree repo~~
  • [x] ~~See if I need to raise map_over_subtree and open_datatree to top level. (getting warning locally, but not on RTD~~ I don't think I do. my build error went away

Now that I've reviewed more of the docs, should I actually leave in the future development of open_mfdatatree? Pinging @TomNicholas since you upthumbed this before. I'm leaning towards leaving them for now.

flamingbear avatar Jul 24 '24 19:07 flamingbear

  • [x] The errors with DataTree.assign and DataTree.update need to be resolved/understood before I can update the documentation for Abe's Brother Herbert in hierarchical-data.rst #9285
  • [x] Finish the rest of hierarchical-data.rst reviewing for inherited coords refs.

flamingbear avatar Jul 26 '24 21:07 flamingbear

The diff for commits I have added updating the PR for inheritance: https://github.com/pydata/xarray/pull/9033/files/b303d6255d762f0a82188ff6446b25a7bc82aadb..ec341ad8b9ea4c362b8c7ae0f9b7ca9aa6782bf9

I'm going to put this down until the output stabilizes and I can go look through the examples again. edit and here is the current docs for this PR: https://xray--9033.org.readthedocs.build/en/9033/

flamingbear avatar Aug 02 '24 19:08 flamingbear

I wonder if it would make more sense to make this branch a new branch on the main xarray repo rather than on a fork (but keep merging in main to keep datatree_docs up to date). That way we could submit PRs purely about improving documentation for datatree but keep it all unreleased until we're ready to hit the big red launch button.

TomNicholas avatar Sep 09 '24 15:09 TomNicholas

make this branch a new branch on the main xarray repo rather than on a fork

We decided to push forward on this and try to get it all reviewed/working and let later doc PRs happen on main when this is merged and releases datatree

flamingbear avatar Sep 10 '24 19:09 flamingbear

Crud. I was ready to merge this. And I'm hoping we still can and then update the next problem I'm about to show. I think both of these should have the same behavior.

bart = xr.DataTree(name="Bart")
lisa = xr.DataTree(name="Lisa")
homer = xr.DataTree(name="Homer", children={"Bart": bart, "Lisa": lisa})
print(list(lisa.siblings))

homer2 = xr.DataTree(name="Homer")
homer2.children = {"Bart": bart, "Lisa": lisa}
print(list(lisa.siblings))
[]
['Bart']

@TomNicholas

flamingbear avatar Sep 12 '24 14:09 flamingbear

So with the exception of the behavior in the last cell. I think the documentation is relatively solid. I have gone over most of the documetation for Datatree areas and scanned for broken examples. I don't think there are any remaining, but keep your eyes out. There may be some inelegant or unnecessary examples. The coordinate inheritance can be 100% removed and replaced if it's terrible.

Hilights are

Quick Overview: https://xray--9033.org.readthedocs.build/en/9033/getting-started-guide/quick-overview.html#datatrees

Data Structures: https://xray--9033.org.readthedocs.build/en/9033/user-guide/data-structures.html#datatree

Hierarcical Data https://xray--9033.org.readthedocs.build/en/9033/user-guide/hierarchical-data.html

@TomNicholas @owenlittlejohns @eni-awowale @shoyer @keewis

Are we going to merge this with issues and then clean them up as they're on main? 🙏

flamingbear avatar Sep 12 '24 14:09 flamingbear

update the next problem I'm about to show

I'm pretty sure I know what's going on here (we currently copy inputs to the constructor but not to the .children setter) but if you could raise a separate issue for that we can proceed with this PR.

Are we going to merge this with issues and then clean them up as they're on main?

We should alter some of these examples slightly when a solution to #9475 is merged, and I have a few small comments, but I would like to merge this now and then fix those docs along with the code fixes, as long as @shoyer is okay with it!

TomNicholas avatar Sep 12 '24 15:09 TomNicholas

We should alter some of these examples slightly when a solution to https://github.com/pydata/xarray/issues/9475 is merged, and I have a few small comments, but I would like to merge this now and then fix those docs along with the code fixes, as long as @shoyer is okay with it!

To be crystal clear. We can merge this and update the docs with small tweaks when #9475 is updated?

Also to add the open_goups and other additions?

I'm 100% for this but only if others are ok with that.

flamingbear avatar Sep 12 '24 17:09 flamingbear

Last comments to address are being added here so when I merge this we can find them easily:

  • [ ] Where to mention open_groups https://github.com/pydata/xarray/pull/9033#discussion_r1757309976
  • [ ] example model dataset for #9437? https://github.com/pydata/xarray/pull/9033#discussion_r1757361605
  • [ ] need open_groups example https://github.com/pydata/xarray/pull/9033#discussion_r1757098697
  • [ ] Need example for aligned and unaligned data set https://github.com/pydata/xarray/pull/9033#discussion_r1757096371

This is ready but for these. Let's do them off of main and merge this.

flamingbear avatar Sep 12 '24 21:09 flamingbear

Amazing work @flamingbear ! Hopefully we can now return to a saner branch structure for the final push!

TomNicholas avatar Sep 12 '24 22:09 TomNicholas