xarray
xarray copied to clipboard
DAS-2155 - Merge datatree documentation into main docs.
This PR migrates documentation for datatree from xarray/datatree_/docs into the main doc directory.
It seems a little iffy that a bunch of the references to the DataTree class, associated methods and other migrated functions all need to point to lower level locations (such as xarray.core.datatree.DataTree. When DataTree and friends are added to the public xarray API these references can be significantly simplified. This makes me wonder if we should couple the documentation change with that ability to access these classes from the main public API? (And if so, whether this PR is the time to do it?)
- [x] Completes documentation migration step of #8572
- ~~Tests added~~
- [x] User visible changes (including notable bug fixes) are documented in
whats-new.rst - [x] New functions/methods are listed in
api.rst
I would be fine with adding DataTree and open_datatree to the top-level module in this PR: adding documentation for DataTree to me means that we're publicly exposing it.
The only alternative I can think of would be to do this in a separate PR and merge that, then sync this PR with main.
I agree with @keewis
The only alternative I can think of would be to do this in a separate PR and merge that, then sync this PR with main.
This might a little neater, but the result is the same either way.
@TomNicholas - So the alternative here is exposing things in the public API, then updating this PR to have the correct references and merging this afterwards? (Just checking I read your last comment correctly)
Maybe I'm being a bit keen, but I'm tempted to keep both pieces (adding to the main API and including DataTree in the documentation) in the same PR to keep the documentation and public API in-sync.
I have added open_datatree, DataTree, register_datatree_accessor, map_over_subtree and assert_isomorphic to the public API. The DataTree public API has some other exceptions in it (TreeIsomorphismError, InvalidTreeError and NotFoundInTreeError), but I was going by the things listed in #8572.
(
TreeIsomorphismError,InvalidTreeErrorandNotFoundInTreeError)
These should probably be added to the api.rst page after MergeError.
@TomNicholas - to clarify your previous comment about those exceptions: would you like those 3 exceptions added to the public xarray API (TreeIsomorphismError, InvalidTreeError, NotFoundInTreeError)? Or would you just want them added to api.rst with their full current paths (pointing to e.g., xarray.core.treenode.InvalidTreeError).
I can definitely add them to the public API, but hadn't as those exceptions were not listed in #8572.
@owenlittlejohns however it's already done for xarray's MergeError (which I happen to remember is listed in the public api.rst page). How exactly we do this isn't particularly important (hence why I forgot to include it in #8572), we just want to be consistent with other error types that we already have exposed publicly.
Thanks for the guidance. I just added those three exceptions to the public API, so they are now consistent with MergeError.
I approved the changes here, but I don't know how to mark this as waiting for an event. (numpy 2)
This is mergable, but unfortunately contingent on the outcome of https://github.com/pydata/xarray/issues/9077. Perhaps we should think about exactly how much of the docs wording would need to be changed if #9077 were accepted...
Trying to track outstanding items:
- [x] Once #9158 is merged, this PR could also update the imports for
open_datatreeandDataTree. (complete: b303d6255) - [x] Review documentation following #9063 merge to ensure coordinate inheritance is described.
- [x] Consolidate documentation terminology terms.
- [x] Add Documentation for open_groups #9243
Adding another list of things to make sure get completed.
- [ ] ~~remove DataTree.open_mfdatatree references. Seems vestigial from old datatree repo~~
- [x] ~~See if I need to raise map_over_subtree and open_datatree to top level. (getting warning locally, but not on RTD~~ I don't think I do. my build error went away
Now that I've reviewed more of the docs, should I actually leave in the future development of open_mfdatatree? Pinging @TomNicholas since you upthumbed this before. I'm leaning towards leaving them for now.
- [x] The errors with DataTree.assign and DataTree.update need to be resolved/understood before I can update the documentation for Abe's Brother Herbert in hierarchical-data.rst #9285
- [x] Finish the rest of hierarchical-data.rst reviewing for inherited coords refs.
The diff for commits I have added updating the PR for inheritance: https://github.com/pydata/xarray/pull/9033/files/b303d6255d762f0a82188ff6446b25a7bc82aadb..ec341ad8b9ea4c362b8c7ae0f9b7ca9aa6782bf9
I'm going to put this down until the output stabilizes and I can go look through the examples again. edit and here is the current docs for this PR: https://xray--9033.org.readthedocs.build/en/9033/
I wonder if it would make more sense to make this branch a new branch on the main xarray repo rather than on a fork (but keep merging in main to keep datatree_docs up to date). That way we could submit PRs purely about improving documentation for datatree but keep it all unreleased until we're ready to hit the big red launch button.
make this branch a new branch on the main xarray repo rather than on a fork
We decided to push forward on this and try to get it all reviewed/working and let later doc PRs happen on main when this is merged and releases datatree
Crud. I was ready to merge this. And I'm hoping we still can and then update the next problem I'm about to show. I think both of these should have the same behavior.
bart = xr.DataTree(name="Bart")
lisa = xr.DataTree(name="Lisa")
homer = xr.DataTree(name="Homer", children={"Bart": bart, "Lisa": lisa})
print(list(lisa.siblings))
homer2 = xr.DataTree(name="Homer")
homer2.children = {"Bart": bart, "Lisa": lisa}
print(list(lisa.siblings))
[]
['Bart']
@TomNicholas
So with the exception of the behavior in the last cell. I think the documentation is relatively solid. I have gone over most of the documetation for Datatree areas and scanned for broken examples. I don't think there are any remaining, but keep your eyes out. There may be some inelegant or unnecessary examples. The coordinate inheritance can be 100% removed and replaced if it's terrible.
Hilights are
Quick Overview: https://xray--9033.org.readthedocs.build/en/9033/getting-started-guide/quick-overview.html#datatrees
Data Structures: https://xray--9033.org.readthedocs.build/en/9033/user-guide/data-structures.html#datatree
Hierarcical Data https://xray--9033.org.readthedocs.build/en/9033/user-guide/hierarchical-data.html
@TomNicholas @owenlittlejohns @eni-awowale @shoyer @keewis
Are we going to merge this with issues and then clean them up as they're on main? 🙏
update the next problem I'm about to show
I'm pretty sure I know what's going on here (we currently copy inputs to the constructor but not to the .children setter) but if you could raise a separate issue for that we can proceed with this PR.
Are we going to merge this with issues and then clean them up as they're on main?
We should alter some of these examples slightly when a solution to #9475 is merged, and I have a few small comments, but I would like to merge this now and then fix those docs along with the code fixes, as long as @shoyer is okay with it!
We should alter some of these examples slightly when a solution to https://github.com/pydata/xarray/issues/9475 is merged, and I have a few small comments, but I would like to merge this now and then fix those docs along with the code fixes, as long as @shoyer is okay with it!
To be crystal clear. We can merge this and update the docs with small tweaks when #9475 is updated?
Also to add the open_goups and other additions?
I'm 100% for this but only if others are ok with that.
Last comments to address are being added here so when I merge this we can find them easily:
- [ ] Where to mention open_groups https://github.com/pydata/xarray/pull/9033#discussion_r1757309976
- [ ] example model dataset for #9437? https://github.com/pydata/xarray/pull/9033#discussion_r1757361605
- [ ] need open_groups example https://github.com/pydata/xarray/pull/9033#discussion_r1757098697
- [ ] Need example for aligned and unaligned data set https://github.com/pydata/xarray/pull/9033#discussion_r1757096371
This is ready but for these. Let's do them off of main and merge this.
Amazing work @flamingbear ! Hopefully we can now return to a saner branch structure for the final push!