generate datatree methods
- [x] Closes #10015
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`
Add a script that generates a mixin so that `Dataset` methods are available on `DataTree`. It uses `inspect.signature` to regenerate each call signature, and a decorator that forwards `*args, **kwargs` so the method bodies do not need to be populated. This makes the generation relatively trivial (although maybe not trivial to understand).
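To make the idea concrete, here is a minimal sketch of the approach described above. It is not the script from this PR: `ToyDataset`, `ToyTree`, `_wrap_dataset_method`, `_map_over_nodes` and `generate_mixin_source` are hypothetical names, and the toy classes only stand in for `Dataset` and `DataTree`.

```python
import functools
import inspect


class ToyDataset:
    """Hypothetical stand-in for xarray.Dataset."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def rename(self, name_dict=None, **names):
        """Return a new ToyDataset with renamed variables."""
        mapping = {**(name_dict or {}), **names}
        return ToyDataset({mapping.get(k, k): v for k, v in self.variables.items()})


def _wrap_dataset_method(func):
    """Ignore the empty body of ``func`` and call the same-named method on every node."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        name = func.__name__
        return self._map_over_nodes(lambda ds: getattr(ds, name)(*args, **kwargs))

    return wrapper


def generate_mixin_source(cls, names):
    """Build the source of a mixin whose stubs copy the signatures found on ``cls``."""
    lines = ["class GeneratedMixin:\n"]
    for name in names:
        sig = inspect.signature(getattr(cls, name))
        lines.append(f"    @_wrap_dataset_method\n    def {name}{sig}:\n        ...\n")
    return "".join(lines)


# Generate the mixin source and execute it (a real script would write it to a file).
source = generate_mixin_source(ToyDataset, ["rename"])
namespace = {"_wrap_dataset_method": _wrap_dataset_method}
exec(source, namespace)
GeneratedMixin = namespace["GeneratedMixin"]


class ToyTree(GeneratedMixin):
    """Hypothetical stand-in for DataTree: a flat mapping of path -> ToyDataset."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)

    def _map_over_nodes(self, func):
        return ToyTree({path: func(ds) for path, ds in self.nodes.items()})


tree = ToyTree({"/": ToyDataset({"a": 1}), "/child": ToyDataset({"a": 2, "b": 3})})
renamed = tree.rename(a="x")
```

Printing a signature back into source only works this cleanly for simple defaults. Signatures with annotations or sentinel defaults would need imports or manual cleanup, which may be one reason a generated file needs fixing before use.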
This is much clunkier than `generate_ops` or `generate_aggregations`, because we cannot profit from common signatures. Thus
- the docstring is not adapted
- the examples are not adapted
- the generated file needs to be fixed and formatted with ruff before use
However, it's a fraction of the work of doing this properly. I am really not sure if this is a good idea - feel free to tell me it's not!
The alternative is to inject everything (as in https://github.com/xarray-contrib/datatree/blob/5f3956ffe80e686dd3df54ee8cef9ff56c158e76/datatree/ops.py#L223). (Or to write all methods out, or create mixin classes that work for all data types...)
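For comparison, the injection alternative could look roughly like the sketch below. This is not copied from the linked file: `ToyDataset`, `ToyTree`, `_map_over_nodes` and `inject_methods` are hypothetical names used only for illustration.

```python
import functools


class ToyDataset:
    """Hypothetical stand-in for xarray.Dataset."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def drop_vars(self, names):
        """Return a new ToyDataset without the given variables."""
        return ToyDataset({k: v for k, v in self.variables.items() if k not in names})


class ToyTree:
    """Hypothetical stand-in for DataTree: a flat mapping of path -> ToyDataset."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)


def _map_over_nodes(method):
    """Wrap a dataset method so it is applied to every node of a tree."""

    @functools.wraps(method)
    def mapped(self, *args, **kwargs):
        return type(self)(
            {path: method(ds, *args, **kwargs) for path, ds in self.nodes.items()}
        )

    return mapped


def inject_methods(target_cls, source_cls, names):
    """Attach wrapped copies of the named methods to ``target_cls`` at import time."""
    for name in names:
        setattr(target_cls, name, _map_over_nodes(getattr(source_cls, name)))


inject_methods(ToyTree, ToyDataset, ["drop_vars"])

tree = ToyTree({"/": ToyDataset({"a": 1, "b": 2})})
out = tree.drop_vars(["a"])
```

Injection needs no generated file, but methods attached with `setattr` are not visible to static type checkers, whereas a generated mixin is.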
@TomNicholas do you think this has a chance of being considered? If not, I am happy to close. It's obviously only a fraction of the missing methods - I can add the rest if this is considered.
One alternative is to generate the file once and then manually adapt the docstrings. That would be a bit less work than doing everything by hand. (It's quite annoying to always write `xr.map_over_datasets(lambda ds: ds.rename(...), dt)` instead of `dt.rename(...)`, etc. for many of the dataset manipulations.)
Hi @mathause, sorry for the super slow reply here.
This is actually basically how old datatree used to work. But when we moved everything upstream into xarray, we made an effort to do it the "proper" way, by adding to generate_ops.py/generate_aggregations.py. The logic being that Dataset isn't really special once inside xarray - all similar methods should be generated from a common template, of which Dataset is just one realization.
Ok thanks - happy to close the PR in this case (although I think these are the methods that are currently not generated in `generate_ops.py` / `generate_aggregations.py`, probably because they cannot be generalized, i.e. they contain method-specific logic).