Indexing tree should create new tree
Inspired by this example in the stackstac documentation:
lowcloud = stack[stack["eo:cloud_cover"] < 20]
we should ensure that you can index a datatree with another (isomorphic) datatree, so that the above operation would work even if stack is a DataTree instance.
This is another map_over_subtree-type operation, but it needs careful testing because the __getitem__ function in xarray objects already does so many different things. This won't work with the code as-is because at the moment the DataTree naively dispatches the __getitem__ call down to the wrapped dataset.
https://github.com/xarray-contrib/datatree/blob/cd0695160e261466efc7f51fece02ca9bea2101c/datatree/datatree.py#L238
To clarify, in order for this to work several things need to happen:
- stack["eo:cloud_cover"] needs to realise that "eo:cloud_cover" is neither a tree nor a group in the tree, but a variable name. Then it needs to select the "eo:cloud_cover" variable from all nodes in the subtree, and return a tree containing only those variables. That in itself requires something like #67 but ignoring nodes for which that variable is not present, at least for deeply-nested trees...
- stack["eo:cloud_cover"] < 20 needs to perform this comparison node-wise, returning a tree of results (hopefully this should already work...)
- stack[stack["eo:cloud_cover"] < 20] needs to use the tree passed to perform a node-wise indexing operation, returning a new tree. (Or we could just use .where)
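To make the three steps above concrete, here is a rough pure-Python sketch. It is not datatree or xarray code: ToyTree, its nodes attribute, and the _index_with_tree helper are hypothetical names, and a "dataset" is modelled as a plain dict of lists. It also assumes that nodes without the requested variable are simply dropped, and that each mask node holds exactly one boolean variable.

```python
class ToyTree:
    """Toy stand-in for a tree of datasets (hypothetical, not the DataTree API).

    `nodes` maps a node path to a dict of {variable name: list of values}.
    """

    def __init__(self, nodes):
        self.nodes = nodes

    def __getitem__(self, key):
        if isinstance(key, ToyTree):
            # Step 3: the key is a tree of masks, so index node-wise.
            return self._index_with_tree(key)
        if key in self.nodes:
            # The key names a group: return the subtree rooted at that path.
            return ToyTree({
                path: node for path, node in self.nodes.items()
                if path == key or path.startswith(key + "/")
            })
        # Step 1: otherwise treat the key as a variable name and select it
        # from every node, skipping nodes where it is not present.
        selected = {
            path: {key: node[key]}
            for path, node in self.nodes.items() if key in node
        }
        if not selected:
            raise KeyError(key)
        return ToyTree(selected)

    def __lt__(self, other):
        # Step 2: node-wise comparison, returning a tree of boolean masks.
        return ToyTree({
            path: {name: [v < other for v in values]
                   for name, values in node.items()}
            for path, node in self.nodes.items()
        })

    def _index_with_tree(self, mask_tree):
        result = {}
        for path, node in self.nodes.items():
            if path not in mask_tree.nodes:
                continue  # assumption: nodes without a mask are dropped
            # Assumption: each mask node holds exactly one boolean variable.
            (mask,) = mask_tree.nodes[path].values()
            result[path] = {
                name: [v for v, keep in zip(values, mask) if keep]
                for name, values in node.items()
            }
        return ToyTree(result)  # a new tree; the original is left untouched


stack = ToyTree({
    "/a": {"eo:cloud_cover": [5, 40, 12], "red": [0.1, 0.2, 0.3]},
    "/b": {"eo:cloud_cover": [90, 10], "red": [0.4, 0.5]},
    "/c": {"other": [1, 2]},  # no cloud cover variable in this node
})

lowcloud = stack[stack["eo:cloud_cover"] < 20]
```

Even in this toy version, __getitem__ has three separate branches (tree key, group name, variable name), and the one-line example goes through two of them plus the comparison.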
Basically this is a really complicated usage example because it uses multiple different code-paths within __getitem__ sequentially within one line of user code :sweat_smile: