Tom Nicholas
> Is that something we would want to do for `open_groups`?

I think guessing the engine would be useful, but remembering that if the implementation of `open_groups` relies on a...
Using icechunk to store intermediate data might be helpful for resuming computations - each completed stage of the plan would write a new commit to one icechunk store that holds...
I really like the idea of committing after every n tasks!
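A minimal sketch of what that could look like, assuming roughly the current icechunk Python API (`Repository.create` / `writable_session` / `commit`) and standing in for Cubed's plan execution with a plain loop; the stage names and the write step are placeholders, not Cubed internals:

```python
import icechunk
import zarr

# Sketch: commit intermediate results after each completed plan stage,
# so a failed run can resume from the last commit instead of starting over.
storage = icechunk.local_filesystem_storage("/tmp/intermediate-repo")
repo = icechunk.Repository.create(storage)

stages = ["rechunk-1", "blockwise-2", "reduction-3"]  # placeholder stage names

for stage_name in stages:
    session = repo.writable_session("main")
    group = zarr.open_group(session.store, mode="a")
    # ... run the tasks for this stage, writing their outputs into `group` ...
    session.commit(f"completed stage {stage_name}")
```

Committing after every n tasks would be the same pattern, just triggered by a task counter rather than by stage boundaries.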
:+1: I think any time we can move "automatic" inference of grid information out of xGCM we should. It should live in packages that explicitly support certain conventions instead.
See https://github.com/pydata/xarray/issues/8965 and #10326 for a more current discussion of this same idea. #10327 would be one way to close this issue.
You'll probably find [this xarray issue](https://github.com/pydata/xarray/issues/4285#issuecomment-1200110315) interesting, because the same restrictions discussed there for xarray will also apply to cubed (i.e. it needs to know the shape and dtype). Essentially Awkward...
> in particular that computed arrays have known shapes before the computation is run, which might be a challenge for awkward array!

The issue I linked is about how awkward...
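To make the "known shape and dtype" requirement concrete, here is a small sketch assuming Cubed's `Spec`, its array API namespace, and `map_blocks`; it just illustrates that shape, chunks, and output dtype are all fixed when the graph is built, before any task runs:

```python
import cubed
import cubed.array_api as xp

spec = cubed.Spec(allowed_mem="200MB")  # memory budget used when planning

# Shape, chunks, and dtype are all declared at graph-construction time.
a = xp.ones((1000, 1000), chunks=(100, 100), spec=spec)

# map_blocks needs the output dtype up front; a ragged result whose shape
# is only known after computing would not fit this model.
b = cubed.map_blocks(lambda block: block * 2, a, dtype=a.dtype)

result = b.compute()
```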
A "fuzzy memory bound" is also related to the issue I just raised: https://github.com/cubed-dev/cubed/issues/749
I'm still struggling to imagine what the API for using this idea in Cubed would look like. Icechunk has the concept of a [`ChangeSet`](https://github.com/earth-mover/icechunk/blob/9e63119748366bf6b55b2f9bdfbd0125713a981f/icechunk/src/change_set.rs#L21), which contains more than enough information...
I think you're right @tomwhite. The `ChangeSet` tells you the chunks that changed, from which you can derive the region that changed. Note there could be multiple input datasets in...
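A quick sketch of the "changed chunks → changed region" derivation, using made-up names (the changed-chunk indices would come from icechunk's `ChangeSet`; nothing below touches its actual API):

```python
from typing import Iterable


def changed_region(
    changed_chunks: Iterable[tuple[int, ...]],
    chunk_shape: tuple[int, ...],
) -> tuple[slice, ...]:
    """Bounding-box region (as slices) covering a set of changed chunk indices.

    `changed_chunks` is assumed to be something like {(0, 1), (0, 2)}, extracted
    from a ChangeSet; `chunk_shape` is the array's chunk shape.
    """
    chunks = list(changed_chunks)
    region = []
    for dim, size in enumerate(chunk_shape):
        lo = min(c[dim] for c in chunks)
        hi = max(c[dim] for c in chunks)
        region.append(slice(lo * size, (hi + 1) * size))
    return tuple(region)


# e.g. chunks (0, 1) and (0, 2) of an array with (100, 100) chunks
print(changed_region([(0, 1), (0, 2)], (100, 100)))
# -> (slice(0, 100, None), slice(100, 300, None))
```

With multiple input datasets, you would presumably need one such region per changed store/array rather than a single region overall.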