Tom Nicholas
> Is that something we would want to do for `open_groups`?

I think guessing the engine would be useful, but remembering that if the implementation of `open_groups` relies on a...
Using icechunk to store intermediate data might be helpful for resuming computations - each completed stage of the plan would write a new commit to one icechunk store that holds...
I really like the idea of committing after every n tasks!
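A minimal sketch of what that could look like, assuming roughly the current icechunk Python API (`Repository.create` / `writable_session` / `commit`) and standing in for Cubed's plan execution with a plain loop; the stage names and the write step are placeholders, not Cubed internals:

```python
import icechunk
import zarr

# Sketch: commit intermediate results after each completed plan stage,
# so a failed run can resume from the last commit instead of starting over.
storage = icechunk.local_filesystem_storage("/tmp/intermediate-repo")
repo = icechunk.Repository.create(storage)

stages = ["rechunk-1", "blockwise-2", "reduction-3"]  # placeholder stage names

for stage_name in stages:
    session = repo.writable_session("main")
    group = zarr.open_group(session.store, mode="a")
    # ... run the tasks for this stage, writing their outputs into `group` ...
    session.commit(f"completed stage {stage_name}")
```

Committing after every n tasks would be the same pattern, just triggered by a task counter rather than by stage boundaries.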
:+1: I think any time we can move "automatic" inference of grid information out of xGCM we should. It should live in packages that explicitly support certain conventions instead.
See https://github.com/pydata/xarray/issues/8965 and #10326 for a more current discussion of this same idea. #10327 would be one way to close this issue.
You'll probably find [this xarray issue](https://github.com/pydata/xarray/issues/4285#issuecomment-1200110315) interesting, because the same restrictions discussed there for xarray will also apply to cubed (i.e. it needs to know the shape and dtype). Essentially Awkward...
> in particular that computed arrays have known shapes before the computation is run, which might be a challenge for awkward array!

The issue I linked is about how awkward...
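To make the "known shape and dtype" requirement concrete, here is a small sketch assuming Cubed's `Spec`, its array API namespace, and `map_blocks`; it just illustrates that shape, chunks, and output dtype are all fixed when the graph is built, before any task runs:

```python
import cubed
import cubed.array_api as xp

spec = cubed.Spec(allowed_mem="200MB")  # memory budget used when planning

# Shape, chunks, and dtype are all declared at graph-construction time.
a = xp.ones((1000, 1000), chunks=(100, 100), spec=spec)

# map_blocks needs the output dtype up front; a ragged result whose shape
# is only known after computing would not fit this model.
b = cubed.map_blocks(lambda block: block * 2, a, dtype=a.dtype)

result = b.compute()
```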
A "fuzzy memory bound" is also related to the issue I just raised: https://github.com/cubed-dev/cubed/issues/749
I'm still struggling to imagine what the API for using this idea in Cubed would look like. Icechunk has the concept of a [`ChangeSet`](https://github.com/earth-mover/icechunk/blob/9e63119748366bf6b55b2f9bdfbd0125713a981f/icechunk/src/change_set.rs#L21), which contains more than enough information...
I think you're right @tomwhite. The `ChangeSet` tells you the chunks that changed, from which you can derive the region that changed. Note there could be multiple input datasets in...
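A quick sketch of the "changed chunks → changed region" derivation, using made-up names (the changed-chunk indices would come from icechunk's `ChangeSet`; nothing below touches its actual API):

```python
from typing import Iterable


def changed_region(
    changed_chunks: Iterable[tuple[int, ...]],
    chunk_shape: tuple[int, ...],
) -> tuple[slice, ...]:
    """Bounding-box region (as slices) covering a set of changed chunk indices.

    `changed_chunks` is assumed to be something like {(0, 1), (0, 2)}, extracted
    from a ChangeSet; `chunk_shape` is the array's chunk shape.
    """
    chunks = list(changed_chunks)
    region = []
    for dim, size in enumerate(chunk_shape):
        lo = min(c[dim] for c in chunks)
        hi = max(c[dim] for c in chunks)
        region.append(slice(lo * size, (hi + 1) * size))
    return tuple(region)


# e.g. chunks (0, 1) and (0, 2) of an array with (100, 100) chunks
print(changed_region([(0, 1), (0, 2)], (100, 100)))
# -> (slice(0, 100, None), slice(100, 300, None))
```

With multiple input datasets, you would presumably need one such region per changed store/array rather than a single region overall.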