Integrate cubed in xarray
Initial attempt to get cubed working within xarray, as an alternative to dask.
- [x] Closes #6807, at least for the case of cubed
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`
I've added a manager kwarg to the .chunk methods so you can do da.chunk(manager="cubed") to convert to a chunked cubed.CoreArray, with the default still being da.chunk(manager="dask"). (I couldn't think of a better name than "manager", as "backend" and "executor" are already taken.)
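Roughly, the idea behind the kwarg is something like this sketch (the helper name is illustrative only, not the actual implementation in this PR):

```python
# Simplified sketch of what the `manager` kwarg does conceptually: it selects
# which library's `from_array` is used when chunking. The helper name here is
# illustrative, not xarray's real code.
def _chunk_with_manager(data, chunks, manager="dask"):
    if manager == "dask":
        import dask.array as da

        return da.from_array(data, chunks=chunks)
    elif manager == "cubed":
        import cubed

        return cubed.from_array(data, chunks=chunks)
    else:
        raise ValueError(f"unrecognized chunk manager: {manager!r}")
```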
~~At the moment it should work except for an import error that I don't understand, see below.~~
To complete this PR we would also need:
- [x] Cubed to expose the correct array type consistently https://github.com/tomwhite/cubed/issues/123
- [ ] A cubed version of `apply_gufunc` https://github.com/tomwhite/cubed/pull/119
- [ ] Re-route `xarray.apply_ufunc` through `cubed.apply_gufunc` instead of dask's `apply_gufunc` when appropriate
- [ ] A test suite for wrapping cubed arrays, which would be best done via #6894
- [ ] Ideally also generalise `xarray.map_blocks` to work on cubed arrays
cc @tomwhite
I think the manager keyword will also need adding to open_zarr, open_dataset and to_zarr.
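For example, something like this (a hypothetical sketch of how the keyword might be spelled; later comments in this thread pass it via `from_array_kwargs`):

```python
import xarray as xr

# Hypothetical sketch of choosing the chunk manager at open time; elsewhere in
# this thread it is spelled as from_array_kwargs={'manager': 'cubed'}.
ds = xr.open_dataset(
    "store.zarr", engine="zarr", chunks={}, from_array_kwargs={"manager": "cubed"}
)
ds.to_zarr("out.zarr")  # writing back out would need to respect the same choice
```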
I'm interested in trying this out on some of our genomics use cases in sgkit (see https://github.com/pystatgen/sgkit/issues/908), so please let me know when you think it's ready to try @TomNicholas.
This was more abstract than expected
Yeah I was kind of asking whether this was unnecessarily abstract, and if there was a simpler design that achieved the same flexibility.
@TomNicholas it might be good to rebase this now that #7067 is in.
@drtodd13 tagging you here and linking my notes from today's distributed arrays working group meeting for the links and references to this PR.
@drtodd13 mentioned today that ramba doesn't actually require explicit chunks to work, which I hadn't realised. So forcing wrapped libraries to implement an explicit chunks method might be too restrictive. Ramba could possibly work entirely through the numpy array API standard.
ramba doesn't actually require explicit chunks to work
I suspect Arkouda is similar in that this is not a detail the user is expected to worry about. (cc @sdbachman)
I'm making progress with this PR, and now that @tomwhite implemented cubed.apply_gufunc I've re-routed xarray.apply_ufunc to use whatever version of apply_gufunc is defined by the chosen ChunkManager. This means many basic operations should now just work:
In [1]: import xarray as xr
In [2]: da = xr.DataArray([1, 2, 3], dims='x')
In [3]: da_chunked = da.chunk(from_array_kwargs={'manager': 'cubed'})
In [4]: da_chunked
Out[4]:
<xarray.DataArray (x: 3)>
cubed.Array<array-003, shape=(3,), dtype=int64, chunks=((3,),)>
Dimensions without coordinates: x
In [5]: da_chunked.mean()
Out[5]:
<xarray.DataArray ()>
cubed.Array<array-006, shape=(), dtype=int64, chunks=()>
In [6]: da_chunked.mean().compute()
[cubed.Array<array-009, shape=(), dtype=int64, chunks=()>]
Out[6]:
<xarray.DataArray ()>
array(2)
(You need to install both cubed>0.5.0 and the main branch of rechunker for this to work.)
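For anyone following along, the dispatch is conceptually something like this simplified sketch (class names are illustrative, not the exact code in this PR):

```python
# Each chunk manager exposes its own apply_gufunc, and xarray.apply_ufunc calls
# whichever manager matches the wrapped array type. Simplified sketch only.
class DaskManager:
    def apply_gufunc(self, func, signature, *args, **kwargs):
        from dask.array import apply_gufunc

        return apply_gufunc(func, signature, *args, **kwargs)


class CubedManager:
    def apply_gufunc(self, func, signature, *args, **kwargs):
        from cubed import apply_gufunc

        return apply_gufunc(func, signature, *args, **kwargs)


# inside xarray.apply_ufunc (conceptually):
#     chunkmanager = get_chunked_array_type(*chunked_args)
#     result = chunkmanager.apply_gufunc(func, signature, *chunked_args, **gufunc_kwargs)
```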
I still have a fair bit more to do on this PR (see checklist at top), but for testing should I:
- (a) Start making a `test_cubed.py` file in xarray as part of this PR with bespoke tests,
- (b) Put bespoke tests for xarray wrapping cubed somewhere else (e.g. the cubed repo or a new `cubed-xarray` repo),
- (c) Merge this PR without cubed-specific tests and concentrate on finishing the general duck-array testing framework in #6908, so we can implement (b) in the way we actually eventually want things to work for 3rd-party duck array libraries?
I would prefer not to have this PR grow to be thousands of lines by including tests in it, but also waiting for #6908 might take a while because that's also a fairly ambitious PR.
The fact that the tests are currently green for this PR (ignoring some mypy stuff) is evidence that the decoupling of dask from xarray is working so far.
(I have already added some tests for the ability to register custom ChunkManagers though.)
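(For the record, such a test is roughly of this shape - a toy sketch with a hypothetical registry, not xarray's actual API:)

```python
# Toy sketch of the kind of registration test meant above; the registry and
# helper names here are hypothetical, not xarray's actual API.
_CHUNKMANAGERS: dict = {}


def register_chunkmanager(name, manager):
    _CHUNKMANAGERS[name] = manager


def get_chunkmanager(name):
    return _CHUNKMANAGERS[name]


def test_register_custom_chunkmanager():
    class DummyManager:
        """Minimal stand-in for a third-party ChunkManager."""

    register_chunkmanager("dummy", DummyManager())
    assert isinstance(get_chunkmanager("dummy"), DummyManager)
```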
Great work @TomNicholas!
I don't have a strong opinion about the tests, but putting them in a new project to keep xarray changes to a minimum is probably a good idea for the moment.
Thanks @tomwhite - I think it might make sense for me to remove the CubedManager class from this PR and instead put that & cubed+xarray tests into another repo. That keeps xarray's changes minimal, doesn't require putting cubed in any xarray CI envs, and hopefully allows us to merge the ChunkManager changes here earlier.
Places dask is still explicitly imported in xarray
There are a few remaining places where I haven't generalised to remove specific import dask calls either because it won't be imported at runtime unless you ask for it, cubed doesn't implement the equivalent function, that function isn't in the array API standard, or because I'm not sure if the dask concept used generalises to other parallel frameworks.
- [ ] `open_mfdataset(..., parallel=True)` - there is no `cubed.delayed` to wrap the `open_dataset` calls in,
- [ ] `Dataset.__dask_graph__` and all the other similar dask magic methods,
- [ ] `dask_array_ops.rolling` - uses functions from `dask.array.overlap`,
- [ ] `dask_array_ops.least_squares` - uses `dask.array.apply_along_axis` and `dask.array.linalg.lstsq`,
- [ ] `dask_array_ops.push` - uses `dask.array.reductions.cumreduction`
I would like to get to the point where you can use xarray with a chunked array without ever importing dask. I think this PR gets very close, but that would be tricky to test because cubed depends on dask (so I can't just run the test suite without dask in the environment), and there are not yet any other parallel chunk-aware frameworks I know of (ramba and arkouda don't have a chunks attribute so wouldn't require this PR).
I tried opening a zarr store into xarray with chunking via cubed, but I got an error inside the indexing adapter classes. Somehow the type is completely wrong - would be good to type hint this part of the code, because this happens despite mypy passing now.
# create example zarr store
orig = xr.tutorial.open_dataset("air_temperature")
orig.to_zarr('air2.zarr')
# open it as a cubed array
ds = xr.open_dataset('air2.zarr', engine='zarr', chunks={}, from_array_kwargs={'manager': 'cubed'})
# fails at this point
ds.load()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:382, in Retrying.__call__(self, fn, *args, **kwargs)
381 try:
--> 382 result = fn(*args, **kwargs)
383 except BaseException: # noqa: B902
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/runtime/executors/python.py:10, in exec_stage_func(func, *args, **kwargs)
8 @retry(stop=stop_after_attempt(3))
9 def exec_stage_func(func, *args, **kwargs):
---> 10 return func(*args, **kwargs)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/primitive/blockwise.py:66, in apply_blockwise(out_key, config)
64 args.append(arg)
---> 66 result = config.function(*args)
67 if isinstance(result, dict): # structured array with named fields
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/ops.py:439, in map_blocks.<locals>.func_with_block_id.<locals>.wrap(*a, **kw)
438 block_id = offset_to_block_id(a[-1].item())
--> 439 return func(*a[:-1], block_id=block_id, **kw)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/ops.py:572, in map_direct.<locals>.new_func.<locals>.wrap(block_id, *a, **kw)
571 args = a + arrays
--> 572 return func(*args, block_id=block_id, **kw)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/ops.py:76, in _from_array(e, x, outchunks, asarray, block_id)
75 def _from_array(e, x, outchunks=None, asarray=None, block_id=None):
---> 76 out = x[get_item(outchunks, block_id)]
77 if asarray:
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:627, in CopyOnWriteArray.__getitem__(self, key)
626 def __getitem__(self, key):
--> 627 return type(self)(_wrap_numpy_scalars(self.array[key]))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:534, in LazilyIndexedArray.__getitem__(self, indexer)
533 return array[indexer]
--> 534 return type(self)(self.array, self._updated_key(indexer))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:500, in LazilyIndexedArray._updated_key(self, new_key)
499 def _updated_key(self, new_key):
--> 500 iter_new_key = iter(expanded_indexer(new_key.tuple, self.ndim))
501 full_key = []
AttributeError: 'tuple' object has no attribute 'tuple'
The above exception was the direct cause of the following exception:
RetryError Traceback (most recent call last)
Cell In[69], line 1
----> 1 ds.load()
File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:761, in Dataset.load(self, **kwargs)
758 chunkmanager = get_chunked_array_type(*lazy_data.values())
760 # evaluate all the chunked arrays simultaneously
--> 761 evaluated_data = chunkmanager.compute(*lazy_data.values(), **kwargs)
763 for k, data in zip(lazy_data, evaluated_data):
764 self.variables[k].data = data
File ~/Documents/Work/Code/xarray/xarray/core/parallelcompat.py:451, in CubedManager.compute(self, *data, **kwargs)
448 def compute(self, *data: "CubedArray", **kwargs) -> np.ndarray:
449 from cubed import compute
--> 451 return compute(*data, **kwargs)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/array.py:300, in compute(executor, callbacks, optimize_graph, *arrays, **kwargs)
297 executor = PythonDagExecutor()
299 _return_in_memory_array = kwargs.pop("_return_in_memory_array", True)
--> 300 plan.execute(
301 executor=executor,
302 callbacks=callbacks,
303 optimize_graph=optimize_graph,
304 array_names=[a.name for a in arrays],
305 **kwargs,
306 )
308 if _return_in_memory_array:
309 return tuple(a._read_stored() for a in arrays)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/plan.py:154, in Plan.execute(self, executor, callbacks, optimize_graph, array_names, **kwargs)
152 if callbacks is not None:
153 [callback.on_compute_start(dag) for callback in callbacks]
--> 154 executor.execute_dag(
155 dag, callbacks=callbacks, array_names=array_names, **kwargs
156 )
157 if callbacks is not None:
158 [callback.on_compute_end(dag) for callback in callbacks]
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/runtime/executors/python.py:22, in PythonDagExecutor.execute_dag(self, dag, callbacks, array_names, **kwargs)
20 if stage.mappable is not None:
21 for m in stage.mappable:
---> 22 exec_stage_func(stage.function, m, config=pipeline.config)
23 if callbacks is not None:
24 event = TaskEndEvent(array_name=name)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:289, in BaseRetrying.wraps.<locals>.wrapped_f(*args, **kw)
287 @functools.wraps(f)
288 def wrapped_f(*args: t.Any, **kw: t.Any) -> t.Any:
--> 289 return self(f, *args, **kw)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:379, in Retrying.__call__(self, fn, *args, **kwargs)
377 retry_state = RetryCallState(retry_object=self, fn=fn, args=args, kwargs=kwargs)
378 while True:
--> 379 do = self.iter(retry_state=retry_state)
380 if isinstance(do, DoAttempt):
381 try:
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:326, in BaseRetrying.iter(self, retry_state)
324 if self.reraise:
325 raise retry_exc.reraise()
--> 326 raise retry_exc from fut.exception()
328 if self.wait:
329 sleep = self.wait(retry_state)
RetryError: RetryError[<Future at 0x7fc0c69be4f0 state=finished raised AttributeError>
This still works fine for dask.
AttributeError: 'tuple' object has no attribute 'tuple'
This means you're indexing a `LazilyIndexedArray` with a tuple, but you need to wrap that tuple in an Indexer object like `BasicIndexer`, `VectorizedIndexer` etc.
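Concretely, something along these lines (a minimal sketch using xarray's existing indexer classes):

```python
import numpy as np
from xarray.core.indexing import BasicIndexer, LazilyIndexedArray

# Minimal sketch: xarray's lazy indexing adapters expect an ExplicitIndexer
# (which has a .tuple attribute), not a bare tuple, as the key.
arr = LazilyIndexedArray(np.arange(12).reshape(3, 4))

# arr[(0, slice(None))] raises AttributeError: 'tuple' object has no attribute 'tuple'
result = arr[BasicIndexer((0, slice(None)))]
```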
I think it might make sense for me to remove the `CubedManager` class from this PR and instead put that & cubed+xarray tests into another repo. That keeps xarray's changes minimal, doesn't require putting cubed in any xarray CI envs, and hopefully allows us to merge the `ChunkManager` changes here earlier.
That sounds like a good plan to me.
Places `dask` is still explicitly imported in xarray

There are a few remaining places where I haven't generalised to remove specific `import dask` calls either because it won't be imported at runtime unless you ask for it, cubed doesn't implement the equivalent function, that function isn't in the array API standard, or because I'm not sure if the dask concept used generalises to other parallel frameworks.

- [ ] `open_mfdataset(..., parallel=True)` - there is no `cubed.delayed` to wrap the `open_dataset` calls in,
- [ ] `Dataset.__dask_graph__` and all the other similar dask magic methods,
- [ ] `dask_array_ops.rolling` - uses functions from `dask.array.overlap`,
- [ ] `dask_array_ops.least_squares` - uses `dask.array.apply_along_axis` and `dask.array.linalg.lstsq`,
- [ ] `dask_array_ops.push` - uses `dask.array.reductions.cumreduction`
This is a useful list! I hope that we could close the gap for some of these over time.
I would like to get to the point where you can use xarray with a chunked array without ever importing dask. I think this PR gets very close, but that would be tricky to test because cubed depends on dask (so I can't just run the test suite without dask in the environment)
Agreed. I have opened https://github.com/tomwhite/cubed/issues/154 to make it possible to test without a Dask dependency.
dask_array_ops.rolling - uses functions from dask.array.overlap
This is unused, we use sliding_window_view now.
Thanks @dcherian ! Once I copied that explicit indexer business I was able to get serialization to and from zarr working with cubed!
In [1]: import xarray as xr
In [2]: from cubed import Spec
In [3]: ds = xr.open_dataset(
...: 'airtemps.zarr',
...: chunks={},
...: from_array_kwargs={
...: 'manager': 'cubed',
...: 'spec': Spec(work_dir="tmp", max_mem=20e6),
...: }
...: )
/home/tom/Documents/Work/Code/xarray/xarray/backends/plugins.py:139: RuntimeWarning: 'netcdf4' fails while guessing
warnings.warn(f"{engine!r} fails while guessing", RuntimeWarning)
/home/tom/Documents/Work/Code/xarray/xarray/backends/plugins.py:139: RuntimeWarning: 'scipy' fails while guessing
warnings.warn(f"{engine!r} fails while guessing", RuntimeWarning)
In [4]: ds['air']
Out[4]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>
cubed.Array<array-004, shape=(2920, 25, 53), dtype=float32, chunks=((730, 730, 730, 730), (13, 12), (27, 26))>
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Attributes:
GRIB_id: 11
...
In [5]: ds.isel(time=slice(100, 300)).to_zarr("cubed_subset.zarr")
/home/tom/Documents/Work/Code/xarray/xarray/core/dataset.py:2118: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
return to_zarr( # type: ignore
Out[5]: <xarray.backends.zarr.ZarrStore at 0x7f34953033c0>
I would like to get to the point where you can use xarray with a chunked array without ever importing dask. I think this PR gets very close, but that would be tricky to test because cubed depends on dask (so I can't just run the test suite without dask in the environment
I just released Cubed 0.6.0 which doesn't have a dependency on Dask, so this should be possible now.
I just released Cubed 0.6.0 which doesn't have a dependency on Dask, so this should be possible now.
Does this mean my comment https://github.com/pydata/xarray/pull/7019#discussion_r970713341 is valid again?
Does this mean my comment https://github.com/pydata/xarray/pull/7019#discussion_r970713341 is valid again?
Yes I think it does @headtr1ck - thanks for the reminder about that.
I now want to finish this PR by exposing the "chunk manager" interface as a new entrypoint, copying the pattern used for xarray's backends. That would allow me to move the cubed-specific CubedManager code into a separate repository, have the choice of chunkmanager default to whatever is installed, but ask explicitly what to do if multiple chunkmanagers are installed. That should address your comment @headtr1ck.
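As a sketch of that pattern (the entrypoint group name here is an assumption about the final spelling, copied from how backends are discovered):

```python
# Sketch: a third-party package advertises its chunk manager via an entrypoint,
# e.g. in its pyproject.toml (the group name "xarray.chunkmanagers" is an assumption):
#
#   [project.entry-points."xarray.chunkmanagers"]
#   cubed = "cubed_xarray.cubedmanager:CubedManager"
#
# and xarray would discover installed managers at runtime roughly like this
# (entry_points(group=...) is the Python 3.10+ signature):
from importlib.metadata import entry_points


def list_chunkmanagers():
    managers = {}
    for ep in entry_points(group="xarray.chunkmanagers"):
        managers[ep.name] = ep.load()()  # instantiate the registered manager class
    return managers
```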
I would like to get to the point where you can use xarray with a chunked array without ever importing dask. I think this PR gets very close, but that would be tricky to test because cubed depends on dask (so I can't just run the test suite without dask in the environment
I just released Cubed 0.6.0 which doesn't have a dependency on Dask, so this should be possible now.
Actually testing cubed with xarray in an environment without dask is currently blocked by rechunker's explicit dependency on dask, see https://github.com/pangeo-data/rechunker/issues/139
EDIT: We can hack around this by pip installing cubed, then pip uninstalling dask as mentioned here
Thanks for the review @dcherian! I agree with basically everything you wrote.
The main difficulty I have at this point is non-reproducible failures as described here
I've made a bare-bones cubed-xarray package to store the CubedManager class, as well as any integration tests (yet to be written). @tomwhite you should have an invitation to be an owner of that repo. It uses the entrypoint exposed in this PR to hook in, and seems to work for me locally :grin:
I'm having problems with ensuring the behaviour of the `chunks='auto'` option is consistent between `.chunk` and `open_dataset`. These problems appeared after vendoring `dask.array.core.normalize_chunks`. Right now the only failing tests use `chunks='auto'` (e.g. `xarray/tests/test_backends.py::test_chunking_consintency[auto]` - yes, there's a typo in that test's name), and they fail because xarray decides on different sizes for the automatically-chosen chunks.
What's weird is that all tests pass for me locally, but these failures occur on just some of the CI jobs (and which CI jobs fail is apparently not even consistent???). I have no idea why this would behave differently on only some of the CI jobs, especially after double-checking that `array.chunk-size` is being correctly determined from the dask config variable within `normalize_chunks`.
I'm having problems with ensuring the behaviour of the chunks='auto' option is consistent between `.chunk` and `open_dataset`
Update on this rabbit hole: This commit to dask changed the behaviour of dask's auto-chunking logic, such that if I run my little test script test_old_get_chunk.py on dask releases before and after that commit I get different chunking patterns:
from xarray.core.variable import IndexVariable
from dask.array.core import normalize_chunks  # the dask function whose behaviour changed
import itertools
import warnings
from numbers import Number
import dask
import dask.array as da
import xarray as xr
import numpy as np


# This function is copied from xarray, but calls dask.array.core.normalize_chunks
# It is used in open_dataset, but not in Dataset.chunk
def _get_chunk(var, chunks):
    """
    Return map from each dim to chunk sizes, accounting for backend's preferred chunks.
    """
    if isinstance(var, IndexVariable):
        return {}
    dims = var.dims
    shape = var.shape

    # Determine the explicit requested chunks.
    preferred_chunks = var.encoding.get("preferred_chunks", {})
    preferred_chunk_shape = tuple(
        preferred_chunks.get(dim, size) for dim, size in zip(dims, shape)
    )
    if isinstance(chunks, Number) or (chunks == "auto"):
        chunks = dict.fromkeys(dims, chunks)
    chunk_shape = tuple(
        chunks.get(dim, None) or preferred_chunk_sizes
        for dim, preferred_chunk_sizes in zip(dims, preferred_chunk_shape)
    )
    chunk_shape = normalize_chunks(
        chunk_shape, shape=shape, dtype=var.dtype, previous_chunks=preferred_chunk_shape
    )

    # Warn where requested chunks break preferred chunks, provided that the variable
    # contains data.
    if var.size:
        for dim, size, chunk_sizes in zip(dims, shape, chunk_shape):
            try:
                preferred_chunk_sizes = preferred_chunks[dim]
            except KeyError:
                continue
            # Determine the stop indices of the preferred chunks, but omit the last stop
            # (equal to the dim size). In particular, assume that when a sequence
            # expresses the preferred chunks, the sequence sums to the size.
            preferred_stops = (
                range(preferred_chunk_sizes, size, preferred_chunk_sizes)
                if isinstance(preferred_chunk_sizes, Number)
                else itertools.accumulate(preferred_chunk_sizes[:-1])
            )
            # Gather any stop indices of the specified chunks that are not a stop index
            # of a preferred chunk. Again, omit the last stop, assuming that it equals
            # the dim size.
            breaks = set(itertools.accumulate(chunk_sizes[:-1])).difference(
                preferred_stops
            )
            if breaks:
                warnings.warn(
                    "The specified Dask chunks separate the stored chunks along "
                    f'dimension "{dim}" starting at index {min(breaks)}. This could '
                    "degrade performance. Instead, consider rechunking after loading."
                )

    return dict(zip(dims, chunk_shape))


chunks = 'auto'
encoded_chunks = 100
dask_arr = da.from_array(
    np.ones((500, 500), dtype="float64"), chunks=encoded_chunks
)
var = xr.core.variable.Variable(data=dask_arr, dims=['x', 'y'])

with dask.config.set({"array.chunk-size": "1MiB"}):
    chunks_suggested = _get_chunk(var, chunks)

print(chunks_suggested)
(cubed) tom@tom-XPS-9315:~/Documents/Work/Code/dask$ git checkout 2022.9.2
Previous HEAD position was 7fe622b44 Add docs on running Dask in a standalone Python script (#9513)
HEAD is now at 3ef47422b bump version to 2022.9.2
(cubed) tom@tom-XPS-9315:~/Documents/Work/Code/dask$ python ../experimentation/bugs/auto_chunking/test_old_get_chunk.py
{'x': (362, 138), 'y': (362, 138)}
(cubed) tom@tom-XPS-9315:~/Documents/Work/Code/dask$ git checkout 2022.9.1
Previous HEAD position was 3ef47422b bump version to 2022.9.2
HEAD is now at b944abf68 bump version to 2022.9.1
(cubed) tom@tom-XPS-9315:~/Documents/Work/Code/dask$ python ../experimentation/bugs/auto_chunking/test_old_get_chunk.py
{'x': (250, 250), 'y': (250, 250)}
(I was absolutely tearing my hair out trying to find this bug, because after the change `normalize_chunks` became a pure function, but before the change it actually wasn't, so I was calling `normalize_chunks` with the exact same set of input arguments and still not able to reproduce the bug :angry: )
Anyway, what this means is that because this PR vendors `dask.array.core.normalize_chunks`, and the behaviour of that function changed between the dask version in the min-all-deps CI job and the other CI jobs, the single vendored function cannot possibly match both behaviours.
I think one simple way to fix this failure would be to upgrade the minimum version of dask to >=2022.9.2 (from 2022.1.1 where it currently is).
EDIT: I tried changing the minimum version of dask-core in min-all-deps.yml but the conda solve failed. But also would updating to 2022.9.2 now violate xarray's minimum dependency versions policy?
EDIT2: Another way to fix this should be to un-vendor dask.array.core.normalize_chunks within xarray. We could still achieve the goal of running cubed without dask by making normalize_chunks the responsibility of the chunkmanager instead, as cubed's vendored version of that function is not subject to xarray's minimum dependencies requirement.
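That would look roughly like this on the chunkmanager interface (a sketch; the cubed import path for its vendored copy is an assumption):

```python
# Rough sketch of EDIT2: each chunk manager normalises chunks with its own
# (or its own vendored) implementation, so xarray needn't vendor dask's version.
class DaskManager:
    def normalize_chunks(self, chunks, shape=None, limit=None, dtype=None, previous_chunks=None):
        from dask.array.core import normalize_chunks

        return normalize_chunks(
            chunks, shape=shape, limit=limit, dtype=dtype, previous_chunks=previous_chunks
        )


class CubedManager:
    def normalize_chunks(self, chunks, shape=None, limit=None, dtype=None, previous_chunks=None):
        # cubed ships a vendored copy of dask's normalize_chunks; the exact
        # import path here is an assumption
        from cubed.vendor.dask.array.core import normalize_chunks

        return normalize_chunks(
            chunks, shape=shape, limit=limit, dtype=dtype, previous_chunks=previous_chunks
        )
```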
We could still achieve the goal of running cubed without dask by making normalize_chunks the responsibility of the chunkmanager
Seems OK to me.
The other option is to xfail the broken tests on old dask versions
I would like to merge this now please! It works, it passes the tests, including mypy.
The main feature not in this PR is using parallel=True with open_mfdataset, which is still coupled to dask.delayed - I made #7811 to track that so I could get this PR merged.
If we merge this I can start properly testing cubed with xarray (in cubed-xarray).
@shoyer @dcherian if one of you could merge this or otherwise tell me anything else you think is still required!
(Okay now the failures are from https://github.com/pydata/xarray/pull/7815 which I've separated out, and from https://github.com/pydata/xarray/pull/7561 being recently merged into main which is definitely not my fault :sweat_smile: https://github.com/pydata/xarray/pull/7019/commits/316c63d55f4e2c317b028842f752a40596f16c6d shows that this PR passes the tests by itself.)
@Illviljan thanks for all your comments!
Would you (or @keewis?) be willing to approve this PR now? I would really like to merge this so that I can release a version of xarray that I can use as a dependency for cubed-xarray.
Thanks @TomNicholas Big change!
Woooo thanks @dcherian !
👏 Congrats @TomNicholas on getting this in! Such an important contribution. 👏
Thanks for all your hard work on this @TomNicholas!