xarray icon indicating copy to clipboard operation
xarray copied to clipboard

Use cumsum from flox

Open Illviljan opened this issue 2 months ago • 1 comments

  • [x] Closes #6528
  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

The non-flox version reduces chunksizes significantly:

x = xr.DataArray([1, 1, 1, 1, 1], name="x").chunk()
grp_idx = xr.DataArray([-1, 0, 0, -1, 1])
with xr.set_options(use_flox=False):
    print(x.groupby(grp_idx).cumsum())
<xarray.DataArray 'x' (dim_0: 5)> Size: 40B
dask.array<getitem, shape=(5,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0

With flox the chunksize is retained:

x = xr.DataArray([1, 1, 1, 1, 1], name="x").chunk()
grp_idx = xr.DataArray([-1, 0, 0, -1, 1])
with xr.set_options(use_flox=True):
    print(x.groupby(grp_idx).cumsum())
<xarray.DataArray 'x' (dim_0: 5)> Size: 40B
dask.array<_finalize_scan, shape=(5,), dtype=int64, chunksize=(5,), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0

Illviljan avatar Dec 06 '25 13:12 Illviljan

Tests are passing now! But there's a lot of deactivated options left and there was quite a bit of extra code in _flox_reduce xarray_reduce, not sure how much of it was to deal with reduction operations only.

Illviljan avatar Dec 07 '25 12:12 Illviljan