flox

Fast & furious GroupBy operations for dask.array

Results 79 flox issues

Bumps [mamba-org/provision-with-micromamba](https://github.com/mamba-org/provision-with-micromamba) from 12 to 13.

Release notes (sourced from mamba-org/provision-with-micromamba's releases), v13:

- Fix channels and channel-priority settings
- Support linux-aarch64 and osx-arm64 runners
- Support sel(unix)

Commits: a319a81 Readme updates (#88)...

dependencies

### Summary

We should be able to improve `method="cohorts"` by first applying the groupby reduction blockwise and then "shuffling". This should substantially reduce the amount of data being moved around....
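The idea of reducing blockwise before moving data can be illustrated with a minimal NumPy sketch. This is not flox's implementation: the function name `blockwise_then_combine`, the `block_size` parameter, and the use of a plain sum as the combine step are assumptions made for illustration only. The point is that each block shrinks to at most `num_groups` numbers before anything is brought together.

```python
import numpy as np


def blockwise_then_combine(values, labels, num_groups, block_size):
    """Per-group sum: reduce each block first, then combine the small partials.

    Hypothetical helper for illustration; not part of flox.
    """
    partials = []
    for start in range(0, len(values), block_size):
        stop = start + block_size
        # Step 1 (blockwise): reduce this block on its own.
        # The result has length num_groups regardless of the block size.
        partials.append(
            np.bincount(
                labels[start:stop],
                weights=values[start:stop],
                minlength=num_groups,
            )
        )
    # Step 2 (stand-in for the "shuffle"): only the per-block partial
    # results are gathered and combined, not the original data.
    return np.sum(partials, axis=0)


values = np.arange(12, dtype=float)
labels = np.array([0, 1, 2] * 4)
result = blockwise_then_combine(values, labels, num_groups=3, block_size=4)

# Reference: the same reduction done in one pass over all the data.
direct = np.bincount(labels, weights=values, minlength=3)
```

Here `result` and `direct` agree, but the combine step only ever sees three arrays of length 3 rather than all 12 values.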

Closes https://github.com/pydata/xarray/issues/6902 cc @Illviljan @tasansal

- xref #128
- [ ] work on `_choose_engine`
- [ ] needs https://github.com/ml31415/numpy-groupies/pull/63

1. We should test with numpy-groupies. CuPy provides [bincount](https://docs.cupy.dev/en/stable/reference/generated/cupy.bincount.html), https://github.com/cupy/cupy/issues/7561
3. We'd have to avoid factorizing with Pandas unfortunately and use `np.searchsorted` or `np.digitize`; or use CuDF?
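As a rough sketch of what factorizing without Pandas could look like, the snippet below turns labels into integer codes with `np.searchsorted` (for discrete labels) and `np.digitize` (for bins), then reduces with `np.bincount`. It uses plain NumPy; whether the equivalent calls are available and behave the same in another array library is an assumption, since the text above only confirms `bincount` for CuPy. The variable names are illustrative.

```python
import numpy as np

by = np.array(["b", "a", "c", "a", "b"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Sorted unique labels. If the expected groups are known up front,
# they could be supplied directly instead of calling np.unique.
expected_groups = np.unique(by)

# Position of each label in the sorted group list gives its integer code.
# This is only valid when every label is present in expected_groups.
codes = np.searchsorted(expected_groups, by)

# Per-group sum from the integer codes.
sums = np.bincount(codes, weights=values, minlength=len(expected_groups))

# Binning variant: np.digitize returns 1-based bin indices for values
# inside the edges, so subtract 1 to get 0-based codes.
bin_edges = np.array([0.0, 10.0, 20.0, 30.0])
samples = np.array([3.0, 12.0, 25.0, 18.0])
bin_codes = np.digitize(samples, bin_edges) - 1
```

Unlike a Pandas-based factorize, `np.searchsorted` silently returns an insertion index for labels that are missing from `expected_groups`, so a real implementation would need an explicit check for that case.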

enhancement
help wanted
array-types

Supporting just numpy should be relatively easy. This will also work for `method="blockwise"` by default. We may want to rename `groupby_reduce` to `groupby_agg`? For dask proper, we'll need to use...

enhancement
help wanted

There have been some upstream fixes: https://github.com/ml31415/numpy-groupies/issues/39#issuecomment-1183091400

Copy over xarray's automation

help wanted