
Support cumsum, cumprod

Open dcherian opened this issue 3 years ago • 4 comments

Supporting just numpy should be relatively easy. This will also work for method="blockwise" by default.

We may want to rename groupby_reduce to groupby_agg?

For dask proper, we'll need to use dask.array.cumreduction instead of the current combination of dask.array.blockwise and dask.array.reductions._tree_reduce.
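To make the numpy-only case concrete, here is a minimal sketch of a grouped cumulative sum. It is not flox code: the function name grouped_cumsum and the argument name codes (integer group labels, one per element) are hypothetical, and the input is assumed to be 1-D. The idea is to sort by group with a stable sort, take one ordinary cumsum, subtract the running total accumulated before each group starts, and then undo the sort. Note that the output has the same shape as the input, unlike a reduction.

```python
import numpy as np


def grouped_cumsum(values, codes):
    """Cumulative sum of 1-D `values` within each group given by `codes`.

    Hypothetical helper for illustration only; not part of flox.
    """
    values = np.asarray(values)
    codes = np.asarray(codes)

    # A stable sort keeps the original order of elements inside each group.
    order = np.argsort(codes, kind="stable")
    sorted_codes = codes[order]

    # One plain cumsum over the group-sorted data.
    running = np.cumsum(values[order])

    # Positions in the sorted layout where a new group begins.
    starts = np.flatnonzero(np.r_[True, sorted_codes[1:] != sorted_codes[:-1]])

    # Running total accumulated before each group starts (0 for the first).
    offsets = np.r_[0, running[starts[1:] - 1]]

    # Repeat each offset across its group and subtract it.
    sizes = np.diff(np.r_[starts, len(values)])
    running = running - np.repeat(offsets, sizes)

    # Scatter the results back to the original element order.
    out = np.empty_like(running)
    out[order] = running
    return out


result = grouped_cumsum([1, 2, 3, 4, 5, 6], [0, 1, 0, 1, 0, 2])
```

For floating-point data, subtracting the offsets can lose some precision compared with accumulating each group separately, so a real implementation may prefer a different strategy.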

dcherian avatar Apr 28 '22 16:04 dcherian

I tried looking into this a while ago but got stuck, because I found no examples of an aggregation where the output shape stays the same as the input. If you have more guidelines or ideas on where to look, that would be appreciated.

Illviljan avatar Jun 01 '23 20:06 Illviljan

Great to hear. Warning: This is going to be quite complicated :)

Here's how dask implements cumsum: https://docs.dask.org/en/stable/_modules/dask/array/reductions.html#cumsum

We'll need something like that with custom binop and merge.

I would try to get method="sequential" working first.

I would also try really hard to just reuse the cumreduction building block if we can. The annoyance is that we will need to propagate array and group_idx so something like https://github.com/xarray-contrib/flox/blob/0d353ec14c79c4c5123623f00555843324041b37/flox/aggregations.py#L359 should be helpful.
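As a rough illustration of what "propagate array and group_idx" could mean for a sequential scan, the sketch below walks over blocks in order and carries a per-group running total from one block to the next. It uses plain numpy lists of blocks rather than dask, and every name in it (grouped_cumsum_block, sequential_grouped_cumsum, ngroups) is hypothetical. It assumes 1-D blocks, integer group codes in the range 0 to ngroups - 1, and casts values to float64 for simplicity. It is not how flox or dask implement this.

```python
import numpy as np


def grouped_cumsum_block(values, codes, ngroups):
    """Grouped cumsum inside one block, plus each group's block total."""
    out = np.empty_like(values)
    totals = np.zeros(ngroups, dtype=values.dtype)
    for g in np.unique(codes):
        mask = codes == g
        scanned = np.cumsum(values[mask])
        out[mask] = scanned
        totals[g] = scanned[-1]  # what this block contributes to group g
    return out, totals


def sequential_grouped_cumsum(value_blocks, code_blocks, ngroups):
    """Scan blocks in order, carrying per-group totals between blocks."""
    carry = np.zeros(ngroups, dtype=np.float64)
    results = []
    for values, codes in zip(value_blocks, code_blocks):
        values = np.asarray(values, dtype=np.float64)
        codes = np.asarray(codes)
        local, totals = grouped_cumsum_block(values, codes, ngroups)
        # Add what earlier blocks accumulated for each element's group.
        results.append(local + carry[codes])
        # Merge step: fold this block's totals into the carried state.
        carry = carry + totals
    return np.concatenate(results)


blocked = sequential_grouped_cumsum(
    [[1, 2, 3], [4, 5, 6]],
    [[0, 1, 0], [1, 0, 2]],
    ngroups=3,
)
```

Here the carried state is a dense vector of length ngroups, which is what makes the merge a simple addition. A version built on cumreduction would need an equivalent of this carry to travel alongside the data blocks.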

dcherian avatar Jun 01 '23 21:06 dcherian

Ooooh I forgot to mention, just getting the pure numpy version to work would be a great step forward :) We can always start there.

dcherian avatar Jun 02 '23 14:06 dcherian