
[QUESTION] Plans for an equivalent to pandas groupby?

Open bscully27 opened this issue 4 years ago • 4 comments

I just started using this library, love it.

Quick question - are there any plans for an equivalent to pandas groupby?

Something like: bn.group_by(matrix[:, :2]).reduce(matrix[:, -1], np.sum)
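To make the request concrete, here is a minimal NumPy-only sketch of what that hypothetical call would compute: group rows by the first two columns and sum the last column within each group. The helper name `group_reduce_sum` and the sample `matrix` are made up for illustration; nothing like this exists in bottleneck, and the sketch handles only the sum case rather than an arbitrary reducer such as `np.sum` passed as an argument.

```python
import numpy as np

def group_reduce_sum(keys, values):
    """Sum `values` over rows of `keys` that are identical.

    Hypothetical helper for illustration only; not part of bottleneck.
    """
    keys = np.asarray(keys)
    values = np.asarray(values, dtype=float)
    # With axis=0, np.unique treats each row as one composite key.
    # `inverse` maps every input row to the index of its group.
    unique_keys, inverse = np.unique(keys, axis=0, return_inverse=True)
    # Flatten defensively: some NumPy releases return a 2-D inverse here.
    inverse = inverse.ravel()
    # bincount with weights accumulates one sum per group index.
    sums = np.bincount(inverse, weights=values, minlength=len(unique_keys))
    return unique_keys, sums

# Illustrative data: two key columns followed by one value column.
matrix = np.array([
    [1, 1, 10.0],
    [1, 2, 20.0],
    [1, 1, 30.0],
    [2, 1, 40.0],
])
group_keys, group_sums = group_reduce_sum(matrix[:, :2], matrix[:, -1])
```

This relies on sorting inside `np.unique`, so the groups come back in sorted key order rather than in order of first appearance. A dedicated C implementation could presumably avoid the sort, which is where the hoped-for speedup would come from.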

bscully27 • Apr 01 '20 16:04

To be honest, I hadn't considered it. Are you looking to avoid a pandas dependency or see this as a way to get more performance?

qwhelan • Apr 02 '20 04:04

The latter, to get more performance. I believe pandas groupby has already been optimized (I'm not sure whether via Cython), but a dedicated bottleneck C function could still provide substantial speed gains.

bscully27 • Apr 02 '20 12:04

Okay, thanks for clarifying. I'll keep this open in case someone would like to try out PRs in this vein, but probably won't take a more serious look at this myself until I clear out the backlog.

qwhelan • Apr 02 '20 15:04

FYI for anyone looking for these — numbagg has groupby functions. It makes a good complement to bottleneck...

max-sixty • Dec 20 '23 03:12