boost-histogram
feat: support full UHI for rebinning
XRef #208
The current interface:
In [1]: import numpy as np
...:
...: import boost_histogram as bh
...:
...: h = bh.Histogram(bh.axis.Regular(10, 0, 1))
...: h.fill(np.random.normal(size=1_000_000))
...: rebin = bh.tag.Rebinner(factor=2)
...: h[::rebin]
Out[1]: Histogram(Regular(5, 0, 1), storage=Double()) # Sum: 341605.0 (1000000.0 with flow)
In [2]: rebin = bh.tag.Rebinner(groups=[1, 2, 3])
In [3]: h[::rebin]
Out[3]: Histogram(Variable([0, 0.1, 0.3, 0.6], metadata=...), storage=Double()) # Sum: 225749.0
In [4]: s = bh.tag.Slicer()
...:
...: h = bh.Histogram(
...:     bh.axis.Regular(20, 1, 3),
...:     bh.axis.Regular(30, 1, 3),
...:     bh.axis.Regular(40, 1, 3),
...: )
...:
...: h[{0: s[:: bh.rebin(groups=[1, 2, 3])]}].axes.size
Out[4]: (3, 30, 40)
In [5]: h[{0: s[:: bh.rebin(groups=[1, 2, 3])], 2: s[:: bh.rebin(groups=[1, 2, 3])]}].axes[2].edges
Out[5]: array([1. , 1.05, 1.15, 1.3 ])
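For reference, the grouping semantics in the outputs above can be reproduced with plain NumPy. This is only a sketch of the expected behavior, not the code in this PR: `rebin_by_groups` is a hypothetical helper, it ignores flow bins, and it assumes that bins not covered by `groups` are simply dropped.

```python
import numpy as np


def rebin_by_groups(values, edges, groups):
    """Merge consecutive bins: groups=[1, 2, 3] merges 1, then 2, then 3 bins.

    Hypothetical helper for illustration only; bins past sum(groups) are
    dropped (an assumption, not necessarily what the PR does).
    """
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    groups = np.asarray(groups, dtype=int)
    if groups.size == 0 or np.any(groups < 1) or groups.sum() > len(values):
        raise ValueError("groups must be positive and cover at most len(values) bins")
    # Index of the first original bin in each group, plus the final stop index.
    bounds = np.concatenate(([0], np.cumsum(groups)))
    # Sum the original bin contents inside each group.
    new_values = np.add.reduceat(values[: bounds[-1]], bounds[:-1])
    # The new edges are the original edges at the group boundaries.
    new_edges = edges[bounds]
    return new_values, new_edges


edges = np.linspace(0, 1, 11)  # same edges as bh.axis.Regular(10, 0, 1)
values = np.arange(10.0)  # placeholder bin contents, not real fill data
new_values, new_edges = rebin_by_groups(values, edges, [1, 2, 3])
print(new_edges)
print(new_values)
```

With `groups=[1, 2, 3]` the new edges land at the same positions as the `Variable([0, 0.1, 0.3, 0.6])` axis in `Out[3]`.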
- [ ] The code is a bit dirty and I don't know if it is perfectly optimized.
- [ ] How should the code handle flow bins?
- [ ] Is there any edge case that I am missing?
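On the flow-bin question, one option would be to fold any bins left uncovered by `groups` into the overflow bin, so that the sum with flow is preserved. A NumPy sketch of that choice is below. `rebin_with_flow` is a hypothetical helper, the `[underflow, bins..., overflow]` layout is the one `h.view(flow=True)` gives for a 1D histogram, and this is not claimed to be what the PR currently does.

```python
import numpy as np


def rebin_with_flow(values_with_flow, groups):
    """Rebin [underflow, bins..., overflow], keeping the flow-inclusive total.

    Hypothetical helper: bins not covered by `groups` are added to overflow.
    """
    v = np.asarray(values_with_flow, dtype=float)
    groups = np.asarray(groups, dtype=int)
    under, inner, over = v[0], v[1:-1], v[-1]
    stop = int(groups.sum())
    if groups.size == 0 or np.any(groups < 1) or stop > len(inner):
        raise ValueError("groups must be positive and cover at most the in-range bins")
    # First original bin index of each group.
    starts = np.concatenate(([0], np.cumsum(groups)[:-1]))
    merged = np.add.reduceat(inner[:stop], starts)
    # Leftover in-range bins are pushed into the overflow bin.
    over = over + inner[stop:].sum()
    return np.concatenate(([under], merged, [over]))


# 10 in-range bins plus underflow (5) and overflow (20); placeholder numbers.
old = np.array([5.0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20.0])
new = rebin_with_flow(old, [1, 2, 3])
print(new)
print(new.sum() == old.sum())
```

The alternative is to drop the uncovered bins entirely, which changes the sum with flow; either way the behavior should probably be documented.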
cc: @henryiii @matthewfeickert
Ah, yeah, you probably have to use boost-histogram's cast system to go from C++ class to the correct Python class. I can look (hopefully by end of day or tomorrow, as I'll be teaching soon).
Thanks for this very useful feature! I was wondering if this adds (or could add) support for renaming categorical axis values as well?
I still need to review this and make it work on callables.