
[DOCS] Syntactic shortcut for dumping histogram centers/counts

Open kratsg opened this issue 3 years ago • 2 comments

Describe the problem, if any, that your feature request is related to

It would be nice to generate a "table" of the values shown in some particular histogram. Consider for example, a 1-dimensional histogram:

import hist
import uproot

h = hist.Hist(hist.axis.IntCategory([], growth=True))
for array in uproot.iterate(...):
    h.fill(array["value"])

It would be nice to dump a dictionary directly, where the keys are the category labels and the values are the counts. Since list(h.axes[0]) already gives you the categories (which is not necessarily obvious), one would expect

dict(h)

to "JustWork"™. The closest equivalent I've found is to do

>>> dict(zip(h.axes[0], h.values()))
{3: 248352.0, 2: 208653.0, 1000024: 1994.0, 1000014: 38226.0, 1000012: 59.0, 1000011: 3.0}

but that's not necessarily as nice.

Describe the feature you'd like

I would expect dict(h) to work. And for multi-dimensional histograms, the keys could be hashable tuples instead.

Describe alternatives, if any, you've considered

See above.

kratsg, Jan 06 '22 15:01

We generally do not change the API between 1D and ND histograms. This would introduce such a change: to stay consistent, dict(h) would need to produce {(3,): 248352.0, ... and so forth, even in the 1D case. Also, for most axis types (everything except Integer, IntCategory, and StrCategory) this would produce tuples of floats, which make really bad hash keys.

dict(zip(h.axes[0], h.values())) is explicit, and not that bad.

henryiii, Jan 06 '22 21:01

This is the kind of example that I would typically recommend providing somewhere in the docs/notebooks ... My 2 cents.

eduardo-rodrigues, Jan 07 '22 08:01