Histogram with WeightedMean storage returns wrong sum_of_weights_squared
I want to create histograms and be able to access their sum of weights squared. When using WeightedMean storage, sum_of_weights_squared just returns the number of entries, not the sum of weights squared. The same happens with sum_of_weights (it also returns the counts), though that matters less to me.
I could in principle retrieve the correct sum of weights squared if I used accumulators instead of histograms. However, for the purpose of my data analysis, this would slow down the code a lot and I would need to replicate the large nested structure of the histograms into accumulators. So I would much prefer to just use histograms, if this bug can be fixed.
To test:
import boost_histogram as bh
h = bh.Histogram(bh.axis.Regular(1, 0, 2), storage=bh.storage.WeightedMean()) # Double() is the default
h.fill([1]*3, sample=[2]*3)
h.view().sum_of_weights_squared
The last line returns
array([3.])
while the sum of weights squared is actually 12.
I am using Python 3.8.
Attaching a screenshot of my notebook.
That's odd. @henryiii ?
Using https://pyodide.org/en/stable/console.html because it's handy:
(Edit: chopped off the answer by mistake)
Adding the copy-pasteable code from Henry's answer. The weight argument was missing from the fill command:
import boost_histogram as bh
h = bh.Histogram(bh.axis.Regular(1, 0, 2), storage=bh.storage.WeightedMean())
h.fill([1]*3, weight=2, sample=[2]*3) # note use of weight here
# Histogram(Regular(1, 0, 2), storage=WeightedMean()) # Sum: WeightedMean(sum_of_weights=6, sum_of_weights_squared=12, value=2, variance=0)
h.view().sum_of_weights_squared
# array([12.])
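To see why the two fills give 3 and 12, here is a hand computation of the sums in plain Python. This is only a sketch of the arithmetic, not boost-histogram's implementation: the helper name weighted_mean_sums is made up for illustration, and it assumes that an omitted weight counts as 1 per entry, which is consistent with the array([3.]) reported above.

```python
def weighted_mean_sums(samples, weights=None):
    """Return (sum of weights, sum of weights squared, weighted mean).

    Hypothetical helper for illustration only; it assumes each entry
    gets weight 1 when no weights are given.
    """
    if weights is None:
        weights = [1.0] * len(samples)
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    mean = sum(w * x for w, x in zip(weights, samples)) / sum_w
    return sum_w, sum_w2, mean


samples = [2, 2, 2]

# No weights passed: each entry counts once, so both sums equal the count.
unweighted = weighted_mean_sums(samples)
print(unweighted)  # (3.0, 3.0, 2.0)

# Weight 2 for every entry: 2 + 2 + 2 = 6 and 4 + 4 + 4 = 12.
weighted = weighted_mean_sums(samples, [2, 2, 2])
print(weighted)  # (6, 12, 2.0)
```

The sample values only enter the mean (and variance). The two weight sums depend on weight alone, which is why passing sample=[2]*3 without weight leaves sum_of_weights_squared equal to the number of entries.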
@olbessid if this is clear can the issue get closed?