mplhep icon indicating copy to clipboard operation
mplhep copied to clipboard

Histplot errorbars

Open meisonlikesicecream opened this issue 2 years ago • 11 comments

I find the documentation of the histplot function [0] slightly confusing/contradictive. I stumbled upon this when I wanted to plot a simple weighted 1D histogram with sqrt(w2) errorbars (which I assumed was the default, but apparently not). I have not checked if the following issues are true for histtypes other than histtype=‘errorbar’.

This line about the yerr parameter "Following modes are supported: - True, sqrt(N) errors or poissonian interval when w2 is specified" [1] makes it sound like if yerr == True, then sqrt(N) is used for the errors if w2 is NOT specified, and that the poissonian interval is used if w2 IS specified. However, this is not the case: it always does the poissonian interval even if w2 is None because in the Plottable class definition it always initialises with the method “poisson” [2], which therefore, will always run the poissonian interval [3]. If w2 is specified, and yerr!=None, then it crashes because of [4].

This leads me to another confusion: which error calculation do we want for weighted histograms? The following line about the w2 parameter "Sum of the histogram weights squared for poissonian interval error calculation" [5] makes it sound like we always want to use the poissonian error for weighted histograms, while this line "If w2 has integer values (likely to be data) poisson interval is calculated, otherwise the resulting error is symmetric sqrt(w2)" [6] makes it sound like the poisson interval should only be used for integer values. Again, "sqrt" is never used if w2method is None because of [3], even if w2 has integer values.

Also, if you specify to use the “sqrt” method, it never uses the variances (i.e. w2) for calculating the errors [7]...

I'm happy submit a PR if someone can explain what the intended usage is :-)

  1. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/plot.py#L57
  2. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/plot.py#L95
  3. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/utils.py#L154
  4. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/utils.py#L183
  5. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/plot.py#L280
  6. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/plot.py#L99
  7. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/plot.py#LL104C12-L106C89
  8. https://github.com/scikit-hep/mplhep/blob/master/src/mplhep/utils.py#L193-L194

meisonlikesicecream avatar May 16 '23 12:05 meisonlikesicecream

For sure the sqrt method should be using variances (sumw2), so that's a bug. As for the rest of the plumbing, there is some attempt to do scaled Poisson in https://github.com/scikit-hep/mplhep/blob/9b47e92944f1d95617b2409d18e1f540edbb77d8/src/mplhep/error_estimation.py#L11-L31 for the case where sumw2 is available. So in practice, it will be very close to sqrt(N) for reasonably large N. Nevertheless, one may prefer to be using exactly sqrt(sumw2) the documentation and the code should be synchronized, and I'm sure the maintainers would be happy to receive a PR to correct this.

In the meantime, you can pass a custom error function to mplhep.histplot(..., w2method=mymethod) where

def mymethod(sumw, sumw2):
    return np.array([sumw - np.sqrt(sumw2), sumw + np.sqrt(sumw2)])

nsmith- avatar Aug 02 '23 13:08 nsmith-

I came across this issue while trying to do another task with the histplot errorbars. Basically, my goal is to not draw the error bars for the bins without any content. Usually, this isn't a huge issue since the bins with content dwarf the bins without content and therefore the error bars on the bins without content do not rise above the x-axis into the figure; however, in the case where I am plotting on a log scale and the bin content traverses many orders of magnitude, the bins without content do show up in the plot.

This problem can be replicated with a short hist creation.

h = hist.Hist.new.Reg(10,0,1).Weight()
h.fill(
    [0., 0.5, 0.6, 0.7, 0.8, 0.9],
    weight = [3., 1e3, 1e4, 1e5, 1e6, 1e7]
)

Now with h defined, we can look at various methods of plotting the error bars.

h.plot()
plt.yscale('log')

eg-bin-errbars

We can mask out the error bars by setting the bins with a nan lo poisson estimate to 0.

def poisson_interval_ignore_empty(sumw, sumw2):
    interval = mplhep.error_estimation.poisson_interval(sumw, sumw2)
    lo, hi = interval[0,...], interval[1,...]
    to_ignore = np.isnan(lo)
    lo[to_ignore] = 0.0
    hi[to_ignore] = 0.0
    return np.array([lo,hi])
mplhep.histplot(h.values(), bins=h.axes[0].edges, w2method = poisson_interval_ignore_empty, w2 = h.variances())
plt.yscale('log')

eg-ignore-empty

The process of developing this solution actually showed me that the current function does not actually set nan to zero. You should use np.isnan(interval) rather than interval == np.nan in the line below. Maybe this tip is just an addition to the docs or maybe it can included as a named method?

https://github.com/scikit-hep/mplhep/blob/566f8bfa6ae129675a1861613d6955d8a520af5b/src/mplhep/error_estimation.py#L53C1-L53C77

tomeichlersmith avatar Nov 28 '23 17:11 tomeichlersmith

It might make sense as an additional method, but it's maybe not trivial if there is 0 uncertainty for 0 events yield in poison stats cf. https://stats.stackexchange.com/questions/427019/confidence-interval-for-mean-of-poisson-with-only-zero-counts

andrzejnovak avatar Nov 29 '23 13:11 andrzejnovak

We also stumbled upon this issue in our plots after some debugging of some weird looking y errors. Is there any deeper reason why the fix would just involved changing the np.sqrt(values) to np.sqrt(variances) in

https://github.com/scikit-hep/mplhep/blob/7fd363d33a154101797c16289ca4f76b8241014c/src/mplhep/utils.py#L477-L478

? (and changing _ to variances obviously)

I guess for applications like plotting data, variances should be identical to values so using the above change would be more often correct than not / currently.

riga avatar Mar 19 '25 07:03 riga

Thanks for reminding me of this @riga I think it got forgotten while hoping for a PR from the OP. The fix should be fine, the existence of the method is basically historical, as a fallback for "simple" histograms like just passing histplot([1,2,3,2,1], ...) but there's no real reason now I think to no assume w2=w for these.

I am curious, can you share your use-case? Because I would assume that the overwhelming majority of "I just want it to work" use-cases want "poisson" and for the "I know what errors I want" there is w2method

andrzejnovak avatar Mar 19 '25 10:03 andrzejnovak

It's actually a more general use case and maybe even rather related to hist.

We are using histogram.plot1d to create our 1D histograms, which is actually just a thin wrapper that instantly invokes mplhep.histplot:

https://github.com/scikit-hep/hist/blob/ce6e5996cb6de4d1e331dff4a911aaf5d4c8e114/src/hist/basehist.py#L538

I would have assumed that, if a weight storage is present, variances and the w2method are set for the histplot call to produce errors based on sqrt(variances). At least that's my (obviously biased and subjective) expectation.

Right now, histplot (or rather the default w2method resolution within Plottable) falls back to poisson even if variances are given. Perhaps it's not too much about what the default behavior in mplhep should be, but rather what information is forwarded from a histogram.


Some background on where we saw an issue: we tried plotting a simple histogram with bin values around 0.1 and variances around 0.0001 and then found

Image

where two of the last three bins show exceedingly high y-errors. For some reason (maybe because the bin values are almost - but not exactly - zero there), those bins (the others maybe as well?) fallback to the (asym.) poisson treatment with an uncertainty of e.g. ~1.8 on a value of 0, which does not match the variance values in these bins.

riga avatar Mar 19 '25 11:03 riga

Gotcha thanks for the clarification. This helps. So the easy issue I see are:

  • not setting rtol in https://github.com/scikit-hep/mplhep/blob/main/src/mplhep/utils.py#L469
  • using sqrt(w) as opposed to sqrt(w2), simple bugfix

The less obvious to me are:

  • yerr behaviour for 0/isclose(0) values, plot or not? We could also plot if "within" a distribution eg [1,0,2] and not if at edges eg [1,3,4,0,0,0] but that feels a bit too much magic
  • What is the correct default for non-integer histograms? Is it actually sqrt(w2)?

Btw I haven't checked, but I assume hist_obj.plot1d() would take w2method correctly.

andrzejnovak avatar Mar 19 '25 13:03 andrzejnovak

Unfortunately, one of the reasons this keeps coming up is because w2method is not being processed correctly although it doesn't appear to be an issue with how hist is calling mplhep.histplot and instead how mplhep.histplot is using the information that is passed in.

Maybe I'm using it wrong?

Python Tests Showing w2method Not Being Called
import hist
import mplhep
import mplhep.error_estimation
import matplotlib.pyplot as plt
import unittest

class w2method_called:
    def __init__(self):
        self.called = False

    def __call__(self, w, w2):
        self.called = True
        return mplhep.error_estimation.poisson_interval(w, w2)


def get_weighted_hist():
    return (
        hist.Hist.new
        .Reg(10,0,1)
        .Weight()
    ).fill(
        [0., 0.5, 0.6, 0.7, 0.8, 0.9],
        weight = [3., 1e3, 1e4, 1e5, 1e6, 1e7]
    )


class TestW2Method(unittest.TestCase):
    def check_plot_method(self, f):
        w2method = w2method_called()
        f(get_weighted_hist(), w2method)
        self.assertTrue(w2method.called)
        plt.clf()

    def test_hist_Hist_plot(self):
        self.check_plot_method(
            lambda h, w2method: h.plot(w2method = w2method)
        )

    def test_histplot_Hist(self):
        self.check_plot_method(
            lambda h, w2method: mplhep.histplot(h, w2method = w2method)
        )

    def test_histplot_vals_bins_variances(self):
        self.check_plot_method(
            lambda h, w2method: mplhep.histplot(
                h.values(), bins = h.axes[0].edges, w2 = h.variances(),
                w2method = w2method
            )
        )

if __name__ == '__main__':
    unittest.main()

Output:

(venv) tom@appa:~/code/histplot-errbars$ python3 w2method.py 
FF.
======================================================================
FAIL: test_hist_Hist_plot (__main__.TestW2Method.test_hist_Hist_plot)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tom/code/histplot-errbars/w2method.py", line 35, in test_hist_Hist_plot
    self.check_plot_method(
  File "/home/tom/code/histplot-errbars/w2method.py", line 31, in check_plot_method
    self.assertTrue(w2method.called)
AssertionError: False is not true

======================================================================
FAIL: test_histplot_Hist (__main__.TestW2Method.test_histplot_Hist)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tom/code/histplot-errbars/w2method.py", line 40, in test_histplot_Hist
    self.check_plot_method(
  File "/home/tom/code/histplot-errbars/w2method.py", line 31, in check_plot_method
    self.assertTrue(w2method.called)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 3 tests in 0.049s

FAILED (failures=2)

Environment:

(venv) tom@appa:~/code/histplot-errbars$ pip freeze hist mplhep
boost-histogram==1.5.1
click==8.1.8
contourpy==1.3.1
cycler==0.12.1
fonttools==4.56.0
hist==2.8.0
histoprint==2.6.0
kiwisolver==1.4.8
matplotlib==3.10.1
mplhep==0.3.58
mplhep_data==0.0.4
numpy==2.2.4
packaging==24.2
pillow==11.1.0
pyparsing==3.2.1
python-dateutil==2.9.0.post0
scipy==1.15.2
six==1.17.0
uhi==0.5.0

tomeichlersmith avatar Mar 19 '25 14:03 tomeichlersmith

@riga is what you see perhaps coming from the "scaled poisson" Garwood interval?

When a bin is zero, the scale of the nearest nonzero bin is substituted to scale the nominal upper bound.

This is coming from the assumption that if a bin is zero then its corresponding sumw2 will also be zero. Perhaps that is too strong of an assumption?

nsmith- avatar Mar 20 '25 15:03 nsmith-

Ok, so

  • using sqrt(w) as opposed to sqrt(w2), simple bugfix

and

Python Tests Showing w2method Not Being Called

should be fixed here https://github.com/scikit-hep/mplhep/pull/558

In the meantime I will leave this issue open in case we want to continue the discussion about sensible defaults for examples such as the one brough by @riga. Here's a short summary of the current behaviour

Code
import mplhep as hep
import matplotlib.pyplot as plt
import numpy as np
import hist

fig, ax = plt.subplots()
np.random.seed(0)
evts = np.random.normal(2,2,100)
weights = np.random.uniform(0.99, 1.01, 100)
htype1 = hist.new.Reg(5, 0, 5).Weight().fill(evts)
htype1.plot(ax=ax, histtype='errorbar', capsize=4, label='Nominal')
htype2 = htype1.copy()
htype1.view().value[2] = 0.5
htype1.view().variance[2] = 10
htype1.plot(ax=ax, histtype='errorbar', capsize=4, label='Int w2')
htype2.view().value[3] = 0.5
htype2.view().variance[3] = 100.01
htype2.plot(ax=ax, histtype='errorbar', capsize=4, label='Non int w2')
ax.legend()

Image

andrzejnovak avatar Mar 21 '25 12:03 andrzejnovak

@riga is what you see perhaps coming from the "scaled poisson" Garwood interval?

This is coming from the assumption that if a bin is zero then its corresponding sumw2 will also be zero. Perhaps that is too strong of an assumption?

Sorry for the delay. Looking again at what happens when a bin is zero, I think so, yes (to both). I guess with #558 this should be better handled now, and it's indeed just a matter of what is a sensible default.

riga avatar Apr 18 '25 11:04 riga