pyhf.infer.intervals.upper_limits.upper_limit ignores the set confidence level when scan=None
Summary
The method pyhf.infer.intervals.upper_limits.upper_limit ignores the set confidence level via the level parameter if scan=None, but it always computes the upper limits at a 95% CL (level=0.05). This happens with pyhf 0.7.6 and very likely also with older versions (though I did not check them), and the bug is present in the main branch as well.
The reason is simple: here https://github.com/scikit-hep/pyhf/blob/main/src/pyhf/infer/intervals/upper_limits.py#L260 we don't pass the parameter level to the function toms748_scan. Passing **hypotest_kwargs is not enough for passing also level.
The effect is that the upper limits with scan=None (aka using the toms748 algorithm) are always computed using the default value level=0.05 set here https://github.com/scikit-hep/pyhf/blob/main/src/pyhf/infer/intervals/upper_limits.py#L26.
The following patch is enough to fix the main branch:
diff --git a/src/pyhf/infer/intervals/upper_limits.py b/src/pyhf/infer/intervals/upper_limits.py
index 6b86d58..047214f 100644
--- a/src/pyhf/infer/intervals/upper_limits.py
+++ b/src/pyhf/infer/intervals/upper_limits.py
@@ -262,6 +262,7 @@ def upper_limit(
model,
bounds[0],
bounds[1],
+ level,
from_upper_limit_fn=True,
**hypotest_kwargs,
)
OS / Environment
Irrelevant
Steps to Reproduce
Using pyhf 0.7.6:
import numpy as np
import pyhf
model = pyhf.simplemodels.uncorrelated_background(
signal=[12.0, 11.0], bkg=[50.0, 52.0], bkg_uncertainty=[3.0, 7.0]
)
observations = [51, 48]
data = pyhf.tensorlib.astensor(observations + model.config.auxdata)
scan = np.linspace(0, 5, 21)
# 95% C.L.
obs_limit, exp_limits, _ = pyhf.infer.intervals.upper_limits.upper_limit(
data, model, scan=None, return_results=True
)
print('toms748, 95%CL', obs_limit, exp_limits)
obs_limit, exp_limits, _ = pyhf.infer.intervals.upper_limits.upper_limit(
data, model, scan=scan, return_results=True
)
print('linear grid, 95%CL', obs_limit, exp_limits)
# 90% C.L.
obs_limit, exp_limits, _ = pyhf.infer.intervals.upper_limits.upper_limit(
data, model, scan=None, level=0.1, return_results=True
)
print('toms748, 90%CL', obs_limit, exp_limits)
obs_limit, exp_limits, _ = pyhf.infer.intervals.upper_limits.upper_limit(
data, model, scan=scan, level=0.1, return_results=True
)
print('linear grid, 90%CL', obs_limit, exp_limits)
File Upload (optional)
No response
Expected Results
Getting the 90% CL upper limits when scan=None as requested.
Actual Results
toms748, 95%CL 1.0115693894583466 [array(0.55988001), array(0.75702336), array(1.06234693), array(1.50116924), array(2.05078782)]
linear grid, 95%CL 1.0176408862843853 [array(0.59576921), array(0.76169166), array(1.08504773), array(1.50170482), array(2.06654952)]
toms748, 90%CL 1.0115693894583466 [array(0.55988001), array(0.75702336), array(1.06234693), array(1.50116924), array(2.05078782)]
linear grid, 90%CL 0.8635353695838466 [array(0.46812832), array(0.64311021), array(0.90865319), array(1.3133953), array(1.85105345)]
pyhf Version
pyhf, version 0.7.6
Code of Conduct
- [X] I agree to follow the Code of Conduct
Thanks for the bug report and diagnosis, @GiacomoXT. We'll want to go through and check to see if there is a historical reason for this that might also require an update to any of the lower bounds of our requirements.
I think all of the pyhf team is either currently traveling for NeurIPS 2024, or hit with some deadlines so this might not get looked at sufficiently until later this week. (You might notice that we're majorly behind on some CI health maintenance as well.)
Thanks @matthewfeickert for the quick feedback! And don't worry, take your time. ;)
No, this is just an oversight from #1274. This is a very nice catch. You'll notice this isn't as intended given that we do pass the level through in successive calls anyhow.