Improve error messages/hints when "Estimated distance to minimum too large."
Summary
When running fits, such as when throwing toys, one might run into a stack trace like so:
```
Signal-like: 73%|███████████████████████████████████████▎ | 2182/3000 [00:43<00:15, 51.47toy/s]ERROR:pyhf.optimize.mixins: corr: None
     fun: 27.458922944973413
hess_inv: None
 message: 'Optimization failed. Estimated distance to minimum too large.'
  minuit: <FMin algorithm='Migrad' edm=12615.800489965679 edm_goal=0.0002 errordef=1.0 fval=27.458922944973413 has_accurate_covar=True has_covariance=True has_made_posdef_covar=False has_parameters_at_limit=True has_posdef_covar=True has_reached_call_limit=False has_valid_parameters=False hesse_failed=False is_above_max_edm=True is_valid=False nfcn=132 ngrad=2 reduced_chi2=nan time=0.0035549210000027642>
(Param(number=0, name='totalError', value=0.11608174240322071, error=0.5087364149268865, merror=None, is_const=False, is_fixed=False, lower_limit=-5.0, upper_limit=5.0), Param(number=1, name='mu_SRSF_high_iMT2c', value=4.114644992884573, error=94.68506336071032, merror=None, is_const=False, is_fixed=False, lower_limit=0.0, upper_limit=100.0))
[[ 2.59711312e-01 -7.92231022e-01]
 [-7.92231022e-01  3.36374254e+05]]
    nfev: 132
    njev: 2
 success: False
     unc: None
       x: <ValueView totalError=0.11608174240322071 mu_SRSF_high_iMT2c=4.114644992884573>
Traceback (most recent call last):
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError
Traceback (most recent call last):
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_toys.py", line 11, in <module>
    n_samples = 3000
  File "/Users/kratsg/pyhf/src/pyhf/infer/__init__.py", line 171, in hypotest
    sig_plus_bkg_distribution, bkg_only_distribution = calc.distributions(poi_test)
  File "/Users/kratsg/pyhf/src/pyhf/infer/calculators.py", line 801, in distributions
    value = teststat_func(
  File "/Users/kratsg/pyhf/src/pyhf/infer/test_statistics.py", line 236, in qmu_tilde
    return _qmu_like(
  File "/Users/kratsg/pyhf/src/pyhf/infer/test_statistics.py", line 27, in _qmu_like
    tmu_like_stat, (mubhathat, muhatbhat) = _tmu_like(
  File "/Users/kratsg/pyhf/src/pyhf/infer/test_statistics.py", line 51, in _tmu_like
    muhatbhat, unconstrained_fit_lhood_val = fit(
  File "/Users/kratsg/pyhf/src/pyhf/infer/mle.py", line 132, in fit
    return opt.minimize(
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 194, in minimize
    result = self._internal_minimize(
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 67, in _internal_minimize
    raise exceptions.FailedMinimization(result)
pyhf.exceptions.FailedMinimization: Optimization failed. Estimated distance to minimum too large.
```
After discussing with @alexander-held offline, the underlying issue, which is a bit hard to parse out of the output above, likely comes from this portion:
```
Param(number=1, name='mu_SRSF_high_iMT2c', value=4.114644992884573, error=94.68506336071032, merror=None, is_const=False, is_fixed=False, lower_limit=0.0, upper_limit=100.0)
```
where the parameter of interest has $\mu \in [0, 100]$ but the fit reports an uncertainty such that $\hat{\mu} = 4 \pm 94.68$. In this case the Hessian is not used, so we rely on whatever Migrad computes for the uncertainty. It's not entirely clear exactly how it fails, but the likely culprit is that we are simply too close to the bounds to estimate the uncertainty correctly.
There is a nice presentation from @alexander-held on improving the $\tilde{q}_\mu$ definition which could definitely be considered (https://indico.cern.ch/event/1336439/contributions/5629384/attachments/2734750/4755122/20231016_qmutilde.pdf).
Regardless, since we already know the estimated distance to minimum is too large, we might as well parse the fit result to determine which parameter is having trouble with its error estimation and automatically give a better hint.
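As a rough sketch of what such a hint could look like (the function name `suggest_problem_params` and the threshold heuristic, flagging any parameter whose reported error spans a large fraction of its allowed range, are hypothetical, not existing pyhf API):

```python
def suggest_problem_params(params, rel_error_threshold=0.5):
    """Flag parameters whose error spans a large fraction of the allowed range.

    ``params`` is an iterable of objects with ``name``, ``error``,
    ``lower_limit``, and ``upper_limit`` attributes, like iminuit's Params.
    """
    hints = []
    for par in params:
        # Skip parameters without two-sided bounds
        if par.lower_limit is None or par.upper_limit is None:
            continue
        width = par.upper_limit - par.lower_limit
        if width > 0 and par.error / width > rel_error_threshold:
            hints.append(
                f"{par.name}: error {par.error:.2f} spans "
                f"{par.error / width:.0%} of the allowed range "
                f"[{par.lower_limit}, {par.upper_limit}]"
            )
    return hints
```

Run against the fit result above, this would flag `mu_SRSF_high_iMT2c`, whose error of ~94.7 spans nearly the full $[0, 100]$ range, while leaving `totalError` alone.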
Additional Information
No response
Code of Conduct
- [X] I agree to follow the Code of Conduct
Here's an example where one runs into this issue.
```python
import pyhf
import json
import numpy as np

pyhf.set_backend("jax", pyhf.optimize.minuit_optimizer(verbose=0))

ws = pyhf.Workspace(json.load(open('simplified.json')))
model = ws.model()
data = ws.data(model)

n_samples = 3000

result = pyhf.infer.hypotest(
    1.0,
    data,
    model,
    init_pars=model.config.suggested_init(),
    par_bounds=model.config.suggested_bounds(),
    fixed_params=model.config.suggested_fixed(),
    calctype="toybased",
    return_expected_set=True,
    ntoys=n_samples,
)
```
This could maybe be improved with something like a better set of initial parameters(?), but to first order, a more helpful or instructive error message would be nice.
See #2352.
Continuing on this thread -- I ran with this code using the `toy_handling_test` branch:
```python
import pyhf
import json
import numpy as np

pyhf.set_backend("jax", pyhf.optimize.minuit_optimizer(verbose=0))

model = pyhf.simplemodels.correlated_background(
    [1], [16.69251820997423], [17.87039285108563], [15.514643568862834]
)
data = [17.352095513005104] + model.config.auxdata

n_samples = 3000
mu_low = 2
mu_high = 17
steps = (mu_high - mu_low) + 1
mu_tests = pyhf.tensorlib.astensor(np.linspace(mu_low, mu_high, steps))
bounds = [(-10.0, 10.0), (0, 10)]

obs_limit, exp_limit_and_bands, (poi_tests, tests) = pyhf.infer.intervals.upper_limits.upper_limit(
    data,
    model,
    scan=mu_tests,
    level=0.05,
    return_results=True,
    calctype="toybased",
    ntoys=n_samples,
    skip_failing_toys=True,
    par_bounds=bounds,
)
```
I added some lines of code to print out the failing parameters for various $\mu$ values: https://github.com/scikit-hep/pyhf/blob/89a5e54c33001ce832ae26108dbd39c4cbcd3cfb/src/pyhf/infer/calculators.py#L798-L835. I see the following for $\mu = 2$:
```
signal failed for: [20.         -0.12619636]
ERROR:pyhf.optimize.mixins: corr: None
     fun: 6.755097351198641
hess_inv: None
 message: 'Optimization failed. Estimated distance to minimum too large.'
  minuit:
(Param(number=0, name='correlated_bkg_uncertainty', value=-0.008590994291151866, error=12.345295280596073, merror=None, is_const=False, is_fixed=False, lower_limit=-10.0, upper_limit=10.0), Param(number=1, name='mu', value=4.446755458374069, error=6.963598440174723, merror=None, is_const=False, is_fixed=False, lower_limit=0.0, upper_limit=10.0))
[[ 1.22807491e+05 -7.54394138e+01]
 [-7.54394138e+01  2.46939699e+01]]
    nfev: 118
    njev: 2
 success: False
     unc: None
       x:
Traceback (most recent call last):
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError
signal failed for: [20.         -0.12619636]
background failed for: [14.         -2.43142359]
Background-like: 40%|████████████▍ | 1208/3000 [00:18
(Param(number=0, name='correlated_bkg_uncertainty', value=-2.4216473247403587, error=13.770772036856567, merror=None, is_const=False, is_fixed=False, lower_limit=-10.0, upper_limit=10.0), Param(number=1, name='mu', value=0.0529597061149767, error=5.528860725967594, merror=None, is_const=False, is_fixed=False, lower_limit=0.0, upper_limit=10.0))
[[  144.8300619  -3459.68901263]
 [-3459.68901263 82887.52075383]]
    nfev: 24
    njev: 2
 success: False
     unc:
       x:
Traceback (most recent call last):
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError
background failed for: [14.         -2.43142359]
background failed for: [13.         -3.19383645]
Background-like: 92%|████████████████████████████▌ | 2760/3000 [00:41
(Param(number=0, name='correlated_bkg_uncertainty', value=-3.189730280970227, error=13.405134146519327, merror=None, is_const=False, is_fixed=False, lower_limit=-10.0, upper_limit=10.0), Param(number=1, name='mu', value=0.021148548763401442, error=7.268148882759533, merror=None, is_const=False, is_fixed=False, lower_limit=0.0, upper_limit=10.0))
[[  139.60901142 -3091.52888608]
 [-3091.52888608 68657.70002162]]
    nfev: 26
    njev: 2
 success: False
     unc:
       x:
Traceback (most recent call last):
  File "/Users/kratsg/pyhf/src/pyhf/optimize/mixins.py", line 64, in _internal_minimize
    assert result.success
AssertionError
background failed for: [13.         -3.19383645]
```
It's very much not clear whether the problem ends up being that, when we throw toys, we run into $n$-sigma cases (for $n \geq 5$) where the bounds on our parameters are too restrictive, and Minuit simply states that it is unable to converge or fit the data generated here. In this case, it seems the background would need to fluctuate by 6-7 sigma in order to fit the data we're trying to fit to (the output of `pdf.sample()`).
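To illustrate that suspicion outside of pyhf, here is a minimal sketch (all numbers are made up and this is not the model from this issue) of a bounded single-bin Poisson fit where the toy fluctuates beyond anything the bounded model can reach, so the POI pins at its upper limit:

```python
import numpy as np
from scipy.optimize import minimize

# Single-bin counting model: expected = bkg + mu * sig,
# with the POI bounded to mu in [0, 5].
bkg, sig = 50.0, 10.0
bounds = [(0.0, 5.0)]

def nll(pars, observed):
    """Poisson negative log-likelihood (up to a constant)."""
    expected = bkg + pars[0] * sig
    return expected - observed * np.log(expected)

# A toy that fluctuates far above anything the bounds allow: even
# mu = 5 only gives expected = 100, while the toy observed 150.
# The unbounded MLE would be mu = (150 - 50) / 10 = 10, so the
# bounded fit drives mu into the upper limit, which is exactly the
# regime where uncertainty estimates degrade.
result = minimize(nll, x0=[1.0], args=(150.0,), bounds=bounds, method="L-BFGS-B")
print(result.x[0])  # mu ends up at the upper bound
```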
/cc @mhance