
docs: learn nb on covariance matrices

Open lukasheinrich opened this issue 4 years ago • 8 comments

Description

I spent some more time with the MINUIT code to see how it does the covariance calculation, and I am putting the lessons into a notebook. What initially confused me is how it recomputes the full Hessian only by transforming the Hessian in the internal coordinates. It turns out it uses a property that only holds at the minimum (the gradient vanishes there).

This is just to document some of these things, in order to make full covariances available in non-MINUIT backends. cc @alexander-held
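
To make that concrete, here is a minimal sketch (toy 1D objective; the sin-based transform mimics MINUIT's internal/external mapping for two-sided limits [0, 10]). With x = f(z), the chain rule gives d²L/dz² = f'(z)² · d²L/dx² + f''(z) · dL/dx, and the second term vanishes at the minimum because dL/dx = 0 there, so transforming with the Jacobian alone is enough:

import jax
import jax.numpy as jnp

def external_objective(x):
    # toy objective with minimum at x = 3 and constant Hessian d²L/dx² = 2
    return (x - 3.0) ** 2

def to_external(z):
    # MINUIT-style bounds transform for limits [0, 10]: x = (b - a)/2 * (sin z + 1) + a
    return 5.0 * (jnp.sin(z) + 1.0)

def internal_objective(z):
    return external_objective(to_external(z))

z_min = jnp.arcsin(3.0 / 5.0 - 1.0)  # internal coordinate of the minimum x = 3
hess_internal = jax.hessian(internal_objective)(z_min)
jacobian = jax.grad(to_external)(z_min)

# at the minimum, the Jacobian transform alone recovers the external Hessian
print(hess_internal / jacobian**2)  # ≈ 2.0, matching d²L/dx² computed directly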

lukasheinrich · Dec 21 '20 16:12


Codecov Report

Merging #1233 (5ef0781) into master (fd7930c) will not change coverage. The diff coverage is n/a.


@@           Coverage Diff           @@
##           master    #1233   +/-   ##
=======================================
  Coverage   97.45%   97.45%           
=======================================
  Files          63       63           
  Lines        3700     3700           
  Branches      524      524           
=======================================
  Hits         3606     3606           
  Misses         55       55           
  Partials       39       39           
Flag Coverage Δ
unittests 97.45% <ø> (ø)


codecov[bot] · Dec 21 '20 16:12

I added a second notebook on errors (the calculation from #764), and we see compatible errors across the various ways of calculating them.

Maybe a question for @alexander-held: is it important for your use case to have MINUIT-like uncertainties, or would uncertainties based on the standard covariance matrix (the blue dashed line) suffice? Separately from this, we can implement MINOS-like uncertainties by doing a line search.
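
Concretely, a minimal sketch of what I mean by "standard covariance matrix" uncertainties (illustrative single-bin model and observed count, not the notebook's exact setup): invert the Hessian of the NLL at the best-fit point and take the square roots of the diagonal.

import jax
import jax.numpy as jnp
import pyhf

pyhf.set_backend("jax")
model = pyhf.simplemodels.hepdata_like([10.0], [50.0], [7.0])
data = jnp.asarray([55.0] + model.config.auxdata)

def nll(pars):
    # model.logpdf returns a length-1 array, so index into it for a scalar
    return -model.logpdf(pars, data)[0]

best_fit = jnp.asarray(pyhf.infer.mle.fit(data, model))
covariance = jnp.linalg.inv(jax.hessian(nll)(best_fit))
uncertainties = jnp.sqrt(jnp.diag(covariance))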

[figure: comparison of the uncertainty calculations; the blue dashed line is the standard covariance matrix result]

lukasheinrich · Dec 22 '20 15:12

fwiw, this reproduces the errors in the 1Lbb likelihood (though the MINUIT specifics seem not to matter).

[figure: reproduced errors for the 1Lbb likelihood]

lukasheinrich · Dec 22 '20 15:12

In the minuit_errors.ipynb notebook the objective is 2·NLL, but errordef=0.5 is used - I think errordef=1 is needed when the objective is 2·NLL rather than the NLL itself.
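
For reference, a minimal iminuit sketch of the convention (toy objective, not the notebook's likelihood): errordef=0.5 pairs with an NLL objective, errordef=1 with a 2·NLL (chi-square-like) one, otherwise the parabolic errors are off by a factor of √2.

from iminuit import Minuit

def twice_nll(mu):
    # toy 2*NLL with minimum at mu = 1 and curvature 4
    return 2.0 * (mu - 1.0) ** 2

m = Minuit(twice_nll, mu=0.0)
m.errordef = 1.0  # appropriate for 2*NLL; errordef=0.5 would shrink the error by √2
m.migrad()
m.hesse()
print(m.errors["mu"])  # ≈ 0.71, where 2*NLL rises by errordef above its minimum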

alexander-held · Dec 22 '20 16:12

I did not understand the last entry in the legend in the plots above, which seemed to show completely different behavior. For me, all the implementations look much more similar when using this notebook (and after fixing the index in the plotting, so that the second parameter is shown): [figure: uncertainty comparison across implementations]

I thought the discrepancy might be due to boundary effects, since the POI is limited to [0, 10] and hits both ends of this range (at 50 and 70 observed events).

I tried out

import pyhf
import jax.numpy as jnp

pdf = pyhf.simplemodels.hepdata_like([10.0], [500.0], [5.0])
scan = jnp.linspace(515, 545, 10)

which is further away from the boundaries, but I am surprised that there is still a substantial difference: [figure: uncertainty comparison away from the boundaries]

Increasing all yields by a factor of 10 (with constant background uncertainty) makes the discrepancy smaller, and with a factor of 100 it slowly disappears.
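
Sketched, that scaling check looks like this (keeping the absolute background uncertainty at 5.0):

import pyhf

for factor in (1.0, 10.0, 100.0):
    # scale signal and background yields together, keep the absolute
    # background uncertainty fixed at 5.0
    pdf = pyhf.simplemodels.hepdata_like([10.0 * factor], [500.0 * factor], [5.0])
    # ... rerun the error comparison from the notebook with this pdf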

Is the 1Lbb statistically limited? Maybe this is some systematics effect that causes the differences? In any case, I need to understand that other notebook better, since I did not expect such large differences.

alexander-held · Dec 22 '20 17:12

The last entry (brown) is using MINUIT but supplying the AD gradient. This uses strategy 0, so perhaps it is stopping earlier and therefore the underlying Hessian is different.
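
A toy iminuit sketch of the strategy effect (not the 1Lbb likelihood; as I understand the MINUIT strategy setting, strategy 0 skips the accurate Hesse computation at the end of MIGRAD, so the reported parabolic errors come from MIGRAD's running approximation until hesse() is called explicitly):

from iminuit import Minuit

def nll(mu):
    return (mu - 1.0) ** 2  # toy NLL with minimum at mu = 1

m = Minuit(nll, mu=0.0)
m.errordef = 0.5
m.strategy = 0  # the setting pyhf selects when an exact (AD) gradient is supplied
m.migrad()
print(m.errors["mu"])  # error from MIGRAD's internal approximation
m.hesse()               # force the full Hessian computation
print(m.errors["mu"])  # refined parabolic error, ≈ 0.71 here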

lukasheinrich · Dec 22 '20 17:12

Thanks, the strategy was the difference between our setups. I was running pyhf 0.5.4, which does not yet include the switch to strategy 0 when gradients are provided. I am surprised that the difference is this large. I thought the HESSE call would help, but maybe the convergence fails completely? Strange, since this likelihood is relatively simple.

alexander-held · Dec 22 '20 17:12