Hypotest returning nan for CLs observed

Open backmanification opened this issue 3 years ago • 1 comment

Summary

When I run hypotest, nan is returned for the observed CLs. I am running over a large set of sample points in a multi-dimensional grid. These points are expected to return rather high CLs values, but instead they return nan.

OS / Environment

Software:

    System Software Overview:

      System Version: macOS 12.1 (21C52)
      Kernel Version: Darwin 21.2.0
      Time since boot: 9 days 2:33

Steps to Reproduce

source setup.sh
python min_pyhf.py --dsid 448047

File Upload (optional)

pyhf_negCLs.zip

Expected Results

I expected to get numbers for all CLs values, or an error message, but I just get nan for the observed CLs.

Actual Results

Starting pyhf job, with the following scipy and pyhf versions
1.8.0
0.6.3
Python path: /usr/local/Cellar/root/6.22.08/lib/root
WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
CLs obs: nan
CLs d2s: 1.0
CLs d1s: 1.0
CLs exp: 1.0
CLs u1s: 1.0
CLs u2s: 1.0

pyhf Version

pyhf, version 0.6.3

Code of Conduct

[X] I agree to follow the Code of Conduct

May 02 '22 07:05 backmanification

I think nan can happen when your CLs is expected to be very high. I also wanted to point out that your min_pyhf.py script just repeats functionality that already exists in the pyhf cls command-line interface.

python min_pyhf.py --dsid 448047

is equivalent to

pyhf cls --backend jax 448047.json

which gives

$ pyhf cls --backend jax 448047.json 
WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
{
    "CLs_exp": [
        1.0,
        1.0,
        1.0,
        1.0,
        1.0
    ],
    "CLs_obs": NaN
}
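
For a scan over many grid points, the same test can also be driven from Python so that a nan is reported as an error instead of passing silently. This is only a minimal sketch: `check_cls` and `run_point` are hypothetical helper names, the labels assume the expected band is returned in the order -2σ, -1σ, median, +1σ, +2σ, and pyhf with the jax backend is assumed to be installed.

```python
import json
import math


def check_cls(cls_obs, cls_exp):
    """Return the labels of all CLs values that are not finite numbers."""
    # Label order is an assumption: observed, then the expected band
    # from -2 sigma to +2 sigma.
    labels = ["obs", "d2s", "d1s", "exp", "u1s", "u2s"]
    values = [cls_obs, *cls_exp]
    return [
        label
        for label, value in zip(labels, values)
        if not math.isfinite(float(value))
    ]


def run_point(path, mu=1.0):
    """Run the hypothesis test for one workspace file (requires pyhf)."""
    import pyhf  # imported lazily so check_cls works without pyhf

    pyhf.set_backend("jax")
    with open(path) as f:
        workspace = pyhf.Workspace(json.load(f))
    model = workspace.model()
    data = workspace.data(model)
    cls_obs, cls_exp = pyhf.infer.hypotest(
        mu, data, model, test_stat="qtilde", return_expected_set=True
    )
    bad = check_cls(cls_obs, cls_exp)
    if bad:
        # Fail loudly rather than writing nan into the scan results.
        raise ValueError(f"non-finite CLs for {path}: {bad}")
    return float(cls_obs), [float(v) for v in cls_exp]
```

Calling `run_point("448047.json")` would then raise for this workspace rather than return nan, which makes bad grid points easy to collect in a loop.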

Apart from that, a quick look at the workspace indicates low stats, but also some weird things going on, like the jer/jes uncertainties being all zeros

            {
              "data": {
                "hi_data": [
                  0,
                  0
                ],
                "lo_data": [
                  0,
                  0
                ]
              },
              "name": "jes_uncerts",
              "type": "histosys"
            },
            {
              "data": {
                "hi_data": [
                  0,
                  0
                ],
                "lo_data": [
                  0,
                  0
                ]
              },
              "name": "jer_uncerts",
              "type": "histosys"
            }
          ],
          "name": "bsm_signal"
        },

and that channel1 has a sample with zero expected events

      "name": "channel1",
      "samples": [
        {
          "data": [
            0,
            0
          ],

Is this really expected? We can't do much in pyhf to protect against workspaces that seem to be weirdly designed (and it's not quite in our scope).
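
Since pyhf itself does not guard against this, a check like the following could be run over each workspace before the scan. It is a rough sketch that only looks at the plain JSON structure (channels, samples, modifiers) and flags the two patterns mentioned above. `find_suspicious_entries` is a hypothetical helper, not part of pyhf, and the toy spec is made up for illustration.

```python
def find_suspicious_entries(spec):
    """List samples with all-zero yields and histosys modifiers whose
    hi_data and lo_data are all zero, given a workspace as a dict."""
    findings = []
    for channel in spec.get("channels", []):
        for sample in channel.get("samples", []):
            where = f"{channel['name']}/{sample['name']}"
            # any() is False when every bin is 0 (or the list is empty).
            if not any(sample.get("data", [])):
                findings.append(f"{where}: all-zero yields")
            for modifier in sample.get("modifiers", []):
                if modifier.get("type") != "histosys":
                    continue
                data = modifier.get("data") or {}
                hi = data.get("hi_data", [])
                lo = data.get("lo_data", [])
                if not any(hi) and not any(lo):
                    findings.append(
                        f"{where}: histosys '{modifier['name']}' is all zeros"
                    )
    return findings


# Toy workspace fragment (made-up numbers, same shape as the excerpts above).
toy_spec = {
    "channels": [
        {
            "name": "channel1",
            "samples": [
                {
                    "name": "bsm_signal",
                    "data": [0, 0],
                    "modifiers": [
                        {
                            "name": "jes_uncerts",
                            "type": "histosys",
                            "data": {"hi_data": [0, 0], "lo_data": [0, 0]},
                        },
                        {"name": "mu", "type": "normfactor", "data": None},
                    ],
                }
            ],
        }
    ]
}

for finding in find_suspicious_entries(toy_spec):
    print(finding)
```

With a real file, the spec would come from `json.load` on the workspace JSON. Whether a flagged entry is actually a problem is a physics question; the check only points at places worth a second look.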

May 02 '22 16:05 kratsg