
Staterror issues with partially fixed bins due to zero yields

Open alexander-held opened this issue 2 years ago • 3 comments

Summary

The changes to staterror handling after 0.6.3 can result in crashes due to the unsupported treatment of partially fixed parameters (modifiers that are fixed in some bins but not in others). This causes https://github.com/scikit-hep/pyhf/blob/6f8d87e65e2b9b6033974ef18cbb4d9bf62f3dd8/src/pyhf/parameters/paramsets.py#L44-L46 to be raised.

cc @ekourlit

OS / Environment

n/a

Steps to Reproduce

import pyhf

# zero nominal yield, non-zero staterror
spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [0.0, 1.0],
                    "modifiers": [
                        {
                            "data": [0.1, 0.1],
                            "name": "staterror_SR",
                            "type": "staterror"
                        }
                    ],
                    "name": "sample_1"
                }
            ]
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": ""}, "name": "measurement"}],
    "observations": [{"data": [0.0, 1.0], "name": "SR"}],
    "version": "1.0.0"
}
model = pyhf.Workspace(spec).model()
for parameter in model.config.par_order:
    try:
        print(parameter, model.config.param_set(parameter).suggested_fixed_as_bool)
    except RuntimeError as e:
        print(e)


# non-zero nominal yield, zero staterror
spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [0.1, 1.0],
                    "modifiers": [
                        {
                            "data": [0.0, 0.1],
                            "name": "staterror_SR",
                            "type": "staterror"
                        }
                    ],
                    "name": "sample_1"
                }
            ]
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": ""}, "name": "measurement"}],
    "observations": [{"data": [0.0, 1.0], "name": "SR"}],
    "version": "1.0.0"
}
model = pyhf.Workspace(spec).model()
for parameter in model.config.par_order:
    try:
        print(parameter, model.config.param_set(parameter).suggested_fixed_as_bool)
    except RuntimeError as e:
        print(e)


# zero nominal yield, zero staterror
spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [0.0, 1.0],
                    "modifiers": [
                        {
                            "data": [0.0, 0.1],
                            "name": "staterror_SR",
                            "type": "staterror"
                        }
                    ],
                    "name": "sample_1"
                }
            ]
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": ""}, "name": "measurement"}],
    "observations": [{"data": [0.0, 1.0], "name": "SR"}],
    "version": "1.0.0"
}
model = pyhf.Workspace(spec).model()
for parameter in model.config.par_order:
    try:
        print(parameter, model.config.param_set(parameter).suggested_fixed_as_bool)
    except RuntimeError as e:
        print(e)


# example that works: non-zero data and zero staterror for one sample,
# zero data and non-zero staterror for another sample
spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [0.1, 1.0],
                    "modifiers": [
                        {
                            "data": [0.0, 0.1],
                            "name": "staterror_SR",
                            "type": "staterror"
                        }
                    ],
                    "name": "sample_1"
                },
                {
                    "data": [0.0, 1.0],
                    "modifiers": [
                        {
                            "data": [0.1, 0.1],
                            "name": "staterror_SR",
                            "type": "staterror"
                        }
                    ],
                    "name": "sample_2"
                }
            ]
        }
    ],
    "measurements": [{"config": {"parameters": [], "poi": ""}, "name": "measurement"}],
    "observations": [{"data": [0.0, 1.0], "name": "SR"}],
    "version": "1.0.0"
}
model = pyhf.Workspace(spec).model()
for parameter in model.config.par_order:
    try:
        print(parameter, model.config.param_set(parameter).suggested_fixed_as_bool)
    except RuntimeError as e:
        print(e)

File Upload (optional)

No response

Expected Results

The script above should run without errors.

Actual Results

[True, False] is neither all-True nor all-False, so not compressible
[True, False] is neither all-True nor all-False, so not compressible
[True, False] is neither all-True nor all-False, so not compressible
staterror_SR False

pyhf Version

pyhf, version 0.7.0rc2.dev18

Code of Conduct

  • [X] I agree to follow the Code of Conduct

alexander-held avatar Aug 12 '22 21:08 alexander-held

I should keep this in mind when I work on fixing Issue #1720 for v0.7.0rc2.

matthewfeickert avatar Aug 12 '22 21:08 matthewfeickert

I was wondering whether this is related to that issue. It seems a bit different to me: I understand why it makes sense to set modifiers to constant in bins where the nominal yield or the staterror is zero. It may be that the handling of such cases just needs to be supported explicitly.

alexander-held avatar Aug 12 '22 21:08 alexander-held

I was wondering whether this is related to that issue.

I'm pretty sure it is different. Since we clearly have problems with staterror, I'm just trying to loosely connect them for easier reference. I'll make an issue for that now and do it properly.

matthewfeickert avatar Aug 12 '22 21:08 matthewfeickert

@alexander-held I'm trying to figure out what you actually expect this function to return.

>>> model.config.param_set(parameter).suggested_fixed
[True, False]

is giving you what I would expect. suggested_fixed_as_bool() -> bool returns a single boolean for the whole list, which could perhaps be treated like an np.all() result ("True if all True, else False"). But this is (I think) correctly undefined behavior. The current behavior is:

  • "if all True, return True"
  • "elif all False, return False"
  • "else raise error"
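The three-way behavior described in the bullets above can be sketched in a few lines. This is a hypothetical stand-in, not pyhf's actual implementation; the function name and error message mirror the ones seen in this issue.

```python
def suggested_fixed_as_bool(suggested_fixed):
    """Collapse a per-bin fixed mask to a single bool.

    Mirrors the behavior described above: all-True -> True,
    all-False -> False, anything mixed raises.
    """
    if all(suggested_fixed):
        return True
    if not any(suggested_fixed):
        return False
    # mixed masks like [True, False] cannot be compressed to one bool
    raise RuntimeError(
        f"{suggested_fixed} is neither all-True nor all-False, so not compressible"
    )
```

With a mixed mask such as `[True, False]` this raises the RuntimeError reported in the Actual Results section, which is why the first three reproducer specs fail while the fourth (all bins floating) succeeds.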

kratsg avatar Aug 27 '22 15:08 kratsg

@kratsg I agree. I have thought about this a bit more, and there is nothing else I can think of that this API should do in such cases. I ran into this because I have been using suggested_fixed_as_bool in cabinetry. That worked fine previously, since this case of mixed constant / floating parameters was not supported by pyhf. The solution for the issue I ran into is to switch cabinetry to the correct API, which is suggested_fixed.
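The fix described above — consuming the per-bin suggested_fixed list directly instead of collapsing it to a single bool — can be sketched as follows. The Paramset class here is a minimal stand-in for illustration, not pyhf's actual paramset object.

```python
class Paramset:
    """Minimal stand-in for a pyhf paramset exposing suggested_fixed."""

    def __init__(self, suggested_fixed):
        self.suggested_fixed = suggested_fixed


def collect_fixed_flags(paramsets):
    """Gather one fixed flag per parameter component.

    Because each bin keeps its own flag, mixed masks like [True, False]
    need no "compression" and never raise.
    """
    flags = []
    for ps in paramsets:
        flags.extend(ps.suggested_fixed)
    return flags
```

A downstream consumer like cabinetry can then pass the flat per-component list to the fit, covering the partially fixed staterror case that suggested_fixed_as_bool cannot represent.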

Feel free to close, the current behavior makes sense to me now.

alexander-held avatar Aug 27 '22 19:08 alexander-held

Closing this as "won't fix".

kratsg avatar Aug 27 '22 20:08 kratsg