Validate shape of observations in workspace
Summary
There is currently no validation of the shape of the observations per channel when reading a workspace and extracting data from it. Making that part of the schema validation is probably difficult, but I think pyhf.Workspace.data could benefit from a shape check. Otherwise users run into an error later, when the data is used, and that error might not be easy to understand.
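As a rough illustration of the kind of check meant here, the sketch below compares the number of observed bins per channel with the number of bins in that channel. The helper name check_observation_shapes is hypothetical and not part of pyhf; it works on the plain spec dictionary and assumes the first sample of each channel defines the expected number of bins.

```python
def check_observation_shapes(spec):
    """Raise ValueError if an observation's length differs from its channel's bin count.

    Hypothetical helper, not part of pyhf. Assumes the first sample of each
    channel defines the expected number of bins for that channel.
    """
    # Expected number of bins per channel name
    expected_nbins = {
        channel["name"]: len(channel["samples"][0]["data"])
        for channel in spec["channels"]
    }
    for observation in spec["observations"]:
        name = observation["name"]
        if name not in expected_nbins:
            # Observations for unknown channels are out of scope for this sketch
            continue
        n_observed = len(observation["data"])
        if n_observed != expected_nbins[name]:
            raise ValueError(
                f"observation for channel '{name}' has {n_observed} bins, "
                f"but the channel has {expected_nbins[name]}"
            )


# Usage with a stripped-down spec that has the same mismatch as in this issue
bad_spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [{"name": "Signal", "data": [15.0], "modifiers": []}],
        }
    ],
    "observations": [{"name": "SR", "data": [15.0, 20.0]}],
}

try:
    check_observation_shapes(bad_spec)
except ValueError as err:
    print(err)
```

A check along these lines inside pyhf.Workspace.data could presumably use the bin counts already known to the model instead of re-deriving them from the spec, so that the error names the offending channel at the point where the data is extracted.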
Additional Information
This example shows how a mis-specified data field in the observations passes through unnoticed and subsequently causes the exception
pyhf.exceptions.InvalidPdfData: eval failed as data has len 2 but 1 was expected
which is probably not easy for non-experts to understand. I ran into this setup while manually editing a workspace for debugging purposes.
import pyhf

spec = {
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [15.0],
                    "modifiers": [
                        {"data": None, "name": "mu", "type": "normfactor"},
                    ],
                    "name": "Signal",
                }
            ],
        }
    ],
    "measurements": [
        {"config": {"parameters": [], "poi": "mu"}, "name": "minimal_example"}
    ],
    "observations": [{"data": [15.0, 20.0], "name": "SR"}],
    "version": "1.0.0",
}

ws = pyhf.Workspace(spec)
model = ws.model()
data = ws.data(model)  # perhaps an error should be raised at this point?
print(data)  # [15.0, 20.0]
pyhf.infer.mle.fit(data, model)  # this fails as the data has the wrong shape
Code of Conduct
- [X] I agree to follow the Code of Conduct