pyhf icon indicating copy to clipboard operation
pyhf copied to clipboard

Name validity checks in pyhf json2xml for ROOT compatibility

Open alexander-held opened this issue 4 years ago • 3 comments

Description

pyhf supports names for channels, samples, and modifiers which would cause segmentation faults when used in ROOT. When building a workspace in pyhf, then running pyhf json2xml and finally hist2workspace, the segfault will happen in hist2workspace. Arguably a clear error message in hist2workspace would help, but alternatively an extra layer of protections in pyhf json2xml could be nice to catch problematic workspaces early.

Types of problematic names I have encountered:

  • whitespace in channel name
  • whitespace in sample names
  • whitespace in modifier name

Is your feature request related to a problem? Please describe.

Can simplify debugging workspaces built for pyhf but crashing in ROOT.

Describe the solution you'd like

Name checks in pyhf json2xml to ensure ROOT compatibility.

Describe alternatives you've considered

Better error messages in ROOT, not just hist2workspace is affected but also MakeModelAndMeasurementFast.

Relevant Issues and Pull Requests

none

Additional context

The crashes in ROOT were tested with 6.20.06, I expect the same behavior for earlier versions but did not verify that.

alexander-held avatar Jan 20 '21 10:01 alexander-held

So essentially /[a-zA-Z-_][a-zA-Z0-9-_]+/ is what is allowed? I wonder how this should be handled. This would only happen in the case if someone makes a workspace by hand so perhaps we should just add a check in json2xml that just halts the execution? Alternatively, just raise tons of warnings instead and hope the user notices.

kratsg avatar Jan 20 '21 12:01 kratsg

I think the strings for channel/sample may possibly be allowed to begin with a number too, but not sure there. For modifiers I believe numbers cause issues, but I don't have a reference. Adding something to json2xml seems useful, since that should catch the issue when it matters (users in pure pyhf do not need to worry using whitespaces when e.g. renaming modifiers, so catching it earlier seems to be counterproductive).

alexander-held avatar Jan 20 '21 12:01 alexander-held

related: https://github.com/scikit-hep/pyhf/issues/179#issuecomment-1073751988

alexander-held avatar Mar 21 '22 11:03 alexander-held