pisa PID stage should output maps that have PID categories as binning dims

Currently, PID stage outputs multiple maps, one for each PID category. This is probably a bug since DistributionMaker.get_outputs(return_sum=True) I expect will simply collapse all the maps into one, losing any PID information. (need to test this, though...)

I'd much prefer that the behavior of PID being a binning dimension is consistent across the codebase over modifying DistributionMaker.get_outputs(return_sum=True) (and any other current and future code that has to handle outputs from either a PID stage or a reco w/ binned-PID stage).

See e.g. how make_asymmetry_plots.py currently implements handling the case where there's a PID stage vs. reco stage that handles PID. (Note that this issue is therefore pertinent to PR #247 & Issue #158 but has implications well beyond that, too.)

Implementing this idea is tricky since translating from what can be very arbitrary PID criteria to a numerical value for binning dimension is fraught. Options for handling this are:

Actually try to translate from the PID criteria to numerical values. If this fails, either just return the maps separately (which doesn't help with compatibility) OR go to idea (2) below
Use arbitrary (but not nonsensical) numerical values to indicate bin #, which have no (necessary) correlation with the actual PID values. E.g.: [0,1) -> PID bin 0 (whatever that means); give this bin a bin_name, though, according to the first PID criteria label. [1,2) -> PID bin 1, ... , etc.

Idea (1) above will be difficult to implement since PID criteria for the PID stage can be pretty much completely arbitrary.

Idea (2) above will result in numerical binning that is incompatible with pipelines run with PID performed as a binning dimension in the reco stage. However, this forces compatibility with the assumptions made elsewhere in PISA and one pipeline's output using this type of PID stage will be compatible with another pipeline's logic using the same type of PID stage.

If there is further need to make an output of such a PID stage compatible with the output of a reco stage that also handles PID within it, that mapping can be done with more intelligent functions. Also, since both should have the same bin_names, this logic shouldn't be terribly difficult to implement.

After discussing this with @philippeller, he and I think idea (2) above is the best way to go. Please provide feedback, though, if idea (1) seems preferable or if there are modifications to the above or further ideas for handling this.

Nov 30 '16 19:11 jllanfranchi

@philippeller committed a fix directly to the cake branch, he tells me. So closing this.

Mar 28 '17 20:03 jllanfranchi

I could be wrong, but I think this was only done in the param service. I believe this to be the case since I only had to do a bugfix in the PISA 2 consistency script when testing the param service. Could you confirm @philippeller? If so, the same modification should probably be applied to the hist service for consistency.

Mar 28 '17 21:03 steven-j-wren

pisa pisa copied to clipboard

PID stage should output maps that have PID categories as binning dims

pisa
pisa copied to clipboard