cmssw update PackedCandidate::dzError to return what it should in track parameterization

in response to #45598

prior to this PR PackedCandidate saved had a mis-named covariance element dz instead of dsz

Details of all track parameters from TrackBase.h: https://github.com/cms-sw/cmssw/blob/fd00eb29f71ae9d94759a5bb2305cd63ff3f415a/DataFormats/TrackReco/interface/TrackBase.h#L19-L25

This PR updates the naming and corrects the returned value in PackedCandidate::dzError. For convenience, dszError is also kept available.

I checked dependencies using the short matrix tests and initially throwing in the PackedCandidate::dzError. This revealed dependencies in PATIsolatedTrackProducer, PackedCandidateTrackValidator, and ML b and tau taggers.

The tracking-related parts affected by the change:

PATIsolatedTrackProducer::produce and IsolatedTrack will now have a value saved in dzError matching the meaning.
- using the same short matrix check with a throw in IsolatedTrack::dzError I did not detect anything using this method in the matrix tests.
DQM/TrackingMonitor/src/PackedCandidateTrackValidator.cc : diffDszError which compares Track::dszError should still match the candidate by now using directly dszError and diffDzError should match as well

This PR keeps ML b and tau taggers behavior unchanged, to keep values consistent with the training. @cms-sw/btv-pog-l2 @cms-sw/tau-pog-l2

Jul 31 '24 23:07 slava77

looks like bot is asleep/broken

Aug 01 '24 00:08 slava77

looks like bot is asleep/broken

jenkins console is showing smth like:

Aug 01 '24 00:08 slava77

cms-bot internal usage

Aug 01 '24 06:08 cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45612/41112

There are other open Pull requests which might conflict with changes you have proposed:
- File DataFormats/PatCandidates/src/classes_def_objects.xml modified in PR(s): #38999

Aug 01 '24 06:08 cmsbuild

A new Pull Request was created by @slava77 for master.

It involves the following packages:

DQM/TrackingMonitor (dqm)
DataFormats/PatCandidates (reconstruction, xpog)
RecoBTag/FeatureTools (reconstruction)
RecoTauTag/RecoTau (reconstruction)

@antoniovagnerini, @cmsbuild, @ftorrresd, @hqucms, @jfernan2, @mandrenguyen, @nothingface0, @rvenditti, @syuvivida, @tjavaid, @vlimant can you please review it and eventually sign? Thanks. @AlexDeMoor, @JanFSchulte, @Ming-Yan, @Senphy, @VinInn, @VourMa, @andrzejnovak, @arossi83, @azotz, @castaned, @demuller, @fioriNTU, @gouskos, @gpetruc, @hatakeyamak, @hqucms, @idebruyn, @jandrea, @mbluj, @missirol, @mmusich, @mtosi, @rovere, @sroychow, @threus this is something you requested to watch as well. @antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

Aug 01 '24 06:08 cmsbuild

@cmsbuild please test

Aug 01 '24 08:08 swagata87

+1

Size: This PR adds an extra 92KB to repository Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-42d9d6/40709/summary.html COMMIT: 588caa3669279dd0d17297992bd36e01faa7201c CMSSW: CMSSW_14_1_X_2024-07-31-2300/el8_amd64_gcc12 User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45612/40709/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially removed 1 lines from the logs
Reco comparison results: 169 differences found in the comparisons
DQMHistoTests: Total files compared: 46
DQMHistoTests: Total histograms compared: 3530500
DQMHistoTests: Total failures: 370
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3530110
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
Checked 200 log files, 170 edm output root files, 46 DQM output files
TriggerResults: no differences found

Aug 01 '24 12:08 cmsbuild

type bug-fix

Aug 01 '24 12:08 mmusich

Does it make sense to add to PackedCandidate a method returning correctly dzError (as present in TrackBase)?

Aug 01 '24 12:08 mbluj

Does it make sense to add to PackedCandidate a method returning correctly dzError (as present in TrackBase)?

isn't this what this is doing?

https://github.com/cms-sw/cmssw/pull/45612/files#diff-318e0c5f8a3329cecffc611f15d40f52e807c2888570a0ac7e8e13fe6504e102R782

Aug 01 '24 12:08 mmusich

Does it make sense to add to PackedCandidate a method returning correctly dzError (as present in TrackBase)?

isn't this what this is doing?

https://github.com/cms-sw/cmssw/pull/45612/files#diff-318e0c5f8a3329cecffc611f15d40f52e807c2888570a0ac7e8e13fe6504e102R782

Yes, indeed. I must overlooked it.

Aug 01 '24 12:08 mbluj

DQMHistoTests: Total failures: 370

Looking at 141.046_RunEGamma2023D , the only DQM difference is in the dzError comparisons between the original track and the packed cand:

this looks somewhat as expected now with a significant matching fraction compared to only accidental matches previously (IIUC, the "entries" in the stat box doesn't count the actual under/overflows displayed in the edge bins).

The agreement is not perfect due to the packing precision, although I can't exclude other issues. I picked a few cases in the new reco+mini files manually with higher eta and the agreement is fractionally good. Notably, the histogram filling (PackedCandidate::bestTrack() - reco::Track)/reco::Track in dzError() is made in range of [-0.05, 0.05] while here using the PackedCandidate directly it's ±0.001. So, outliers are expected .. or a broader range is needed (I hesitate to update it in this PR).

Aug 01 '24 13:08 slava77

+reconstruction Comparisons as expected: https://github.com/cms-sw/cmssw/pull/45612#issuecomment-2263103544

Aug 01 '24 14:08 mandrenguyen

@mandrenguyen @cms-sw/reconstruction-l2 @cms-sw/xpog-l2 and all, how far back do you want to backport? I can think of 10_6, 12_X (12_6 ? or more), 13_X (13_3 ? or more), 14_0

Aug 01 '24 15:08 slava77

@mandrenguyen @cms-sw/reconstruction-l2 @cms-sw/xpog-l2 and all, how far back do you want to backport? I can think of 10_6, 12_X (12_6 ? or more), 13_X (13_3 ? or more), 14_0

To give a non-answer I'd say as far back as whatever release we'd like to fix this for a future miniAOD campaign, and then every release in between, otherwise it will be confusing. Perhaps XPOG would have a more definite answer.

Aug 01 '24 15:08 mandrenguyen

for a backport I was thinking

in the PackedCandidate: only to add the dszError method and fix the dzError method; no update in the name of the PackedCovariance members. (to avoid the possible mess with class versions)
include everything else
- ML b and tau unchanged
- DQM and PATIsolatedTrackProducer updated

Note that the change in PackedCandidate::dzError will take effect immediately and for all available data, since it's the class method change, not a persisted value change.

Is this OK? I will make backports after this is merged.

Aug 01 '24 16:08 slava77

Given that we only plan to do the next re-Mini with CMSSW_15_X, probably there is no need to backport this? I am slightly worried that a backport would create more confusion than the benefits of correctness...

Aug 02 '24 14:08 hqucms

enable nano

Aug 02 '24 14:08 hqucms

please test

Aug 02 '24 14:08 hqucms

Given that we only plan to do the next re-Mini with CMSSW_15_X, probably there is no need to backport this? I am slightly worried that a backport would create more confusion than the benefits of correctness...

I'm not sure I follow why a re-Mini is the driving factor.

The main part of the PR is fixing the method that operates on already persisted data. I do additionally have a change in the data member name, but that's in a sense for clarity.

If the change in dzError method behavior is on the other hand the main concern, then this PR can not proceed regardless of re-Mini plans.

Aug 02 '24 14:08 slava77

If the change in dzError method behavior is on the other hand the main concern, then this PR can not proceed regardless of re-Mini plans.

Indeed that's the main concern. To me it is probably OK for a major release (15_X) which we will use for the next re-MINI campaign (where people would expect changes), but I am not sure if people would expect any changes like this between minor versions of existing releases.

I guess a safer way is to remove dzError completely and introduce a new name for the method with the correct behavior? But I don't know what to call it 😂

Aug 02 '24 15:08 hqucms

I guess a safer way is to remove dzError completely and introduce a new name for the method with the correct behavior? But I don't know what to call it 😂

The only practical way to be exposed is to throw an exception; simply removing will lead to the base class method Candidate::dzError { return 0; }

It sounds like your argument is that at some point a wrong implementation becomes a more acceptable truth and we are in this case.

Aug 02 '24 15:08 slava77

-1

Failed Tests: AddOn Size: This PR adds an extra 12KB to repository Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-42d9d6/40735/summary.html COMMIT: 588caa3669279dd0d17297992bd36e01faa7201c CMSSW: CMSSW_14_1_X_2024-08-02-1100/el8_amd64_gcc12 Additional Tests: NANO User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45612/40735/install.sh to create a dev area with all the needed externals and cmssw changes.

AddOn Tests

unknown

AddOnTest might have timed out: FAILED -  secs

Comparison Summary

Summary:

You potentially removed 3 lines from the logs
Reco comparison results: 142 differences found in the comparisons
DQMHistoTests: Total files compared: 46
DQMHistoTests: Total histograms compared: 3530500
DQMHistoTests: Total failures: 1928
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3528552
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
Checked 200 log files, 170 edm output root files, 46 DQM output files
TriggerResults: no differences found

NANO Comparison Summary

Summary:

You potentially removed 897 lines from the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 0 differences found in the comparisons
DQMHistoTests: Total files compared: 21
DQMHistoTests: Total histograms compared: 54913
DQMHistoTests: Total failures: 16
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 54897
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 20 files compared)
Checked 102 log files, 58 edm output root files, 21 DQM output files
TriggerResults: no differences found

Nano size comparison Summary:

Sample	kb/ev	ref kb/ev	diff kb/ev	ev/s/thd	ref ev/s/thd	diff rate	mem/thd	ref mem/thd
2500.001	2.785	2.785	0.000 ( +0.0% )	3.34	3.16	+5.6%	6.070	6.032
2500.002	2.898	2.898	0.000 ( +0.0% )	2.97	2.84	+4.6%	6.416	6.403
2500.003	2.844	2.844	0.000 ( +0.0% )	3.07	2.94	+4.2%	6.386	6.376
2500.011	1.447	1.447	0.000 ( +0.0% )	5.77	5.33	+8.2%	2.414	2.412
2500.012	1.907	1.907	0.000 ( +0.0% )	3.16	3.00	+5.6%	2.605	2.210
2500.013	1.762	1.762	0.000 ( +0.0% )	4.53	4.28	+5.8%	2.510	2.392
2500.021	0.022	0.022	0.000 ( +0.0% )	0.96	0.88	+9.5%	2.234	2.205
2500.022	0.022	0.022	0.000 ( +0.0% )	0.94	0.86	+9.3%	2.251	2.206
2500.023	0.022	0.022	0.000 ( +0.0% )	0.91	0.84	+8.1%	2.187	2.139
2500.024	0.022	0.022	0.000 ( +0.0% )	0.71	0.64	+10.5%	2.343	2.310
2500.031	0.035	0.035	0.000 ( +0.0% )	0.87	0.78	+11.2%	2.390	2.283
2500.032	0.036	0.036	0.000 ( +0.0% )	0.87	0.81	+7.1%	2.351	2.233
2500.033	0.037	0.037	0.000 ( +0.0% )	0.81	0.73	+11.5%	2.441	2.327
2500.034	0.036	0.036	0.000 ( +0.0% )	0.81	0.74	+9.6%	2.414	2.307
2500.101	2.643	2.643	0.000 ( +0.0% )	9.03	8.16	+10.6%	6.903	6.292
2500.111	1.327	1.327	0.000 ( +0.0% )	20.55	18.61	+10.4%	2.227	2.217
2500.112	1.732	1.732	0.000 ( +0.0% )	15.24	13.58	+12.2%	2.137	2.077
2500.131	5.194	5.194	0.000 ( +0.0% )	15.69	14.20	+10.5%	1.555	1.315
2500.201	2.475	2.475	0.000 ( +0.0% )	7.58	6.79	+11.6%	5.542	6.094
2500.211	1.589	1.589	0.000 ( +0.0% )	17.27	15.69	+10.1%	2.129	2.067
2500.212	2.030	2.030	0.000 ( +0.0% )	13.71	12.31	+11.4%	2.178	2.109
2500.221	2.006	2.006	0.000 ( +0.0% )	7.76	6.92	+12.1%	2.303	2.232
2500.222	3.072	3.072	0.000 ( +0.0% )	7.55	6.79	+11.3%	2.345	2.293
2500.223	8.886	8.885	0.002 ( +0.0% )	2.80	2.53	+10.7%	2.357	2.309
2500.224	5.512	5.512	0.000 ( +0.0% )	1.11	0.98	+13.4%	2.531	2.388
2500.225	5.530	5.530	0.000 ( +0.0% )	1.02	0.93	+9.3%	2.320	2.371
2500.226	2.970	2.970	0.000 ( +0.0% )	7.43	6.73	+10.5%	2.329	2.285
2500.227	8.972	8.972	0.000 ( +0.0% )	10.04	9.22	+8.9%	1.490	1.369
2500.231	1.407	1.407	0.000 ( +0.0% )	13.48	12.47	+8.1%	1.950	1.944
2500.232	2.145	2.145	0.000 ( +0.0% )	13.66	12.19	+12.1%	2.017	2.036
2500.233	4.669	4.668	0.001 ( +0.0% )	4.67	4.38	+6.7%	2.055	2.021
2500.234	3.304	3.304	0.000 ( +0.0% )	1.48	1.37	+7.9%	1.805	2.046
2500.235	3.315	3.315	0.000 ( +0.0% )	1.40	1.26	+10.7%	1.831	2.077
2500.236	2.082	2.082	0.000 ( +0.0% )	13.76	12.30	+11.9%	2.033	2.038
2500.237	7.977	7.977	0.000 ( +0.0% )	14.45	13.41	+7.8%	1.455	1.391
2500.241	9.405	9.405	0.000 ( +0.0% )	3.64	3.34	+8.8%	1.814	1.766
2500.242	10.331	10.331	0.000 ( +0.0% )	0.89	0.80	+11.0%	1.662	1.667
2500.243	2.712	2.712	0.000 ( +0.0% )	8.46	6.84	+23.8%	1.070	1.068
2500.244	485.976	485.976	0.000 ( +0.0% )	0.56	0.51	+9.9%	1.685	1.617
2500.245	823.224	823.224	0.000 ( +0.0% )	0.74	0.67	+10.2%	1.620	1.546
2500.901	1.749	1.749	0.000 ( +0.0% )	21.20	19.44	+9.0%	1.784	1.822
2500.902	1.598	1.598	0.000 ( +0.0% )	21.35	18.06	+18.3%	1.757	1.733
2500.911	13.931	13.931	0.000 ( +0.0% )	3.13	2.58	+21.3%	1.081	1.078
2500.912	0.171	0.199	-0.029 ( -14.5% )	1.56	0.96	+62.4%	0.972	0.968
2500.913	0.110	0.110	0.000 ( +0.0% )	1.11	1.14	-2.6%	0.975	0.970

Aug 02 '24 17:08 cmsbuild

@hqucms @cms-sw/xpog-l2 So, what is your preference? From the tracking point of view it may be simpler to remove the direct dz and dxy methods and redirect instead to the bestTrack

Aug 20 '24 15:08 slava77

@slava77 How about we throw an exception in the original PackedCandidate::dzError() method, and give the one with the correct behavior a slightly different name or function signature (e.g. float dzError(bool))?

Aug 21 '24 15:08 hqucms

@slava77 How about we throw an exception in the original PackedCandidate::dzError() method, and give the one with the correct behavior a slightly different name or function signature (e.g. float dzError(bool))?

should this new method be added to the reco::Candidate class?

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

Aug 21 '24 17:08 slava77

@slava77 How about we throw an exception in the original PackedCandidate::dzError() method, and give the one with the correct behavior a slightly different name or function signature (e.g. float dzError(bool))?

should this new method be added to the reco::Candidate class?

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

I think it only needs to be added to PackedCandidate?

For me deltaZerror() looks good too.

Aug 22 '24 15:08 hqucms

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

I think it only needs to be added to PackedCandidate?

For me deltaZerror() looks good too.

Uhm, I suspect that the design plans were to keep the code that works for objects deriving from RecoCandidate to work in the same way for PackedCandidate; hence the addition of dzError (and dxyError etc) in the common inheritance point in reco::Candidate. This PR now still follows that.

The proposal now discussed is instead to break this, right? The cost to keep it working would be an explicit dynamic cast at least around the dzError() calls in a user code; which I guess is acceptable considering the rest of CMSSW code base that already typically does that casting, e.g. PF vs packed.

Aug 22 '24 16:08 slava77

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

I think it only needs to be added to PackedCandidate? For me deltaZerror() looks good too.

Uhm, I suspect that the design plans were to keep the code that works for objects deriving from RecoCandidate to work in the same way for PackedCandidate; hence the addition of dzError (and dxyError etc) in the common inheritance point in reco::Candidate. This PR now still follows that.

The proposal now discussed is instead to break this, right? The cost to keep it working would be an explicit dynamic cast at least around the dzError() calls in a user code; which I guess is acceptable considering the rest of CMSSW code base that already typically does that casting, e.g. PF vs packed.

Unfortunately yes... But given that the calls of dzError() identified in this PR all explicitly cast to PackedCandidate, I think it is probably fine.

Aug 22 '24 17:08 hqucms

@slava77 @cms-sw/xpog-l2 @cms-sw/orp-l2 Please remind us of the status of this PR in the meeting tomorrow.

Aug 26 '24 22:08 antoniovilela