cmssw icon indicating copy to clipboard operation
cmssw copied to clipboard

update PackedCandidate::dzError to return what it should in track parameterization

Open slava77 opened this issue 1 year ago • 33 comments

in response to #45598

prior to this PR PackedCandidate saved had a mis-named covariance element dz instead of dsz

Details of all track parameters from TrackBase.h: https://github.com/cms-sw/cmssw/blob/fd00eb29f71ae9d94759a5bb2305cd63ff3f415a/DataFormats/TrackReco/interface/TrackBase.h#L19-L25

This PR updates the naming and corrects the returned value in PackedCandidate::dzError. For convenience, dszError is also kept available.

I checked dependencies using the short matrix tests and initially throwing in the PackedCandidate::dzError. This revealed dependencies in PATIsolatedTrackProducer, PackedCandidateTrackValidator, and ML b and tau taggers.

The tracking-related parts affected by the change:

  • PATIsolatedTrackProducer::produce and IsolatedTrack will now have a value saved in dzError matching the meaning.
    • using the same short matrix check with a throw in IsolatedTrack::dzError I did not detect anything using this method in the matrix tests.
  • DQM/TrackingMonitor/src/PackedCandidateTrackValidator.cc : diffDszError which compares Track::dszError should still match the candidate by now using directly dszError and diffDzError should match as well

This PR keeps ML b and tau taggers behavior unchanged, to keep values consistent with the training. @cms-sw/btv-pog-l2 @cms-sw/tau-pog-l2

slava77 avatar Jul 31 '24 23:07 slava77

looks like bot is asleep/broken

slava77 avatar Aug 01 '24 00:08 slava77

looks like bot is asleep/broken

jenkins console is showing smth like: image

slava77 avatar Aug 01 '24 00:08 slava77

cms-bot internal usage

cmsbuild avatar Aug 01 '24 06:08 cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45612/41112

  • There are other open Pull requests which might conflict with changes you have proposed:
    • File DataFormats/PatCandidates/src/classes_def_objects.xml modified in PR(s): #38999

cmsbuild avatar Aug 01 '24 06:08 cmsbuild

A new Pull Request was created by @slava77 for master.

It involves the following packages:

  • DQM/TrackingMonitor (dqm)
  • DataFormats/PatCandidates (reconstruction, xpog)
  • RecoBTag/FeatureTools (reconstruction)
  • RecoTauTag/RecoTau (reconstruction)

@antoniovagnerini, @cmsbuild, @ftorrresd, @hqucms, @jfernan2, @mandrenguyen, @nothingface0, @rvenditti, @syuvivida, @tjavaid, @vlimant can you please review it and eventually sign? Thanks. @AlexDeMoor, @JanFSchulte, @Ming-Yan, @Senphy, @VinInn, @VourMa, @andrzejnovak, @arossi83, @azotz, @castaned, @demuller, @fioriNTU, @gouskos, @gpetruc, @hatakeyamak, @hqucms, @idebruyn, @jandrea, @mbluj, @missirol, @mmusich, @mtosi, @rovere, @sroychow, @threus this is something you requested to watch as well. @antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

cmsbuild avatar Aug 01 '24 06:08 cmsbuild

@cmsbuild please test

swagata87 avatar Aug 01 '24 08:08 swagata87

+1

Size: This PR adds an extra 92KB to repository Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-42d9d6/40709/summary.html COMMIT: 588caa3669279dd0d17297992bd36e01faa7201c CMSSW: CMSSW_14_1_X_2024-07-31-2300/el8_amd64_gcc12 User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45612/40709/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

cmsbuild avatar Aug 01 '24 12:08 cmsbuild

type bug-fix

mmusich avatar Aug 01 '24 12:08 mmusich

Does it make sense to add to PackedCandidate a method returning correctly dzError (as present in TrackBase)?

mbluj avatar Aug 01 '24 12:08 mbluj

Does it make sense to add to PackedCandidate a method returning correctly dzError (as present in TrackBase)?

isn't this what this is doing?

https://github.com/cms-sw/cmssw/pull/45612/files#diff-318e0c5f8a3329cecffc611f15d40f52e807c2888570a0ac7e8e13fe6504e102R782

mmusich avatar Aug 01 '24 12:08 mmusich

Does it make sense to add to PackedCandidate a method returning correctly dzError (as present in TrackBase)?

isn't this what this is doing?

https://github.com/cms-sw/cmssw/pull/45612/files#diff-318e0c5f8a3329cecffc611f15d40f52e807c2888570a0ac7e8e13fe6504e102R782

Yes, indeed. I must overlooked it.

mbluj avatar Aug 01 '24 12:08 mbluj

DQMHistoTests: Total failures: 370

Looking at 141.046_RunEGamma2023D , the only DQM difference is in the dzError comparisons between the original track and the packed cand:

image

this looks somewhat as expected now with a significant matching fraction compared to only accidental matches previously (IIUC, the "entries" in the stat box doesn't count the actual under/overflows displayed in the edge bins).

The agreement is not perfect due to the packing precision, although I can't exclude other issues. I picked a few cases in the new reco+mini files manually with higher eta and the agreement is fractionally good. Notably, the histogram filling (PackedCandidate::bestTrack() - reco::Track)/reco::Track in dzError() is made in range of [-0.05, 0.05] while here using the PackedCandidate directly it's ±0.001. So, outliers are expected .. or a broader range is needed (I hesitate to update it in this PR).

slava77 avatar Aug 01 '24 13:08 slava77

+reconstruction Comparisons as expected: https://github.com/cms-sw/cmssw/pull/45612#issuecomment-2263103544

mandrenguyen avatar Aug 01 '24 14:08 mandrenguyen

@mandrenguyen @cms-sw/reconstruction-l2 @cms-sw/xpog-l2 and all, how far back do you want to backport? I can think of 10_6, 12_X (12_6 ? or more), 13_X (13_3 ? or more), 14_0

slava77 avatar Aug 01 '24 15:08 slava77

@mandrenguyen @cms-sw/reconstruction-l2 @cms-sw/xpog-l2 and all, how far back do you want to backport? I can think of 10_6, 12_X (12_6 ? or more), 13_X (13_3 ? or more), 14_0

To give a non-answer I'd say as far back as whatever release we'd like to fix this for a future miniAOD campaign, and then every release in between, otherwise it will be confusing. Perhaps XPOG would have a more definite answer.

mandrenguyen avatar Aug 01 '24 15:08 mandrenguyen

for a backport I was thinking

  • in the PackedCandidate: only to add the dszError method and fix the dzError method; no update in the name of the PackedCovariance members. (to avoid the possible mess with class versions)
  • include everything else
    • ML b and tau unchanged
    • DQM and PATIsolatedTrackProducer updated

Note that the change in PackedCandidate::dzError will take effect immediately and for all available data, since it's the class method change, not a persisted value change.

Is this OK? I will make backports after this is merged.

slava77 avatar Aug 01 '24 16:08 slava77

Given that we only plan to do the next re-Mini with CMSSW_15_X, probably there is no need to backport this? I am slightly worried that a backport would create more confusion than the benefits of correctness...

hqucms avatar Aug 02 '24 14:08 hqucms

enable nano

hqucms avatar Aug 02 '24 14:08 hqucms

please test

hqucms avatar Aug 02 '24 14:08 hqucms

Given that we only plan to do the next re-Mini with CMSSW_15_X, probably there is no need to backport this? I am slightly worried that a backport would create more confusion than the benefits of correctness...

I'm not sure I follow why a re-Mini is the driving factor.

The main part of the PR is fixing the method that operates on already persisted data. I do additionally have a change in the data member name, but that's in a sense for clarity.

If the change in dzError method behavior is on the other hand the main concern, then this PR can not proceed regardless of re-Mini plans.

slava77 avatar Aug 02 '24 14:08 slava77

If the change in dzError method behavior is on the other hand the main concern, then this PR can not proceed regardless of re-Mini plans.

Indeed that's the main concern. To me it is probably OK for a major release (15_X) which we will use for the next re-MINI campaign (where people would expect changes), but I am not sure if people would expect any changes like this between minor versions of existing releases.

I guess a safer way is to remove dzError completely and introduce a new name for the method with the correct behavior? But I don't know what to call it 😂

hqucms avatar Aug 02 '24 15:08 hqucms

I guess a safer way is to remove dzError completely and introduce a new name for the method with the correct behavior? But I don't know what to call it 😂

The only practical way to be exposed is to throw an exception; simply removing will lead to the base class method Candidate::dzError { return 0; }

It sounds like your argument is that at some point a wrong implementation becomes a more acceptable truth and we are in this case.

slava77 avatar Aug 02 '24 15:08 slava77

-1

Failed Tests: AddOn Size: This PR adds an extra 12KB to repository Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-42d9d6/40735/summary.html COMMIT: 588caa3669279dd0d17297992bd36e01faa7201c CMSSW: CMSSW_14_1_X_2024-08-02-1100/el8_amd64_gcc12 Additional Tests: NANO User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45612/40735/install.sh to create a dev area with all the needed externals and cmssw changes.

AddOn Tests

  • unknown
AddOnTest might have timed out: FAILED -  secs

Comparison Summary

Summary:

NANO Comparison Summary

Summary:

  • You potentially removed 897 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 21
  • DQMHistoTests: Total histograms compared: 54913
  • DQMHistoTests: Total failures: 16
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 54897
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 20 files compared)
  • Checked 102 log files, 58 edm output root files, 21 DQM output files
  • TriggerResults: no differences found

Nano size comparison Summary:

Sample kb/ev ref kb/ev diff kb/ev ev/s/thd ref ev/s/thd diff rate mem/thd ref mem/thd
2500.001 2.785 2.785 0.000 ( +0.0% ) 3.34 3.16 +5.6% 6.070 6.032
2500.002 2.898 2.898 0.000 ( +0.0% ) 2.97 2.84 +4.6% 6.416 6.403
2500.003 2.844 2.844 0.000 ( +0.0% ) 3.07 2.94 +4.2% 6.386 6.376
2500.011 1.447 1.447 0.000 ( +0.0% ) 5.77 5.33 +8.2% 2.414 2.412
2500.012 1.907 1.907 0.000 ( +0.0% ) 3.16 3.00 +5.6% 2.605 2.210
2500.013 1.762 1.762 0.000 ( +0.0% ) 4.53 4.28 +5.8% 2.510 2.392
2500.021 0.022 0.022 0.000 ( +0.0% ) 0.96 0.88 +9.5% 2.234 2.205
2500.022 0.022 0.022 0.000 ( +0.0% ) 0.94 0.86 +9.3% 2.251 2.206
2500.023 0.022 0.022 0.000 ( +0.0% ) 0.91 0.84 +8.1% 2.187 2.139
2500.024 0.022 0.022 0.000 ( +0.0% ) 0.71 0.64 +10.5% 2.343 2.310
2500.031 0.035 0.035 0.000 ( +0.0% ) 0.87 0.78 +11.2% 2.390 2.283
2500.032 0.036 0.036 0.000 ( +0.0% ) 0.87 0.81 +7.1% 2.351 2.233
2500.033 0.037 0.037 0.000 ( +0.0% ) 0.81 0.73 +11.5% 2.441 2.327
2500.034 0.036 0.036 0.000 ( +0.0% ) 0.81 0.74 +9.6% 2.414 2.307
2500.101 2.643 2.643 0.000 ( +0.0% ) 9.03 8.16 +10.6% 6.903 6.292
2500.111 1.327 1.327 0.000 ( +0.0% ) 20.55 18.61 +10.4% 2.227 2.217
2500.112 1.732 1.732 0.000 ( +0.0% ) 15.24 13.58 +12.2% 2.137 2.077
2500.131 5.194 5.194 0.000 ( +0.0% ) 15.69 14.20 +10.5% 1.555 1.315
2500.201 2.475 2.475 0.000 ( +0.0% ) 7.58 6.79 +11.6% 5.542 6.094
2500.211 1.589 1.589 0.000 ( +0.0% ) 17.27 15.69 +10.1% 2.129 2.067
2500.212 2.030 2.030 0.000 ( +0.0% ) 13.71 12.31 +11.4% 2.178 2.109
2500.221 2.006 2.006 0.000 ( +0.0% ) 7.76 6.92 +12.1% 2.303 2.232
2500.222 3.072 3.072 0.000 ( +0.0% ) 7.55 6.79 +11.3% 2.345 2.293
2500.223 8.886 8.885 0.002 ( +0.0% ) 2.80 2.53 +10.7% 2.357 2.309
2500.224 5.512 5.512 0.000 ( +0.0% ) 1.11 0.98 +13.4% 2.531 2.388
2500.225 5.530 5.530 0.000 ( +0.0% ) 1.02 0.93 +9.3% 2.320 2.371
2500.226 2.970 2.970 0.000 ( +0.0% ) 7.43 6.73 +10.5% 2.329 2.285
2500.227 8.972 8.972 0.000 ( +0.0% ) 10.04 9.22 +8.9% 1.490 1.369
2500.231 1.407 1.407 0.000 ( +0.0% ) 13.48 12.47 +8.1% 1.950 1.944
2500.232 2.145 2.145 0.000 ( +0.0% ) 13.66 12.19 +12.1% 2.017 2.036
2500.233 4.669 4.668 0.001 ( +0.0% ) 4.67 4.38 +6.7% 2.055 2.021
2500.234 3.304 3.304 0.000 ( +0.0% ) 1.48 1.37 +7.9% 1.805 2.046
2500.235 3.315 3.315 0.000 ( +0.0% ) 1.40 1.26 +10.7% 1.831 2.077
2500.236 2.082 2.082 0.000 ( +0.0% ) 13.76 12.30 +11.9% 2.033 2.038
2500.237 7.977 7.977 0.000 ( +0.0% ) 14.45 13.41 +7.8% 1.455 1.391
2500.241 9.405 9.405 0.000 ( +0.0% ) 3.64 3.34 +8.8% 1.814 1.766
2500.242 10.331 10.331 0.000 ( +0.0% ) 0.89 0.80 +11.0% 1.662 1.667
2500.243 2.712 2.712 0.000 ( +0.0% ) 8.46 6.84 +23.8% 1.070 1.068
2500.244 485.976 485.976 0.000 ( +0.0% ) 0.56 0.51 +9.9% 1.685 1.617
2500.245 823.224 823.224 0.000 ( +0.0% ) 0.74 0.67 +10.2% 1.620 1.546
2500.901 1.749 1.749 0.000 ( +0.0% ) 21.20 19.44 +9.0% 1.784 1.822
2500.902 1.598 1.598 0.000 ( +0.0% ) 21.35 18.06 +18.3% 1.757 1.733
2500.911 13.931 13.931 0.000 ( +0.0% ) 3.13 2.58 +21.3% 1.081 1.078
2500.912 0.171 0.199 -0.029 ( -14.5% ) 1.56 0.96 +62.4% 0.972 0.968
2500.913 0.110 0.110 0.000 ( +0.0% ) 1.11 1.14 -2.6% 0.975 0.970

cmsbuild avatar Aug 02 '24 17:08 cmsbuild

@hqucms @cms-sw/xpog-l2 So, what is your preference? From the tracking point of view it may be simpler to remove the direct dz and dxy methods and redirect instead to the bestTrack

slava77 avatar Aug 20 '24 15:08 slava77

@slava77 How about we throw an exception in the original PackedCandidate::dzError() method, and give the one with the correct behavior a slightly different name or function signature (e.g. float dzError(bool))?

hqucms avatar Aug 21 '24 15:08 hqucms

@slava77 How about we throw an exception in the original PackedCandidate::dzError() method, and give the one with the correct behavior a slightly different name or function signature (e.g. float dzError(bool))?

should this new method be added to the reco::Candidate class?

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

slava77 avatar Aug 21 '24 17:08 slava77

@slava77 How about we throw an exception in the original PackedCandidate::dzError() method, and give the one with the correct behavior a slightly different name or function signature (e.g. float dzError(bool))?

should this new method be added to the reco::Candidate class?

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

I think it only needs to be added to PackedCandidate?

For me deltaZerror() looks good too.

hqucms avatar Aug 22 '24 15:08 hqucms

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

I think it only needs to be added to PackedCandidate?

For me deltaZerror() looks good too.

Uhm, I suspect that the design plans were to keep the code that works for objects deriving from RecoCandidate to work in the same way for PackedCandidate; hence the addition of dzError (and dxyError etc) in the common inheritance point in reco::Candidate. This PR now still follows that.

The proposal now discussed is instead to break this, right? The cost to keep it working would be an explicit dynamic cast at least around the dzError() calls in a user code; which I guess is acceptable considering the rest of CMSSW code base that already typically does that casting, e.g. PF vs packed.

slava77 avatar Aug 22 '24 16:08 slava77

dzError(bool) may be confusing, but OK (I couldn't come up with something much better; my ideas were more in the direction of rephrasing, like deltaZerror() , more to at least be clearly different in spelling)

I think it only needs to be added to PackedCandidate? For me deltaZerror() looks good too.

Uhm, I suspect that the design plans were to keep the code that works for objects deriving from RecoCandidate to work in the same way for PackedCandidate; hence the addition of dzError (and dxyError etc) in the common inheritance point in reco::Candidate. This PR now still follows that.

The proposal now discussed is instead to break this, right? The cost to keep it working would be an explicit dynamic cast at least around the dzError() calls in a user code; which I guess is acceptable considering the rest of CMSSW code base that already typically does that casting, e.g. PF vs packed.

Unfortunately yes... But given that the calls of dzError() identified in this PR all explicitly cast to PackedCandidate, I think it is probably fine.

hqucms avatar Aug 22 '24 17:08 hqucms

@slava77 @cms-sw/xpog-l2 @cms-sw/orp-l2 Please remind us of the status of this PR in the meeting tomorrow.

antoniovilela avatar Aug 26 '24 22:08 antoniovilela