Remove unused L1TCaloSummary DQM sequences causing issues with HCAL.
PR description:
This PR removes two unused sequences in the CICADA DQM that were potentially causing duplicate TPs seen in HCAL tests. No DQM changes are expected as a result of this change.
Also changes Calo Summary DQM configuration to use GT digis instead of test crate digis, which is more correct.
@missirol FYI. @JHiltbrand also FYI, but could I ask you to rerun the test you were looking at with this PR and check whether it resolves the issue, please?
PR validation:
The changed HCAL workflow was rerun to make sure that the sequence does not crash.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
This PR is not a backport, but may need backporting to online/offline DQM releases.
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46482/42337
A new Pull Request was created by @aloeliger for master.
It involves the following packages:
- DQM/L1TMonitor (dqm)
@antoniovagnerini, @cmsbuild, @nothingface0, @rvenditti, @syuvivida, @tjavaid can you please review it and eventually sign? Thanks. @missirol, @mmusich this is something you requested to watch as well. @antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here
Hi @aloeliger ,
Thanks for making the PR, I will test this in the local workflow I have and report back
Hi @aloeliger ,
I pulled your PR into my working area for 14_1_0, and reran my step3. I now find the number of processed HCAL TPs in the HcalDigis validation is exactly the same between 14_1_0+PR and 14_1_0_pre7 🎉
Please test
+1
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3bcdd1/42346/summary.html
COMMIT: d0756ec5b1990eb8187b7fdb83c290420409bf66
CMSSW: CMSSW_14_2_X_2024-10-22-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/46482/42346/install.sh to create a dev area with all the needed externals and cmssw changes.
Comparison Summary
Summary:
- You potentially added 1 lines to the logs
- Reco comparison results: 8 differences found in the comparisons
- DQMHistoTests: Total files compared: 46
- DQMHistoTests: Total histograms compared: 3566331
- DQMHistoTests: Total failures: 808
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 3565503
- DQMHistoTests: Total skipped: 20
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
- Checked 201 log files, 171 edm output root files, 46 DQM output files
- TriggerResults: no differences found
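As a quick sanity check on the summary above, the DQMHistoTests counts can be parsed and added up. This is only a sketch: the `parse_summary` helper is hypothetical (not part of cms-bot), and treating failures, nulls, successes, skipped, and missing objects as mutually exclusive categories is an assumption, although the numbers quoted here are consistent with it.

```python
import re

# Summary lines copied from the cms-bot comparison above.
SUMMARY = """\
DQMHistoTests: Total files compared: 46
DQMHistoTests: Total histograms compared: 3566331
DQMHistoTests: Total failures: 808
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3565503
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
"""


def parse_summary(text):
    """Return {label: count} for each 'DQMHistoTests: Total <label>: <n>' line.

    Hypothetical helper for illustration; labels are lower-cased.
    """
    pattern = re.compile(r"DQMHistoTests: Total (.+?): (\d+)")
    return {m.group(1).lower(): int(m.group(2)) for m in pattern.finditer(text)}


counts = parse_summary(SUMMARY)

# Assumption: every compared histogram lands in exactly one of these categories.
accounted = (
    counts["failures"]
    + counts["nulls"]
    + counts["successes"]
    + counts["skipped"]
    + counts["missing objects"]
)
failure_fraction = counts["failures"] / counts["histograms compared"]

print("categories add up:", accounted == counts["histograms compared"])
print(f"failure fraction: {failure_fraction:.4%}")
```

The failure fraction only indicates how large the set of differing histograms is relative to the total; it says nothing about which subsystems they belong to.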
Thanks. N_TPs is back to its regular value.
Apparently needs to be backported at least to 14_0_X for coming 2024 MC 🤔
urgent
indeed, it would be great to have a backport to 14_0 and 14_1 asap, even though it is clear at this point that the issue is not with the HCAL TPs but with the validation procedure.
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46482/42342
Pull request #46482 was updated. @antoniovagnerini, @cmsbuild, @nothingface0, @rvenditti, @syuvivida, @tjavaid can you please check and sign again.
please test
DQMHistoTests: Total failures: 808
We observe some differences in both HCALDigisV and ECALDigisV, e.g. in workflow 10224.0. I suppose this is coming from the Calo Summary DQM configuration change from test crate digis to GT digis. Could you please confirm, @aloeliger?
@antoniovagnerini A large number of the failures (>300) seem to be coming from HGCAL in workflows 24834.911, 29634.0, 29634.911. The runners-up are HCAL and ECAL Digi comparison failures. I would expect the issue here is that this DQM sequence has been present long enough to have affected the ECAL/HCAL DQM comparison plots, so the fix implemented in this PR is being compared against broken plots.
Thanks for the feedback. The HGCAL DQM failures can be ascribed to a separate known issue (#46416); I was referring specifically to the HCAL and ECAL Digi comparison.
please abort
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46482/42344
Pull request #46482 was updated. @antoniovagnerini, @cmsbuild, @nothingface0, @rvenditti, @syuvivida, @tjavaid can you please check and sign again.
please test
-1
Failed Tests: RelVals-INPUT
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3bcdd1/42355/summary.html
COMMIT: a4df47ee268b7579d21fbbd4a5310952c4c44fa4
CMSSW: CMSSW_14_2_X_2024-10-23-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/46482/42355/install.sh to create a dev area with all the needed externals and cmssw changes.
RelVals-INPUT
- 134.813
134.813_RunCosmics2015C/step2_RunCosmics2015C.log
Comparison Summary
Summary:
- You potentially added 1 lines to the logs
- Reco comparison results: 7 differences found in the comparisons
- DQMHistoTests: Total files compared: 46
- DQMHistoTests: Total histograms compared: 3566331
- DQMHistoTests: Total failures: 816
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 3565495
- DQMHistoTests: Total skipped: 20
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
- Checked 201 log files, 171 edm output root files, 46 DQM output files
- TriggerResults: no differences found
Hmm, this failure is some strange backup behavior of the CICADA emulator that is a little hardcoded. This can be configured around, but the CICADA emulator person (me) should really fix this for the long term.
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46482/42353
please test
Pull request #46482 was updated. @aloeliger, @antoniovagnerini, @epalencia, @nothingface0, @rvenditti, @syuvivida, @tjavaid can you please check and sign again.
+1
Size: This PR adds an extra 28KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3bcdd1/42365/summary.html
COMMIT: 857439796738c7a47fb53f1e181c006c789bd8ae
CMSSW: CMSSW_14_2_X_2024-10-23-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/46482/42365/install.sh to create a dev area with all the needed externals and cmssw changes.
Comparison Summary
Summary:
- You potentially added 1 lines to the logs
- Reco comparison results: 11 differences found in the comparisons
- DQMHistoTests: Total files compared: 46
- DQMHistoTests: Total histograms compared: 3566343
- DQMHistoTests: Total failures: 810
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 3565513
- DQMHistoTests: Total skipped: 20
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
- Checked 201 log files, 171 edm output root files, 46 DQM output files
- TriggerResults: no differences found
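For reference, the three comparison summaries in this thread can be put side by side. The sketch below only restates the totals quoted above; the round labels are the test numbers taken from the summary URLs, and the `deltas` helper is hypothetical.

```python
# (test number, total histograms compared, total failures),
# copied from the three comparison summaries in this thread.
ROUNDS = [
    ("42346", 3566331, 808),
    ("42355", 3566331, 816),
    ("42365", 3566343, 810),
]


def deltas(rounds):
    """Change in histogram count and failure count relative to the first round."""
    _, base_total, base_fail = rounds[0]
    return {
        rid: (total - base_total, fail - base_fail)
        for rid, total, fail in rounds
    }


changes = deltas(ROUNDS)
for rid, (d_total, d_fail) in changes.items():
    print(f"test {rid}: histograms {d_total:+d}, failures {d_fail:+d}")
```

These totals alone do not say which histograms differ between rounds, and each round ran against a different IB, so the per-workflow comparison pages remain the place to check.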
Some differences are observed in the bin-by-bin comparison in WF 12846.0 in HLT and in WF 13034.0 in the HLT/L1T emulator, which were not present in the first round of tests with the PR. For instance, in the HCALDigiTask for WF 12846.0 (https://tinyurl.com/24sy4xno) and in the L1T emulator for WF 13034.0 (Comparison GUI). In my understanding, this PR shouldn't be modifying the results of the L1T emulation. Please confirm @missirol
In my understanding this PR shouldn't be modifying the results of the L1T emulation.
In my understanding, it should modify them, restoring the correct ones (meaning, baseline results are bugged).
Please confirm @missirol
I'm not the relevant expert. I think the question should be directed to @cms-sw/l1-l2.
Note that there is a comment to address in https://github.com/cms-sw/cmssw/pull/46482#discussion_r1812634806.
In my understanding this PR shouldn't be modifying the results of the L1T emulation.
In my understanding, it should modify them, restoring the correct ones (meaning, baseline results are bugged).
Please confirm @missirol
I'm not the relevant expert. I think the question should be directed to @cms-sw/l1-l2.
This is my understanding as well