[12_5_X] Update SiStrip and SiPixel bad components for Run 3 MC GTs

Open francescobrivio opened this issue 3 years ago • 13 comments

PR description:

Backport of #39645. This PR updates the SiPixel (CMSTalk request) and SiStrip (CMSTalk request) bad components tags in the Run 3 realistic MC GTs.

See master PR for list of updated tags and GT differences.

PR validation:

Tested with: runTheMatrix.py -l 11634.0,7.23,159.0,12434.0,12834.0 --ibeos -j 16
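To script the same validation, the command above can be assembled from a list of workflow IDs. This is a minimal sketch: the helper name `build_matrix_command` is hypothetical, the flags are copied verbatim from the command quoted in this PR rather than from any documentation, and the code only builds the command string without running it.

```python
import shlex

# Workflow IDs copied from the validation command in this PR description.
WORKFLOWS = ["11634.0", "7.23", "159.0", "12434.0", "12834.0"]


def build_matrix_command(workflows, jobs=16):
    """Return the runTheMatrix.py invocation as an argument list.

    Hypothetical helper: the flags mirror the command quoted in the PR.
    """
    return [
        "runTheMatrix.py",
        "-l", ",".join(workflows),  # comma-separated workflow list
        "--ibeos",
        "-j", str(jobs),
    ]


command = build_matrix_command(WORKFLOWS)
print(shlex.join(command))
```

Keeping the workflows in a list makes it easy to reuse the same IDs for the cms-bot `workflows = ...` test parameter.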

Backport:

Backport of #39645

francescobrivio avatar Oct 06 '22 09:10 francescobrivio

A new Pull Request was created by @francescobrivio for CMSSW_12_5_X.

It involves the following packages:

  • Configuration/AlCa (alca)

@malbouis, @yuanchao, @cmsbuild, @saumyaphor4252, @francescobrivio, @ChrisMisan, @tvami can you please review it and eventually sign? Thanks. @Martin-Grunewald, @missirol, @mmusich, @fabiocos, @tocheng this is something you requested to watch as well. @perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

  • Backported from #39645

cmsbuild avatar Oct 06 '22 09:10 cmsbuild

type trk

francescobrivio avatar Oct 06 '22 09:10 francescobrivio

backport of #39645

francescobrivio avatar Oct 06 '22 09:10 francescobrivio

test parameters:

  • workflows = 11634.0,7.23,159.0,12434.0,12834.0

francescobrivio avatar Oct 06 '22 09:10 francescobrivio

@cmsbuild please test

francescobrivio avatar Oct 06 '22 09:10 francescobrivio

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4acbc1/28054/summary.html
COMMIT: 598a1b87555c9431b3ccd0176303533f50dfcee7
CMSSW: CMSSW_12_5_X_2022-10-04-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39646/28054/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test TestDQMOnlineClient-hcalreco_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-dt4ml_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-beampixel_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-beam_dqm_sourceclient had ERRORS
and more ...

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /pool/condor/dir_26258/jenkins/workspace/compare-root-files-short-matrix/data/PR-4acbc1/7.23_Cosmics_UP21+Cosmics_UP21+DIGICOS_UP21+RECOCOS_UP21+ALCACOS_UP21+HARVESTCOS_UP21

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 19902 differences found in the comparisons
  • DQMHistoTests: Total files compared: 54
  • DQMHistoTests: Total histograms compared: 3994404
  • DQMHistoTests: Total failures: 236477
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3757905
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -4.244 KiB( 53 files compared)
  • DQMHistoSizes: changed ( 11834.0 ): 0.997 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 7.23 ): -5.241 KiB SiStrip/MechanicalView
  • Checked 227 log files, 49 edm output root files, 54 DQM output files
  • TriggerResults: found differences in 7 / 53 workflows
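The DQMHistoTests totals above can be cross-checked with a little arithmetic: failures, nulls, successes and skipped should add up to the number of histograms compared. Below is a minimal sketch with the numbers copied from this summary. The variable names are illustrative, and the idea that these four categories partition the total is an assumption, not something stated in the bot output.

```python
# Numbers copied from the DQMHistoTests lines of the summary above.
totals = {
    "compared": 3994404,
    "failures": 236477,
    "nulls": 0,
    "successes": 3757905,
    "skipped": 22,
}

# Assumption: every compared histogram falls in exactly one category.
accounted = (
    totals["failures"]
    + totals["nulls"]
    + totals["successes"]
    + totals["skipped"]
)
unaccounted = totals["compared"] - accounted

# Fraction of compared histograms that were flagged as failures.
failure_fraction = totals["failures"] / totals["compared"]

print(f"unaccounted histograms: {unaccounted}")
print(f"failure fraction: {failure_fraction:.2%}")
```

A non-zero `unaccounted` value would suggest that the summary was garbled or that the categories overlap.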

cmsbuild avatar Oct 06 '22 17:10 cmsbuild

The error seems unrelated

----- Begin Fatal Exception 06-Oct-2022 11:34:25 CEST-----------------------
An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named ./src/DQM/Integration/python/clients/es_dqm_sourceclient-live_cfg.py
Exception Message:
 unknown python problem occurred.
IndexError: list index out of range

At:

tvami avatar Oct 06 '22 17:10 tvami

@cmsbuild please test

  • trying again

francescobrivio avatar Oct 07 '22 07:10 francescobrivio

@francescobrivio

trying again

as far as I can tell, there's no point in trying again, the tests are failing in the IB as well: https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc10/CMSSW_12_5_X_2022-10-06-2300/unitTestLogs/DQM/Integration#/

mmusich avatar Oct 07 '22 07:10 mmusich

@cmsbuild please abort

francescobrivio avatar Oct 07 '22 07:10 francescobrivio

Thanks Marco, I had missed the failure in the IBs! I'll wait for the tests to be working again then.

francescobrivio avatar Oct 07 '22 07:10 francescobrivio

I'll wait for the tests to be working again then.

the tests complain (presumably) about missing files:

Error in <TNetXNGFile::Open>: [ERROR] Server responded with an error: [3011] No servers are available to read the file.

but the very same tests (using the very same files) are running OK in 12.6.X:

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc10/CMSSW_12_6_X_2022-10-06-2300/unitTestLogs/DQM/Integration#/

mmusich avatar Oct 07 '22 07:10 mmusich

@francescobrivio @mmusich I've opened https://github.com/cms-sw/cmssw/issues/39669 to report the issue

perrotta avatar Oct 07 '22 08:10 perrotta

@cmsbuild please test

perrotta avatar Oct 19 '22 08:10 perrotta

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4acbc1/28353/summary.html
COMMIT: 598a1b87555c9431b3ccd0176303533f50dfcee7
CMSSW: CMSSW_12_5_X_2022-10-18-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39646/28353/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test TestDQMOnlineClient-hcalreco_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-fed_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-beam_dqm_sourceclient had ERRORS
---> test TestDQMOnlineClient-ecal_dqm_sourceclient had ERRORS
and more ...

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /pool/condor/dir_8307/jenkins/workspace/compare-root-files-short-matrix/data/PR-4acbc1/7.23_Cosmics_UP21+Cosmics_UP21+DIGICOS_UP21+RECOCOS_UP21+ALCACOS_UP21+HARVESTCOS_UP21

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 19902 differences found in the comparisons
  • DQMHistoTests: Total files compared: 54
  • DQMHistoTests: Total histograms compared: 3991793
  • DQMHistoTests: Total failures: 26710
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3965061
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -4.244 KiB( 53 files compared)
  • DQMHistoSizes: changed ( 7.23 ): -5.241 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 11834.0 ): 0.997 KiB SiStrip/MechanicalView
  • Checked 227 log files, 49 edm output root files, 54 DQM output files
  • TriggerResults: found differences in 7 / 53 workflows
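Since the same commit was tested twice (against the 2022-10-04 and 2022-10-18 IBs), it can help to diff the two DQMHistoTests summaries rather than compare them by eye. The sketch below parses lines copied from the two bot reports in this thread. The regular expression and the function name `parse_totals` are illustrative assumptions, not part of cms-bot.

```python
import re

# Lines copied from the first (6 October) and second (19 October) bot reports.
FIRST_RUN = """\
DQMHistoTests: Total histograms compared: 3994404
DQMHistoTests: Total failures: 236477
DQMHistoTests: Total successes: 3757905
"""
SECOND_RUN = """\
DQMHistoTests: Total histograms compared: 3991793
DQMHistoTests: Total failures: 26710
DQMHistoTests: Total successes: 3965061
"""

# Matches e.g. "DQMHistoTests: Total failures: 26710".
LINE = re.compile(r"DQMHistoTests: Total (?P<name>[\w ]+): (?P<value>\d+)")


def parse_totals(text):
    """Map each 'Total <name>' entry to its integer value."""
    return {m["name"]: int(m["value"]) for m in LINE.finditer(text)}


first = parse_totals(FIRST_RUN)
second = parse_totals(SECOND_RUN)

# Signed change between the two test runs for every counter.
deltas = {name: second[name] - first[name] for name in first}
for name, delta in deltas.items():
    print(f"{name}: {first[name]} -> {second[name]} ({delta:+d})")
```

The failure count drops sharply between the two runs while the Reco comparison differences stay at 19902 in both reports. The thread does not say what caused the drop.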

cmsbuild avatar Oct 19 '22 12:10 cmsbuild

@perrotta seems like the issue already reported above is not yet solved since these failures look the same as before...

francescobrivio avatar Oct 19 '22 12:10 francescobrivio

+alca

  • the unit test failure is known (there is a github issue about it)
  • otherwise tests pass

tvami avatar Oct 19 '22 12:10 tvami

This pull request is fully signed and it will be integrated in one of the next CMSSW_12_5_X IBs (but tests are reportedly failing) and once validation in the development release cycle CMSSW_12_6_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

cmsbuild avatar Oct 19 '22 12:10 cmsbuild

seems like the issue already reported above is not yet solved since these failures look the same as before...

from yesterday's ORP meeting minutes

DQM/Integration unit tests are failing in all releases but 12_6_X (github issue https://github.com/cms-sw/cmssw/issues/39669): do not rush to backport fixes in the production releases, as drawbacks are possible

not really sure what is the backport that needs to be done and what are its drawbacks though.

mmusich avatar Oct 19 '22 12:10 mmusich

Thanks Marco for pointing out yesterday's minutes, I had missed that point. Incidentally this 125X backport PR was opened 13 days ago and the tests were re-triggered this morning by release managers, so I was assuming some progress on the failing IBs/tests had been made. 😄

francescobrivio avatar Oct 19 '22 12:10 francescobrivio

@francescobrivio I restarted the tests this morning because the PR was in the "test pending" status after https://github.com/cms-sw/cmssw/pull/39646#issuecomment-1271211601. As @mmusich pointed out, these AddOn errors will stay in status "won't solve", at least for a while.

perrotta avatar Oct 19 '22 14:10 perrotta

so we are fully signed now @perrotta and the master of this PR is ok in the IBs, shall this be merged?

tvami avatar Oct 19 '22 14:10 tvami

+1

perrotta avatar Oct 19 '22 14:10 perrotta

merge

perrotta avatar Oct 19 '22 14:10 perrotta