[13_0_X] Improve memory usage in ParameterSet

Open makortel opened this issue 1 year ago • 26 comments

PR description:

This PR backports https://github.com/cms-sw/cmssw/pull/42742 so that a file written in 14_0_X can be read by a 13_0_X cmsRun. This is needed to run the HLT step in 13_0_X in the upcoming 2023 MC campaign.

The last commit was needed to get the backport to compile, because https://github.com/cms-sw/cmssw/pull/43898 was already in 13_0_X before the backport.

Note that files written with a release containing this PR will be unreadable by earlier 13_0_X releases. Therefore this PR is to be merged only in the 13_0_HLT_X branch.

Resolves https://github.com/cms-sw/framework-team/issues/918

PR validation:

I modified the test_MC_23_setup test, added in https://github.com/cms-sw/cmssw/pull/44578, to use my local 13_0_18 + this PR developer area and to fail if the HLT step fails, and the test passed.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Backport of https://github.com/cms-sw/cmssw/pull/42742

makortel avatar May 07 '24 14:05 makortel

A new Pull Request was created by @makortel for CMSSW_13_0_X.

It involves the following packages:

  • DQMOffline/Trigger (dqm)
  • FWCore/Framework (core)
  • FWCore/Integration (core)
  • FWCore/ParameterSet (core)
  • IOPool/Common (core)
  • IOPool/Input (core)
  • SimGeneral/HepPDTRecord (simulation)

@mdhildreth, @antoniovagnerini, @smuzaffar, @rvenditti, @syuvivida, @cmsbuild, @nothingface0, @tjavaid, @civanch, @makortel, @Dr15Jones can you please review it and eventually sign? Thanks. @HuguesBrun, @mtosi, @cericeci, @missirol, @Fedespring, @wddgit, @jhgoh, @rociovilar, @slomeo, @trocino, @fabiocos this is something you requested to watch as well. @rappoccio, @sextonkennedy, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

  • Backported from #42742

cmsbuild avatar May 07 '24 14:05 cmsbuild

cms-bot internal usage

cmsbuild avatar May 07 '24 14:05 cmsbuild

hold

makortel avatar May 07 '24 14:05 makortel

Pull request has been put on hold by @makortel. They need to issue an unhold command to remove the hold state, or L1 can unhold it for all.

cmsbuild avatar May 07 '24 14:05 cmsbuild

@cms-sw/orp-l2 @cms-sw/pdmv-l2 @cms-sw/ppd-l2 A few weeks back we discussed in ORP that this PR would be merged into a special release, e.g. 13_0_18_HLT, that would then be used for the HLT step in the upcoming MC production (because files written with this PR would be unreadable with 13_0_X releases without this PR). Just to confirm, is this still the case?

@smuzaffar What kind of additional setup would we need for that special release?

(I assume everything agreed here will apply similarly to a future 12_4_X backport PR)

makortel avatar May 07 '24 14:05 makortel

@cmsbuild, please test

Let's test it anyhow

makortel avatar May 07 '24 14:05 makortel

> @smuzaffar What kind of additional setup would we need for that special release?

@makortel, I think creating a cms-sw/cmssw CMSSW_13_0_HLT_X branch should be enough to take care of the 13_0_18_HLT release. This PR should then be merged into the CMSSW_13_0_HLT_X branch. All the changes in the CMSSW_13_0_X branch can be automatically forward-ported to CMSSW_13_0_HLT_X.

By the way, we also have a dedicated CMSSW_13_0_HeavyIon_X branch for HI changes.

smuzaffar avatar May 07 '24 14:05 smuzaffar

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2073d2/39281/summary.html
COMMIT: 5916b3404a8fc95a1480a67e55815743596ccce8
CMSSW: CMSSW_13_0_X_2024-05-05-0000/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44921/39281/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 10 errors in the following unit tests:

---> test test_CreateFileLists had ERRORS
---> test test-das-selected-lumis had ERRORS
---> test validateAlignments had ERRORS
and more ...

Comparison Summary

Summary:

  • You potentially removed 6 lines from the logs
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3282424
  • DQMHistoTests: Total failures: 9
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3282393
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -308.312 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 1000.0 ): -308.312 KiB HLT/EGM
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

cmsbuild avatar May 07 '24 19:05 cmsbuild

The unit test failures seem to be caused by an error in creating a proxy certificate? (At least for those where the error was clearly visible.)

makortel avatar May 07 '24 19:05 makortel

> A few weeks back we discussed in ORP that this PR would be merged into a special release, e.g. 13_0_18_HLT, that would then be used for the HLT step in the upcoming MC production (because files written with this PR would be unreadable with 13_0_X releases without this PR). Just to confirm, is this still the case?

@cms-sw/pdmv-l2 @cms-sw/ppd-l2 Could you please confirm if the plan above still holds?

makortel avatar May 13 '24 14:05 makortel

> > A few weeks back we discussed in ORP that this PR would be merged into a special release, e.g. 13_0_18_HLT, that would then be used for the HLT step in the upcoming MC production (because files written with this PR would be unreadable with 13_0_X releases without this PR). Just to confirm, is this still the case?

> @cms-sw/pdmv-l2 @cms-sw/ppd-l2 Could you please confirm if the plan above still holds?

Hi Matti, sorry for the delay. Yes, that is still the plan.

AdrianoDee avatar May 13 '24 15:05 AdrianoDee

Thanks @AdrianoDee. @smuzaffar Could you create CMSSW_13_0_HLT_X and CMSSW_12_4_HLT_X branches, and I'll adjust the base branch of this PR (and open another one for the 12_4 backport)?

makortel avatar May 13 '24 16:05 makortel

@makortel 13.0.HLT and 12.4.HLT IBs are now active (https://github.com/cms-sw/cms-bot/pull/2229). You need to change the base branch for this PR to CMSSW_13_0_HLT_X.

smuzaffar avatar May 14 '24 11:05 smuzaffar

Thanks!

makortel avatar May 14 '24 13:05 makortel

unhold

makortel avatar May 14 '24 13:05 makortel

@cmsbuild, please test

makortel avatar May 14 '24 13:05 makortel

@cmsbuild, please abort

makortel avatar May 16 '24 16:05 makortel

@cmsbuild, please test

makortel avatar May 16 '24 16:05 makortel

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2073d2/39414/summary.html
COMMIT: 5916b3404a8fc95a1480a67e55815743596ccce8
CMSSW: CMSSW_13_0_HLT_X_2024-05-14-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44921/39414/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 237 lines from the logs
  • Reco comparison results: 15 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3282428
  • DQMHistoTests: Total failures: 983
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3281423
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -308.312 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 1000.0 ): -308.312 KiB HLT/EGM
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found
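
As a quick sanity check on the two comparison summaries in this thread, the DQMHistoTests counts can be cross-checked with a few lines of Python. This is only a sketch: the `DQMHistoSummary` class and its field names are hypothetical (they are not part of cms-bot), and it assumes that failures, nulls, successes, skipped and missing objects together account for every compared histogram. The numbers are copied from the two bot reports above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQMHistoSummary:
    """Hypothetical container for the DQMHistoTests lines of one bot report."""

    label: str
    total: int
    failures: int
    nulls: int
    successes: int
    skipped: int
    missing: int

    def accounted(self) -> int:
        # Assumption: these five categories partition the compared histograms.
        return (self.failures + self.nulls + self.successes
                + self.skipped + self.missing)

    def failure_fraction(self) -> float:
        return self.failures / self.total


# Values copied from the two comparison summaries in this thread.
runs = [
    DQMHistoSummary("CMSSW_13_0_X_2024-05-05-0000",
                    total=3282424, failures=9, nulls=0,
                    successes=3282393, skipped=22, missing=0),
    DQMHistoSummary("CMSSW_13_0_HLT_X_2024-05-14-2300",
                    total=3282428, failures=983, nulls=0,
                    successes=3281423, skipped=22, missing=0),
]

for run in runs:
    # Each report is internally consistent if the categories add up.
    assert run.accounted() == run.total, run.label
    print(f"{run.label}: {run.failures} failures out of {run.total} "
          f"({run.failure_fraction():.2e})")
```

Under that assumption both reports add up, so the higher failure count in the second run reflects more differing histograms, not a bookkeeping mismatch.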

cmsbuild avatar May 16 '24 21:05 cmsbuild

Comparisons show https://github.com/cms-sw/cmssw/issues/39803

makortel avatar May 16 '24 21:05 makortel

+core

makortel avatar May 16 '24 21:05 makortel

@cms-sw/orp-l2 Please double check the base branch before merging :)

makortel avatar May 16 '24 21:05 makortel

@cms-sw/dqm-l2 @cms-sw/simulation-l2 Could you please review and sign this backport?

makortel avatar May 17 '24 19:05 makortel

backport of https://github.com/cms-sw/cmssw/pull/42742

makortel avatar May 17 '24 19:05 makortel

+1

Fine for sim, but the test is not complete?

civanch avatar May 17 '24 19:05 civanch

The tests on top of 13_0_HLT_X (https://github.com/cms-sw/cmssw/pull/44921#issuecomment-2116192342) succeeded, but maybe the bot still uses the 13_0_X test results in https://github.com/cms-sw/cmssw/pull/44921#issuecomment-2099136671 for the test label?

makortel avatar May 17 '24 19:05 makortel

@cms-sw/dqm-l2 Could you review and sign? Thanks!

makortel avatar May 21 '24 15:05 makortel

+1

tjavaid avatar May 22 '24 02:05 tjavaid

This pull request is fully signed and it will be integrated in one of the next CMSSW_13_0_HLT_X IBs (but tests are reportedly failing). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

cmsbuild avatar May 22 '24 02:05 cmsbuild

+1

rappoccio avatar May 22 '24 16:05 rappoccio