[RFC] Use ROOT lossy compression for P3 and position of reco::Track
PR description:
- added classes used by ROOT when doing storage which specify lossy compression
- added iotypes rule to use new classes
PR validation:
Ran on workflow 11834.21 and saw more than a 10% decrease in AOD file size.
This is intended to be used for validation on a separate IB.
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39554/32342
- This PR adds an extra 24KB to repository
A new Pull Request was created by @Dr15Jones (Chris Jones) for master.
It involves the following packages:
- DataFormats/TrackReco (reconstruction)
@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks. @VourMa, @JanFSchulte, @rovere, @VinInn, @missirol, @gpetruc, @mmusich, @mtosi this is something you requested to watch as well. @perrotta, @dpiparo, @rappoccio you are the release manager for this.
cms-bot commands are listed here
please test
-1
Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7f2e52/27883/summary.html
COMMIT: 0f99936a26b2edba04abf4e9428911d6ecc1a33a
CMSSW: CMSSW_12_6_X_2022-09-30-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/39554/27883/install.sh to create a dev area with all the needed externals and cmssw changes.
Build
I found compilation error when building:
>> Building LCG reflex dict from header file src/DataFormats/GsfTrackReco/src/classes.h
>> Compiling LCG dictionary: tmp/el8_amd64_gcc10/src/DataFormats/GsfTrackReco/src/DataFormatsGsfTrackReco/a/DataFormatsGsfTrackReco_xr.cc
>> Building shared library tmp/el8_amd64_gcc10/src/DataFormats/GsfTrackReco/src/DataFormatsGsfTrackReco/libDataFormatsGsfTrackReco.so
Copying tmp/el8_amd64_gcc10/src/DataFormats/GsfTrackReco/src/DataFormatsGsfTrackReco/libDataFormatsGsfTrackReco.so to productstore area:
>> Checking EDM Class Version for src/DataFormats/GsfTrackReco/src/classes_def.xml in libDataFormatsGsfTrackReco.so
error: class 'reco::GsfTrack' has a different checksum for ClassVersion 20. Increment ClassVersion to 21 and assign it to checksum 1617233394
Suggestion: You can run 'scram build updateclassversion' to generate src/DataFormats/GsfTrackReco/src/classes_def.xml.generated with updated ClassVersion
gmake: *** [tmp/el8_amd64_gcc10/src/DataFormats/GsfTrackReco/src/DataFormatsGsfTrackReco/libDataFormatsGsfTrackReco.so] Error 1
Leaving library rule at DataFormats/GsfTrackReco
>> Leaving Package DataFormats/GsfTrackReco
>> Package DataFormats/GsfTrackReco built
please test
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39554/32344
- This PR adds an extra 28KB to repository
Pull request #39554 was updated. @mandrenguyen, @clacaputo can you please check and sign again.
please test
-1
Failed Tests: UnitTests RelVals RelVals-INPUT AddOn
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7f2e52/27887/summary.html
COMMIT: 1cc2bbf6dd8723fffcba7b7e60117930c63ab1d1
CMSSW: CMSSW_12_6_X_2022-09-30-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/39554/27887/install.sh to create a dev area with all the needed externals and cmssw changes.
Unit Tests
I found errors in the following unit tests:
---> test testSSTGainPCL_fromRECO had ERRORS
---> test testCalibTrackerSiStripCommon had ERRORS
---> test testBeamSpotWorkflow had ERRORS
---> test testMiscellanea had ERRORS
and more ...
RelVals
----- Begin Fatal Exception 30-Sep-2022 21:43:57 CEST-----------------------
An exception of category 'FileReadError' occurred while
[0] Processing Event run: 319450 lumi: 31 event: 42789123 stream: 0
[1] Running path 'dqmoffline_step'
[2] Prefetching for module NanoAODDQM/'nanoDQM'
[3] Prefetching for module TriggerObjectTableProducer/'triggerObjectTable'
[4] While reading from source BXVector<l1t::EGamma> caloStage2Digis 'EGamma' RECO
[5] Rethrowing an exception that happened on a different read request.
[6] Processing Event run: 319450 lumi: 31 event: 42789123 stream: 0
[7] Running path 'dqmoffline_step'
[8] Prefetching for module NanoAODDQM/'nanoDQM'
[9] Prefetching for module SimpleCandidateFlatTableProducer/'boostedTauTable'
[10] Prefetching for module PATTauRefSelector/'finalBoostedTaus'
[11] Prefetching for module PATTauIDEmbedder/'slimmedTausBoostedNewID'
[12] While reading from source std::vector<pat::Tau> slimmedTausBoosted '' PAT
[13] Reading branch patTaus_slimmedTausBoosted__PAT.
Additional Info:
[a] Fatal Root Error: @SUB=TStreamerInfo::BuildOld
Cannot convert reco::TrackBase::vertex_ from type: ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<Double32_t>,ROOT::Math::DefaultCoordinateSystemTag> to type: reco::storage::TrackPositionStorage, skip element
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 30-Sep-2022 21:44:33 CEST-----------------------
An exception of category 'FileReadError' occurred while
[0] Processing Event run: 277069 lumi: 81 event: 36026102 stream: 0
[1] Running path 'Flag_HcalStripHaloFilter'
[2] Prefetching for module HcalStripHaloFilter/'HcalStripHaloFilter'
[3] While reading from source reco::BeamHaloSummary BeamHaloSummary '' RECO
[4] Rethrowing an exception that happened on a different read request.
[5] Processing Event run: 277069 lumi: 81 event: 36026102 stream: 0
[6] Running path 'dqmofflineOnPAT_step'
[7] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
[8] Prefetching for module PATMuonSlimmer/'slimmedMuons'
[9] Prefetching for module PATMuonSelector/'selectedPatMuons'
[10] Prefetching for module PATMuonProducer/'patMuons'
[11] Prefetching for module CITKPFIsolationSumProducerForPUPPI/'muonPUPPIIsolation'
[12] Prefetching for module PATPackedCandidateProducer/'packedPFCandidates'
[13] Prefetching for module PATVertexSlimmer/'offlineSlimmedPrimaryVertices'
[14] While reading from source std::vector<reco::Vertex> offlinePrimaryVertices '' RECO
[15] Reading branch recoVertexs_offlinePrimaryVertices__RECO.
Additional Info:
[a] Fatal Root Error: @SUB=TStreamerInfo::BuildOld
Cannot convert reco::TrackBase::vertex_ from type: ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<Double32_t>,ROOT::Math::DefaultCoordinateSystemTag> to type: reco::storage::TrackPositionStorage, skip element
----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 30-Sep-2022 21:44:33 CEST-----------------------
An exception of category 'FileReadError' occurred while
[0] Processing Event run: 305064 lumi: 36 event: 55020723 stream: 0
[1] Running path 'Flag_HcalStripHaloFilter'
[2] Prefetching for module HcalStripHaloFilter/'HcalStripHaloFilter'
[3] While reading from source reco::BeamHaloSummary BeamHaloSummary '' RECO
[4] Rethrowing an exception that happened on a different read request.
[5] Processing Event run: 305064 lumi: 36 event: 55020723 stream: 0
[6] Running path 'dqmofflineOnPAT_step'
[7] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
[8] Prefetching for module PATMuonSlimmer/'slimmedMuons'
[9] Prefetching for module PATMuonSelector/'selectedPatMuons'
[10] Prefetching for module PATMuonProducer/'patMuons'
[11] Prefetching for module CITKPFIsolationSumProducerForPUPPI/'muonPUPPIIsolation'
[12] Prefetching for module PATPackedCandidateProducer/'packedPFCandidates'
[13] Prefetching for module PATVertexSlimmer/'offlineSlimmedPrimaryVertices'
[14] While reading from source std::vector<reco::Vertex> offlinePrimaryVertices '' RECO
[15] Reading branch recoVertexs_offlinePrimaryVertices__RECO.
Additional Info:
[a] Fatal Root Error: @SUB=TStreamerInfo::BuildOld
Cannot convert reco::TrackBase::vertex_ from type: ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<Double32_t>,ROOT::Math::DefaultCoordinateSystemTag> to type: reco::storage::TrackPositionStorage, skip element
----- End Fatal Exception -------------------------------------------------
RelVals-INPUT
- 134.808
134.808_RunSingleMuPrpt2015C+RunSingleMuPrpt2015C+HLTDR2_25ns+RECODR2_25nsreHLT_HIPM+HARVESTDR2/step2_RunSingleMuPrpt2015C+RunSingleMuPrpt2015C+HLTDR2_25ns+RECODR2_25nsreHLT_HIPM+HARVESTDR2.log
- 134.807
134.807_RunDoubleEGPrpt2015C+RunDoubleEGPrpt2015C+HLTDR2_25ns+RECODR2_25nsreHLT_HIPM+HARVESTDR2/step2_RunDoubleEGPrpt2015C+RunDoubleEGPrpt2015C+HLTDR2_25ns+RECODR2_25nsreHLT_HIPM+HARVESTDR2.log
- 134.908
134.908_RunSingleMuPrpt2015D+RunSingleMuPrpt2015D+HLTDR2_25ns+RECODR2_25nsreHLT_HIPM+HARVESTDR2/step2_RunSingleMuPrpt2015D+RunSingleMuPrpt2015D+HLTDR2_25ns+RECODR2_25nsreHLT_HIPM+HARVESTDR2.log
- 134.907
- 136.727
- 136.728
- 136.72411
- 136.72412
- 136.7611
- 136.76111
- 136.7802
- 136.7722
- 136.7801
- 136.77211
- 136.7721
- 136.7803
- 136.7952
- 136.8391
- 136.8311
- 136.83111
- 136.8523
- 136.8521
- 136.8522
- 138.1
- 138.2
- 136.88811
- 140.5611
- 159.01
- 158.01
- 1325.61
- 1325.517
- 1325.5
- 1325.6
- 1325.51
- 1325.516
- 1325.5161
- 1325.81
- 1325.8
- 1325.7
- 1325.518
- 1329.1
AddOn Tests
----- Begin Fatal Exception 30-Sep-2022 21:42:37 CEST-----------------------
An exception of category 'FileReadError' occurred while
[0] Processing Event run: 1 lumi: 1 event: 2 stream: 0
[1] Running path 'p'
[2] Prefetching for module CandidateSummaryTable/'selectedPatCandidateSummary'
[3] Prefetching for module PATJetSelector/'selectedPatJets'
[4] Prefetching for module PATJetProducer/'patJets'
[5] While reading from source std::vector<reco::PFJet> ak4PFJetsCHS '' RECO
[6] Rethrowing an exception that happened on a different read request.
[7] Processing Event run: 1 lumi: 1 event: 2 stream: 0
[8] Running path 'p'
[9] Prefetching for module CandidateSummaryTable/'selectedPatCandidateSummary'
[10] Prefetching for module PATElectronSelector/'selectedPatElectrons'
[11] Prefetching for module PATElectronProducer/'patElectrons'
[12] While reading from source std::vector<reco::Conversion> allConversions '' RECO
[13] Reading branch recoConversions_allConversions__RECO.
Additional Info:
[a] Fatal Root Error: @SUB=TStreamerInfo::BuildOld
Cannot convert reco::TrackBase::vertex_ from type: ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<Double32_t>,ROOT::Math::DefaultCoordinateSystemTag> to type: reco::storage::TrackPositionStorage, skip element
----- End Fatal Exception -------------------------------------------------
I guess ioread rules didn't make it to the PR yet.
So from reading various log files from the failed RelVals, for all sampled logs the failure was caused by reading an old file containing a TrackBase::vertex_ stored in the old way. Since I don't have any iorule in the PR to handle the backwards reading, those jobs fail.
@pcanal could you suggest what would be the proper IORules to craft to handle the conversion from the old
ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<Double32_t>,ROOT::Math::DefaultCoordinateSystemTag>
to the new
reco::storage::TrackPositionStorage
where the new class has an identical structure (member types and member names) as the original? (The only difference is explicit storage comments.)
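To illustrate what lossy storage of a floating-point member means in general, here is a minimal sketch of range-based quantization: a value is clamped to a range, stored as a small unsigned integer, and read back with a bounded error. This is a simplified model only; it is not ROOT's implementation and not the code in this PR. The function names `quantize` and `dequantize`, and any range or bit count used with them, are hypothetical choices for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: map x onto an nbits-wide unsigned integer.
// Assumes xmax > xmin and 1 <= nbits <= 32.
// Values outside [xmin, xmax] are clamped, which is one source of loss.
inline std::uint32_t quantize(double x, double xmin, double xmax, unsigned nbits) {
  const double steps = std::ldexp(1.0, static_cast<int>(nbits)) - 1.0;  // 2^nbits - 1
  const double clamped = std::clamp(x, xmin, xmax);
  return static_cast<std::uint32_t>(std::lround((clamped - xmin) / (xmax - xmin) * steps));
}

// Hypothetical helper: recover an approximation of the original value.
// The round-trip error is at most half a quantization step.
inline double dequantize(std::uint32_t q, double xmin, double xmax, unsigned nbits) {
  const double steps = std::ldexp(1.0, static_cast<int>(nbits)) - 1.0;
  return xmin + static_cast<double>(q) / steps * (xmax - xmin);
}
```

With a range of [-10, 10] and 16 bits, for example, the step is 20/65535, so a round trip changes a value inside the range by at most half of that. Whether such a loss is acceptable for track positions and momenta is exactly what the validation of this PR is meant to establish.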
@pcanal I appear to have crafted iorules that allow the jobs to run. It is weird that I basically say that to read a value of class A you must call the default operator= for class A.
please test
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39554/32368
- This PR adds an extra 16KB to repository
Pull request #39554 was updated. @mandrenguyen, @clacaputo can you please check and sign again.
-1
Failed Tests: UnitTests RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7f2e52/27940/summary.html
COMMIT: e170bf756dd0948fe22f7dcf17a97083aaaf1337
CMSSW: CMSSW_12_6_X_2022-10-03-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39554/27940/install.sh to create a dev area with all the needed externals and cmssw changes.
Unit Tests
I found errors in the following unit tests:
---> test createDBObjecs had ERRORS
---> test checkMultiRunHarvestingOutput had ERRORS
RelVals
- 136.88811
136.88811_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL/step2_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL.log
- 136.8311
136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017/step2_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017.log
- 136.7611
136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM/step2_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM.log
RelVals-INPUT
- 136.72411
136.72411_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMINIAOD_data2016UL_HIPM+HARVESTDR2_REMINIAOD_data2016UL_HIPM/step2_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMINIAOD_data2016UL_HIPM+HARVESTDR2_REMINIAOD_data2016UL_HIPM.log
- 136.72412
136.72412_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMININANO_data2016UL_HIPM+HARVESTDR2_REMININANO_data2016UL_HIPM/step2_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMININANO_data2016UL_HIPM+HARVESTDR2_REMININANO_data2016UL_HIPM.log
- 136.7611
136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM/step2_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM.log
- 136.76111
- 136.77211
- 136.7721
- 136.8311
- 136.83111
- 136.88811
- 140.5611
- 1325.5
- 1325.5161
- 25200.0
- 25202.15
- 25203.0
- 25214.0
- 50200.0
- 158.01
- 1307.0
- 1308.0
- 1311.0
- 1314.0
- 1315.0
- 1318.0
- 1319.0
- 1320.0
- 1321.0
- 1322.0
- 1323.0
- 1325.0
- 1325.3
- 1325.516
- 1325.518
- 1325.61
- 1325.8
- 1325.9
- 1326.0
- 1328.0
- 1329.0
- 1332.0
- 1335.0
- 1343.0
- 1344.0
- 1345.0
- 1348.0
- 1349.0
- 1351.0
- 1352.0
- 1364.0
- 139902.0
- 13992502.0
- 200.0
- 25202.0
- 25202.1
- 25204.0
- 25205.0
- 25209.0
- 50202.0
- 50204.0
I appear to have crafted iorules that allow the jobs to run. It is weird that I basically say that to read a value of class A you must call the default operator= for class A.
A priori you could have used:
<ioread
sourceClass="ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<Double32_t>,ROOT::Math::DefaultCoordinateSystemTag>"
targetClass="reco::storage::TrackPositionStorage"
version="[1-]"
></ioread>
This "allows" the conversion from the source class to the target class in all circumstances where the old layout has a data member of the former type and the current/new layout has a member of the latter type. (Note that there is no code associated with this rule.)
If there is a case where this transformation needs to be flagged as an error (I don't see any, as it requires the user to intentionally set the new type in the C++ header file), then you do need to use the rule you have (we do not yet support making the code part implicit).
@pcanal Thanks for the suggestion. I'll probably try that next as with the iorule I wrote, we are now getting segmentation faults caused, presumably, by either
- the iorule writing outside of memory or
- the iorule returning values that are not what was actually stored, which causes downstream code to fail.
Yes, it could help. The class 'renaming' io-rule is sturdier than the data-member-specific rules, as the former does not require the caching (of the input as-is) that the latter do.
So I looked at the source of the crashes in the RelVal test. The problem is here
https://github.com/cms-sw/cmssw/blob/cf69e99a4228223afb366e416569a27d083dd227/DataFormats/PatCandidates/interface/PackedCandidate.h#L759
The value of pvRefKey_ is the largest possible int, which causes the code to read way off the end of the RefVector pvRefProd_. This value could come from the default constructor's value here
https://github.com/cms-sw/cmssw/blob/cf69e99a4228223afb366e416569a27d083dd227/DataFormats/PatCandidates/interface/PackedCandidate.h#L66
So something about the construction of this pat::PackedCandidate doesn't like this change to the tracks (where the tracks were stored in the old format and the ROOT iorule should have been run to fill them).
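The failure mode described above can be sketched as follows: a key left at the maximum value of its type by default construction, then used as an index. This is a hypothetical, simplified model; the names `kUnsetKey`, `Candidate`, and `vertexZFor` are invented for illustration and are not the `pat::PackedCandidate` code. It only shows why such a value would read far past the end of a container, and how a bounds check would surface the problem instead of crashing.

```cpp
#include <limits>
#include <optional>
#include <vector>

// Hypothetical sentinel: what a default-constructed key might hold.
constexpr unsigned int kUnsetKey = std::numeric_limits<unsigned int>::max();

// Hypothetical stand-in for an object that refers to a vertex by index.
struct Candidate {
  unsigned int vertexKey = kUnsetKey;  // stays at the sentinel if never filled on read
};

// Returns the vertex z for the candidate, or std::nullopt if the key is out of range.
inline std::optional<double> vertexZFor(const Candidate& c, const std::vector<double>& vertexZ) {
  if (c.vertexKey >= vertexZ.size()) {
    return std::nullopt;  // unchecked indexing here would read past the end
  }
  return vertexZ[c.vertexKey];
}
```

A candidate whose key was never filled yields `std::nullopt` here, whereas the same key used in an unchecked lookup would index far outside the container.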
This is consistent with the rules not being scheduled/applied properly (i.e. here probably not running), and is most likely one of the currently-being-worked-on issues. Using the renaming rule ought to work around this.
The job did succeed using the renaming. I've updated this PR with the change.
+code-checks
Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39554/32435
- This PR adds an extra 16KB to repository
Pull request #39554 was updated. @cmsbuild, @mandrenguyen, @clacaputo can you please check and sign again.
please test
Abort
please test
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7f2e52/28059/summary.html
COMMIT: 4347d97924720f18f6291231dd180865aae97f00
CMSSW: CMSSW_12_6_X_2022-10-05-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39554/28059/install.sh to create a dev area with all the needed externals and cmssw changes.
Comparison Summary
@slava77 comparisons for the following workflows were not done due to missing matrix map:
- /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-7f2e52/41834.0_TTbar_14TeV+2026D94+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal
Summary:
- No significant changes to the logs found
- Reco comparison results: 550 differences found in the comparisons
- DQMHistoTests: Total files compared: 49
- DQMHistoTests: Total histograms compared: 3391103
- DQMHistoTests: Total failures: 4078
- DQMHistoTests: Total nulls: 0
- DQMHistoTests: Total successes: 3387003
- DQMHistoTests: Total skipped: 22
- DQMHistoTests: Total Missing objects: 0
- DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
- Checked 204 log files, 49 edm output root files, 49 DQM output files
- TriggerResults: no differences found
type tracking