ROOT IORule for std::auto_ptr -> std::unique_ptr does not work with high split level [ROOT]
The iorule in SimDataFormats/GeneratorProducts which deals with converting files written using std::auto_ptr<gen::PdfInfo> but read with releases using std::unique_ptr<gen::PdfInfo> does not work if the file was written using split level of 99. It does work if the file was written using split level 0.
In both cases, the iorule is executed, but in the high split level case the address returned from std::auto_ptr<>::get is always null.
assign core, root
New categories assigned: core
@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks
cms-bot internal usage
A new Issue was created by @Dr15Jones Chris Jones.
@Dr15Jones, @rappoccio, @smuzaffar, @makortel, @sextonkennedy, @antoniovilela can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
The iorule in question are
https://github.com/cms-sw/cmssw/blob/4a6d7e831d09c01a92d2381c319f1424a524e651/SimDataFormats/GeneratorProducts/src/classes_def.xml#L177-L179
https://github.com/cms-sw/cmssw/blob/4a6d7e831d09c01a92d2381c319f1424a524e651/SimDataFormats/GeneratorProducts/src/classes_def.xml#L220-L222
@pcanal FYI
type root
assign xpog because of the interest in reading run2 legacy AODSIM
New categories assigned: xpog
@vlimant,@hqucms you have been requested to review this Pull request/Issue and eventually sign? Thanks
in the context of EOY24 reprocessing, this needs some attention. what are cms options on this matter ?
Ideally we'd have the iorule behavior fixed in ROOT.
Thinking about workarounds, I guess in principle we could run a simple conversion job using 10_6_X first that reads the high-split-level AODSIM file and produces a splitlevel-0 file, that is then read by the first 15_0_X processing step.
Hi @makortel , can you please test and share one such configuration ?
I'd expect something along the following to do the job
import FWCore.ParameterSet.Config as cms
process = cms.Process("ConvertToNonsplit")
process.source = cms.Source("PoolSource",
fileNames = cms.untracked.vstring("<inputfile>")
)
process.out = cms.OutputModule("PoolOutputModule",
fileName = cms.untracked.string("<outputfile>"),
splitLevel = cms.untracked.int32(0),
overrideInputFileSplitLevels = cms.untracked.bool(True)
)
process.ep = cms.EndPath(process.out)
but I'd leave the testing to others.
but I'd leave the testing to others.
I could also rephrase: if someone crafts a test (that gets run in IBs) that checks the std::{auto,unique}_ptr<gen::PdfInfo> is read correctly (including the contents of gen::PdfInfo are read correctly) and thus demonstrates the problem, core can add the aforementioned workaround to that test.
so you'd like a test, that reads the old file with the old release, dumps the value somehow. reads the old file with the new release (and the workaround), dumps the value ; and make a comparison of the two. I don't think we really need this to run in the IB, but rather a one-of test that https://github.com/cms-sw/cmssw/issues/43923#issuecomment-2145403222 would do
looks like there is no issue in the end : https://github.com/cms-sw/cmssw/issues/43422#issuecomment-2167752987 can you please show me again what is supposed to fail, because reading old 10.6 AODSIM to produce mini and nano in recent (14.1 IB) does seem to function.
looks like there is no issue in the end : #43422 (comment)
I checked the file /eos/cms/store/mc/RunIISummer20UL18RECO/TTToHadronic_TuneCP5_RTT_13TeV-powheg-pythia8/AODSIM/106X_upgrade2018_realistic_v11_L1v1-v2/60009/F4A0C48A-4DC4-7D4C-BD6C-94F9492FAE97.root, and the branch
GenEventInfoProduct "generator" "" "GEN"
is not split (i.e. split level is 0), which then explains why the read rule works.
We were probably interpreting the statement "AOD and MiniAOD are stored with split level 99" too literally. What the statement really means is that the data products created by the job writing AOD/MiniAOD are written with split level 99, but the GenEventInfoProduct data product is copied from the input, and by default we preserve the split level of data products from the input file.
If all relevant 10_6_X AODSIM and MiniAODSIM datasets have the same property (i.e. GenEventInfoProduct is stored by an earlier process in the chain that explicitly set the output split level to 0, and the PoolOutputModule's overrideInputFileSplitLevels parameter is never set to True by any later step in the chain), then I guess there is no issue here in practice for the planned Re-Mini+Re-Nano of these datasets.
I don't think we really need this to run in the IB, but rather a one-of test that #43923 (comment) would do
Unfortunately one-off test is not sufficient to guarantee the present functionality stays functional with new code changes (I'm mostly worried about ROOT). Even if the reading seems to work in practice with the 10_6_X files, I'd recommend to craft a test that ensures that capability.
thanks @makortel . We would need to find a file with that branch with high split level, than use it as input. Not sure how to find one such 10.6 AODSIM dataset ; chances are they are all with split level 0
We would need to find a file with that branch with high split level, than use it as input. Not sure how to find one such 10.6 AODSIM dataset ; chances are they are all with
split level 0
So how worried are we about a 10.6 AODSIM dataset where the GEN information was stored with non-zero split level? In the leading order I'd also guess they would have all been produced with the same output settings, but who knows.
@cms-sw/xpog-l2 are less worried about this indeed
https://github.com/root-project/root/pull/18320 should help with the auto_ptr IO rules, and it was included in ROOT 6.36. https://github.com/jblomer/auto_ptr_demo has an example how to use it. We should give it a try in the ROOT636 IB.