cmssw icon indicating copy to clipboard operation
cmssw copied to clipboard

ROOT IORule for std::auto_ptr -> std::unique_ptr does not work with high split level [ROOT]

Open Dr15Jones opened this issue 1 year ago • 23 comments

The iorule in SimDataFormats/GeneratorProducts which deals with converting files written using std::auto_ptr<gen::PdfInfo> but read with releases using std::unique_ptr<gen::PdfInfo> does not work if the file was written using split level of 99. It does work if the file was written using split level 0.

In both cases, the iorule is executed, but in the high split level case the address returned from std::auto_ptr<>::get is always null.

Dr15Jones avatar Feb 08 '24 22:02 Dr15Jones

assign core, root

Dr15Jones avatar Feb 08 '24 22:02 Dr15Jones

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild avatar Feb 08 '24 22:02 cmsbuild

cms-bot internal usage

cmsbuild avatar Feb 08 '24 22:02 cmsbuild

A new Issue was created by @Dr15Jones Chris Jones.

@Dr15Jones, @rappoccio, @smuzaffar, @makortel, @sextonkennedy, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

cmsbuild avatar Feb 08 '24 22:02 cmsbuild

The iorule in question are

https://github.com/cms-sw/cmssw/blob/4a6d7e831d09c01a92d2381c319f1424a524e651/SimDataFormats/GeneratorProducts/src/classes_def.xml#L177-L179

https://github.com/cms-sw/cmssw/blob/4a6d7e831d09c01a92d2381c319f1424a524e651/SimDataFormats/GeneratorProducts/src/classes_def.xml#L220-L222

Dr15Jones avatar Feb 08 '24 22:02 Dr15Jones

@pcanal FYI

Dr15Jones avatar Feb 08 '24 22:02 Dr15Jones

type root

makortel avatar Feb 08 '24 22:02 makortel

assign xpog because of the interest in reading run2 legacy AODSIM

vlimant avatar May 21 '24 08:05 vlimant

New categories assigned: xpog

@vlimant,@hqucms you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild avatar May 21 '24 08:05 cmsbuild

in the context of EOY24 reprocessing, this needs some attention. what are cms options on this matter ?

vlimant avatar May 31 '24 07:05 vlimant

Ideally we'd have the iorule behavior fixed in ROOT.

Thinking about workarounds, I guess in principle we could run a simple conversion job using 10_6_X first that reads the high-split-level AODSIM file and produces a splitlevel-0 file, that is then read by the first 15_0_X processing step.

makortel avatar Jun 01 '24 00:06 makortel

Hi @makortel , can you please test and share one such configuration ?

vlimant avatar Jun 03 '24 08:06 vlimant

I'd expect something along the following to do the job

import FWCore.ParameterSet.Config as cms

process = cms.Process("ConvertToNonsplit")
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("<inputfile>")
)
process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string("<outputfile>"),
    splitLevel = cms.untracked.int32(0),
    overrideInputFileSplitLevels = cms.untracked.bool(True)
)
process.ep = cms.EndPath(process.out)

but I'd leave the testing to others.

makortel avatar Jun 03 '24 14:06 makortel

but I'd leave the testing to others.

I could also rephrase: if someone crafts a test (that gets run in IBs) that checks the std::{auto,unique}_ptr<gen::PdfInfo> is read correctly (including the contents of gen::PdfInfo are read correctly) and thus demonstrates the problem, core can add the aforementioned workaround to that test.

makortel avatar Jun 05 '24 21:06 makortel

so you'd like a test, that reads the old file with the old release, dumps the value somehow. reads the old file with the new release (and the workaround), dumps the value ; and make a comparison of the two. I don't think we really need this to run in the IB, but rather a one-of test that https://github.com/cms-sw/cmssw/issues/43923#issuecomment-2145403222 would do

vlimant avatar Jun 14 '24 08:06 vlimant

looks like there is no issue in the end : https://github.com/cms-sw/cmssw/issues/43422#issuecomment-2167752987 can you please show me again what is supposed to fail, because reading old 10.6 AODSIM to produce mini and nano in recent (14.1 IB) does seem to function.

vlimant avatar Jun 28 '24 20:06 vlimant

looks like there is no issue in the end : #43422 (comment)

I checked the file /eos/cms/store/mc/RunIISummer20UL18RECO/TTToHadronic_TuneCP5_RTT_13TeV-powheg-pythia8/AODSIM/106X_upgrade2018_realistic_v11_L1v1-v2/60009/F4A0C48A-4DC4-7D4C-BD6C-94F9492FAE97.root, and the branch

GenEventInfoProduct                   "generator"                 ""                "GEN"

is not split (i.e. split level is 0), which then explains why the read rule works.

We were probably interpreting the statement "AOD and MiniAOD are stored with split level 99" too literally. What the statement really means is that the data products created by the job writing AOD/MiniAOD are written with split level 99, but the GenEventInfoProduct data product is copied from the input, and by default we preserve the split level of data products from the input file.

If all relevant 10_6_X AODSIM and MiniAODSIM datasets have the same property (i.e. GenEventInfoProduct is stored by an earlier process in the chain that explicitly set the output split level to 0, and the PoolOutputModule's overrideInputFileSplitLevels parameter is never set to True by any later step in the chain), then I guess there is no issue here in practice for the planned Re-Mini+Re-Nano of these datasets.

makortel avatar Jul 08 '24 20:07 makortel

I don't think we really need this to run in the IB, but rather a one-of test that #43923 (comment) would do

Unfortunately one-off test is not sufficient to guarantee the present functionality stays functional with new code changes (I'm mostly worried about ROOT). Even if the reading seems to work in practice with the 10_6_X files, I'd recommend to craft a test that ensures that capability.

makortel avatar Jul 08 '24 20:07 makortel

thanks @makortel . We would need to find a file with that branch with high split level, than use it as input. Not sure how to find one such 10.6 AODSIM dataset ; chances are they are all with split level 0

vlimant avatar Jul 09 '24 08:07 vlimant

We would need to find a file with that branch with high split level, than use it as input. Not sure how to find one such 10.6 AODSIM dataset ; chances are they are all with split level 0

So how worried are we about a 10.6 AODSIM dataset where the GEN information was stored with non-zero split level? In the leading order I'd also guess they would have all been produced with the same output settings, but who knows.

makortel avatar Sep 05 '24 18:09 makortel

@cms-sw/xpog-l2 are less worried about this indeed

vlimant avatar Sep 06 '24 09:09 vlimant

https://github.com/root-project/root/pull/18320 should help with the auto_ptr IO rules, and it was included in ROOT 6.36. https://github.com/jblomer/auto_ptr_demo has an example how to use it. We should give it a try in the ROOT636 IB.

makortel avatar May 21 '25 19:05 makortel