L1TMuonEndCapTrackProducer::produce() takes 96 MB memory per stream
Live memory profiles from https://github.com/cms-sw/cmssw/issues/40437#issuecomment-1630699980 show that L1TMuonEndCapTrackProducer::produce() takes about 100 MB of memory per stream:
- 1 thread and stream https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/issue40437/reco_07.10_live/240
- 4 threads and streams https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/issue40437/reco_4th_4st_07.10_live/101
The L1TMuonEndCapTrackProducer module itself is an edm::stream module. The memory consumption can be split into
- 93 MB / stream in PtAssignmentEngine::load(), which seems to load some kind of BDT?
- 1.8 MB / stream in EMTFSetup::reload(), loading some LUTs
- 0.5 MB / stream in SectorProcessor::process(), which is mostly Tensorflow
Assuming PtAssignmentEngine::load() indeed loads a BDT or similar, does its representation really need to be that large? Ideally all of these would live in the GlobalCache of the module (e.g. the Tensorflow part), or in the EventSetup (e.g. the BDT, whose content apparently depends on the L1TMuonEndCapParams and L1TMuonEndCapForest EventSetup data products).
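For illustration, here is a minimal sketch of the EventSetup route (not the actual EMTF code: the class name is invented and the produces()/output side is omitted). The point is that the module only holds a token and reads the conditions product by reference, so no per-stream copy is made:

```cpp
#include "CondFormats/DataRecord/interface/L1TMuonEndCapForestRcd.h"
#include "CondFormats/L1TObjects/interface/L1TMuonEndCapForest.h"
#include "FWCore/Framework/interface/Event.h"
#include "FWCore/Framework/interface/EventSetup.h"
#include "FWCore/Framework/interface/stream/EDProducer.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "FWCore/Utilities/interface/ESGetToken.h"

class EMTFForestFromEventSetupSketch : public edm::stream::EDProducer<> {
public:
  explicit EMTFForestFromEventSetupSketch(edm::ParameterSet const&)
      : forestToken_(esConsumes<L1TMuonEndCapForest, L1TMuonEndCapForestRcd>()) {}

  void produce(edm::Event& iEvent, edm::EventSetup const& iSetup) override {
    // The EventSetup owns a single copy of the conditions product, shared by
    // all streams; the module keeps only a token and reads by reference,
    // instead of copying the payload into a per-stream structure in load().
    L1TMuonEndCapForest const& forest = iSetup.getData(forestToken_);
    // ... evaluate the BDT using 'forest' (or using an EventSetup-owned
    //     derived object, see below) ...
  }

private:
  edm::ESGetToken<L1TMuonEndCapForest, L1TMuonEndCapForestRcd> forestToken_;
};
```

If it is the representation built by load(), rather than the raw payload, that is expensive, that derived read-only object could instead be produced once by an ESProducer in a record depending on L1TMuonEndCapParamsRcd and L1TMuonEndCapForestRcd, so that it, too, is allocated only once and shared across streams.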
assign l1
New categories assigned: l1
@epalencia,@aloeliger you have been requested to review this Pull request/Issue and eventually sign? Thanks
A new Issue was created by @makortel Matti Kortelainen.
@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
Moving the Tensorflow stuff to GlobalCache was also discussed in https://github.com/cms-sw/cmssw/issues/32894
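The pattern discussed there, in a minimal sketch (the class name, cache struct and "graphPath" parameter are made up, and the output side is omitted), loads the graph once in initializeGlobalCache() and creates only the lightweight per-stream session from the shared GraphDef:

```cpp
#include <memory>
#include <string>

#include "FWCore/Framework/interface/stream/EDProducer.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "PhysicsTools/TensorFlow/interface/TensorFlow.h"

// Hypothetical cache struct: holds the graph, which is loaded once per job.
struct CacheData {
  std::unique_ptr<tensorflow::GraphDef> graphDef;
};

class EMTFTensorFlowSketch : public edm::stream::EDProducer<edm::GlobalCache<CacheData>> {
public:
  EMTFTensorFlowSketch(edm::ParameterSet const& iConfig, CacheData const* cache)
      // per-stream session built on top of the shared graph; the session is
      // cheap compared to the graph itself
      : session_(tensorflow::createSession(cache->graphDef.get())) {}

  ~EMTFTensorFlowSketch() override { tensorflow::closeSession(session_); }

  static std::unique_ptr<CacheData> initializeGlobalCache(edm::ParameterSet const& iConfig) {
    // runs once per job, before any stream instance is constructed
    auto cache = std::make_unique<CacheData>();
    cache->graphDef.reset(tensorflow::loadGraphDef(iConfig.getParameter<std::string>("graphPath")));
    return cache;
  }

  static void globalEndJob(CacheData const*) {}

  void produce(edm::Event&, edm::EventSetup const&) override {
    // ... run inference with session_ on the shared graph ...
  }

private:
  tensorflow::Session* session_;
};
```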
Hi @makortel, as you said, the main contributions here are the loading of the BDT in PtAssignmentEngine::load and of the coordinate conversion LUTs in SectorProcessorLUT::read. I don't know how we can reduce these easily.
Regarding the Tensorflow stuff and GlobalCache, unfortunately I currently don't have the time to rework the code. If you and/or the L1T offline software group think that this should be done, I can try to pass this task to someone within the EMTF group and we can see how quickly we can implement it.
Hi @eyigitba, I think addressing PtAssignmentEngine::load() (at least making the read-only parts shared across streams; I hope the memory there is mostly read-only) would be important.
The Tensorflow part would be nice to address as well (e.g. in case the model becomes larger in the future, or to serve as an example for others), but at 0.5 MB / stream it is not that important today.
Hi @makortel , ok we'll look into the BDT loading and also see how we can improve the Tensorflow part.
How urgent is this btw?
The memory budget is 2 GB per stream (including shared components and I/O buffers).
L1TMuonEndCapTrackProducer alone accounts for 5% of that.
It is up to L1 management to judge how urgent it is.
BTW: why use your own implementation of a Forest and not the highly optimized common CMS one, https://cmssdt.cern.ch/dxr/CMSSW/source/CondFormats/GBRForest/interface/GBRForest.h#24?
I would advise switching to that.
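For illustration, a minimal sketch of what evaluation with GBRForest looks like (the weights file, feature values and the use of createGBRForest from CommonTools/MVAUtils are placeholders; the EMTF payload would need a one-time conversion from L1TMuonEndCapForest). The forest is a read-only object that can be built once and shared by all streams, and evaluation reduces to a single GetResponse() call:

```cpp
#include <memory>
#include <vector>

#include "CommonTools/MVAUtils/interface/GBRForestTools.h"
#include "CondFormats/GBRForest/interface/GBRForest.h"

int main() {
  // One-time construction of the read-only forest, here from a hypothetical
  // TMVA XML weights file; this object could live in the EventSetup or a
  // GlobalCache and be shared across all streams.
  auto forest = createGBRForest("emtf_pt_weights.xml");

  // Per-candidate evaluation: fill the input features and get the response.
  // The feature list below is invented for the sketch.
  std::vector<float> features{/* dPhi12 */ 0.1f, /* dPhi23 */ 0.05f, /* eta */ 1.8f};
  double response = forest->GetResponse(features.data());
  (void)response;
  return 0;
}
```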
Thanks for the advice @VinInn. I don't know why it was implemented like this, but this code is quite old, from 2016 or so. Unfortunately I don't have much time over the next couple of weeks, but we'll discuss within the EMTF group and come up with a solution soon.
PtAssignmentEngine::load() came up again in https://github.com/cms-sw/cmssw/issues/46040#issuecomment-2420384665, this time with a cost of ~47 MB / stream. On an 8-thread/stream PromptReco job that corresponds to ~376 MB. Avoiding the per-stream copies could potentially save ~329 MB, and moving to GBRForest would likely save more.
In my opinion it would be great if this could be addressed for the 2025 data taking.
cms-bot internal usage
type performance-improvements
@eyigitba @cms-sw/l1-l2 Any news?
@makortel, not yet from my side. I have a student looking into this, but nothing to discuss for now.
Came up in https://github.com/cms-sw/cmssw/issues/46975#issuecomment-2557671051
@eyigitba @cms-sw/l1-l2 A kind reminder (this came up last year when we investigated PromptReco jobs using too much memory)
Hi @makortel, we have someone who'll look into this, but they are currently busy with another task. I hope they'll start working on this soon.
Great!