cmssw L1TMuonEndCapTrackProducer::produce() takes 96 MB memory per stream

Live memory profiles of https://github.com/cms-sw/cmssw/issues/40437#issuecomment-1630699980 show that L1TMuonEndCapTrackProducer::produce() takes 100 MB memory / stream

1 thread and stream https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/issue40437/reco_07.10_live/240
4 threads and streams https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/issue40437/reco_4th_4st_07.10_live/101

The L1TMuonEndCapTrackProducer module itself is an edm::stream. The memory consumption can be split into

93 MB / stream in PtAssignmentEngine::load(), that seems to load some kind of BDT?
1.8 MB / stream in EMTFSetup::reload() loading some LUTs
0.5 MB / stream in SectorProcessor::process() that is mostly Tensorflow

Assuming the PtAssignmentEngine::load() is indeed BDT or similar, does its representation really need to be that large? Ideally all of these would be in GlobalCache of the module (e.g. Tensorflow stuff), or in the EventSetup (e.g. the BDT whose content apparently depends on the L1TMuonEndCapParams and L1TMuonEndCapForest EventSetup data products).

Aug 09 '23 16:08 makortel

assign l1

Aug 09 '23 16:08 makortel

New categories assigned: l1

@epalencia,@aloeliger you have been requested to review this Pull request/Issue and eventually sign? Thanks

Aug 09 '23 16:08 cmsbuild

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

Aug 09 '23 16:08 cmsbuild

Moving the Tensorflow stuff to GlobalCache was also discussed in https://github.com/cms-sw/cmssw/issues/32894

Aug 09 '23 16:08 makortel

Hi @makortel , as you said the main contribution here is loading of the BDT in PtAssignmentEngine::load and coordinate conversion LUTs in SectorProcessorLUT::read. I don't know how we can reduce these easily.

Regarding the Tensorflow stuff and GlobalCache, I currently don't have the time to rework the code unfortunately. If you and/or L1T offline software think that this should be done, I can try to pass this task to someone within the EMTF group and we can see how quickly we can implement this.

Aug 09 '23 17:08 eyigitba

Hi @eyigitba I think addressing the PtAssignmentEngine::load() (at least making read-only parts shared across streams; I hope the memory there is mostly read-only) would be important.

The Tensorflow part would be nice (e.g. if the model would become larger in the future, or serving as an example for others), but with 0.5 MB / stream not that important today.

Aug 09 '23 17:08 makortel

Hi @makortel , ok we'll look into the BDT loading and also see how we can improve the Tensorflow part.

How urgent is this btw?

Aug 09 '23 18:08 eyigitba

Memory budget is 2GB per stream (including shared component and I/O buffers). L1TMuonEndCapTrackProducer alone accounts for 5% of that. To L1 mngmt to judge how urgent it WAS.

Aug 10 '23 09:08 VinInn

BTW: why using your own implementation of a Forest and not the highly optimized common CMS one https://cmssdt.cern.ch/dxr/CMSSW/source/CondFormats/GBRForest/interface/GBRForest.h#24 ?

I would advice to switch to that.

Aug 10 '23 09:08 VinInn

Thanks for the advice @VinInn . I don't know why it was implemented like this, but this code is quite old, from 2016 or so. I unfortunately don't have much time these couple of weeks, but we'll discuss in the EMTF group to come up with a solution soon.

Aug 10 '23 10:08 eyigitba

The PtAssignmentEngine::load() came up again in https://github.com/cms-sw/cmssw/issues/46040#issuecomment-2420384665, this time with ~47 MB / stream cost. On an 8-thread/stream PromptReco that corresponds to 376 MB. Avoiding the per-stream copies could potentially save ~329 MB, and moving to GBRForest would likely save more.

In my opinion it would be great if this would be addressed for 2025 data taking.

Oct 17 '24 20:10 makortel

cms-bot internal usage

Oct 17 '24 20:10 cmsbuild

type performance-improvements

Oct 17 '24 20:10 makortel

@eyigitba @cms-sw/l1-l2 Any news?

Dec 09 '24 14:12 makortel

@makortel , not yet from my side. I have a student looking into this but nothing to discuss for now

Dec 11 '24 10:12 eyigitba

Came up in https://github.com/cms-sw/cmssw/issues/46975#issuecomment-2557671051

Dec 20 '24 20:12 makortel

@eyigitba @cms-sw/l1-l2 A kind reminder (this came up last yer when we investigated PromptReco jobs using too much memory)

Mar 28 '25 13:03 makortel

Hi @makortel , we have someone that'll look into this but they are currently busy with another task. I hope they'll start the work on this soon

Mar 28 '25 14:03 eyigitba

Great!

Mar 28 '25 14:03 makortel