cmssw icon indicating copy to clipboard operation
cmssw copied to clipboard

Expensive allocations of a `std::vector<unsigned int>` in `CellularAutomaton::createAndConnectCells`

Open dpiparo opened this issue 3 years ago • 11 comments

This report seems to hint that CellularAutomaton::createAndConnectCells is responsible for heavy allocations triggered by the resize of a std::vector<unsigned int>. I cannot really pinpoint where this happens, but it would be nice to get rid of this 0.2%. Regarding the same method, perhaps using a std::bitset might be preferrable to the std::vector<bool>.

dpiparo avatar Apr 27 '22 09:04 dpiparo

A new Issue was created by @dpiparo Danilo Piparo.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

cmsbuild avatar Apr 27 '22 09:04 cmsbuild

assign reconstruction

FYI @cms-sw/tracking-pog-l2

makortel avatar Apr 27 '22 13:04 makortel

New categories assigned: reconstruction

@jpata,@slava77,@clacaputo you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild avatar Apr 27 '22 13:04 cmsbuild

type performance-improvements

jpata avatar Apr 28 '22 13:04 jpata

type tracking

jpata avatar Apr 29 '22 13:04 jpata

(I started looking into this - just to prevent potential overlap)

jpata avatar Apr 29 '22 13:04 jpata

This is old CPU code. It should be replaced by the new GPU implementation

VinInn avatar Apr 30 '22 13:04 VinInn

@AdrianoDee maybe do you know how to tackle this?

dpiparo avatar Jun 29 '22 15:06 dpiparo

Hi @dpiparo, @jpata, indeed I may have some insight but I'm not sure how much this could help. I think the place where these allocations happens is

https://github.com/cms-sw/cmssw/blob/6d2f66057131baacc2fcbdd203588c41c885b42c/RecoPixelVertexing/PixelTriplets/src/CellularAutomaton.cc#L59-L61

for the CAHitQuadrupletGenerator

and here

https://github.com/cms-sw/cmssw/blob/6d2f66057131baacc2fcbdd203588c41c885b42c/RecoPixelVertexing/PixelTriplets/src/CellularAutomaton.cc#L195-L197

for the CAHitTripletGenerator, where the vector keeping the index of the cells an hit is outer cell of gets expanded. For Phase II these sizes are not negligible (while for Run3 they are). See e.g.

ca_sizes

So here https://github.com/cms-sw/cmssw/compare/CMSSW_12_4_X...AdrianoDee:cmssw:fix_to_ca_124 I had added a parameter (CAcellsPerOuterHit ) to reserve the maximum size of the vector when the CALayer is created (changing it for PhaseII/Run3 with the era modifiers). My preliminary results (run on a local machine) seems to show an improvement in the allocations but do not bring an improvement in general for createAndConnectCells. See:

  • Profiling for 34834.21 as it is here.
  • Profiling for 34834.21 with the reserve here.

Next I need to do three things: u

  • se a proper machine for timing (e.g. vocms011)
  • moving the vector to a fixed size array
  • or checking the sizes for the different steps and change the parameter accordingly (for some of them, note the two peak, the reserve would most probably be oversizes)

Even though on the medium/long term I tend to agree with Vincenzo about having it replaced with the GPU version.

AdrianoDee avatar Jul 01 '22 08:07 AdrianoDee

This issue came up in https://github.com/cms-sw/cmssw/issues/46040#issuecomment-2420404065 . These allocations are about 2.6 % of all allocations within the event processing in a Run 3 prompt reco job.

(@cms-sw/reconstruction-l2 @cms-sw/tracking-pog-l2 to refresh recipients)

makortel avatar Oct 21 '24 20:10 makortel

cms-bot internal usage

cmsbuild avatar Oct 21 '24 20:10 cmsbuild