Problems seen in HGCal mixing

Open Dr15Jones opened this issue 7 months ago • 31 comments

Running valgrind on workflow 34034.0 step 2 has uncovered many issues in the mixing of HGCal data.

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

cms-bot internal usage

cmsbuild avatar Apr 28 '25 19:04 cmsbuild

A new Issue was created by @Dr15Jones.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

cmsbuild avatar Apr 28 '25 19:04 cmsbuild

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB13E2F7: HGCDigitizer::accumulate(edm::Handle<std::vector<PCaloHit, std::allocator<PCaloHit> > > const&, int, HGCalGeometry const*, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:669)
==1839746==    by 0xBB13BBBD: HGCDigitizer::accumulate(edm::Event const&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:371)
==1839746==    by 0xBB12F55B: HGCDigiProducer::accumulate(edm::Event const&, edm::EventSetup const&) (HGCDigiProducer.cc:46)
==1839746==    by 0xB606E94A: edm::MixingModule::accumulateEvent(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB606E97E: edm::MixingModule::addSignals(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D738A: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB13E190: HGCDigitizer::accumulate(edm::Handle<std::vector<PCaloHit, std::allocator<PCaloHit> > > const&, int, HGCalGeometry const*, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:660)
==1839746==    by 0xBB13BBBD: HGCDigitizer::accumulate(edm::Event const&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:371)
==1839746==    by 0xBB12F55B: HGCDigiProducer::accumulate(edm::Event const&, edm::EventSetup const&) (HGCDigiProducer.cc:46)
==1839746==    by 0xB606E94A: edm::MixingModule::accumulateEvent(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB606E97E: edm::MixingModule::addSignals(edm::Event const&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D738A: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Invalid read of size 4
==1839746==    at 0xBB15CDDF: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:175)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

==1839746==  Address 0x8c2fd25c is 0 bytes after a block of size 12 alloc'd
==1839746==    at 0x403BEE1: operator new(unsigned long) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/external/valgrind/3.24.0-f85a1303334507f502ed8242a93c05bd/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1839746==    by 0xBB12718D: std::__new_allocator<HGCalSiNoiseMap<HGCSiliconDetId>::GainRange_t>::allocate(unsigned long, void const*) (new_allocator.h:137)
==1839746==    by 0xBB125EF0: UnknownInlinedFun (allocator.h:188)
==1839746==    by 0xBB125EF0: std::allocator_traits<std::allocator<HGCalSiNoiseMap<HGCSiliconDetId>::GainRange_t> >::allocate(std::allocator<HGCalSiNoiseMap<HGCSiliconDetId>::GainRange_t>&, unsigned long) (alloc_traits.h:464)
==1839746==    by 0xBB125E63: std::_Vector_base<float, std::allocator<float> >::_M_allocate(unsigned long) (stl_vector.h:378)
==1839746==    by 0xBB15E916: void std::vector<float, std::allocator<float> >::_M_range_initialize<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > > >(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, std::forward_iterator_tag) (stl_vector.h:1687)
==1839746==    by 0xBB15DB1A: std::vector<float, std::allocator<float> >::vector<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, void>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, std::allocator<float> const&) (stl_vector.h:706)
==1839746==    by 0xBB15B7FB: HGCDigitizerBase::HGCDigitizerBase(edm::ParameterSet const&) (HGCDigitizerBase.cc:39)
==1839746==    by 0xBB162262: HGCEEDigitizer::HGCEEDigitizer(edm::ParameterSet const&) (HGCEEDigitizer.cc:22)
==1839746==    by 0xBB1626AA: std::__detail::_MakeUniq<HGCEEDigitizer>::__single_object std::make_unique<HGCEEDigitizer, edm::ParameterSet const&>(edm::ParameterSet const&) (unique_ptr.h:1065)
==1839746==    by 0xBB162631: edmplugin::PluginFactory<HGCDigitizerBase* (edm::ParameterSet const&)>::PMaker<HGCEEDigitizer>::create(edm::ParameterSet const&) const (PluginFactory.h:57)
==1839746==    by 0xBB140E06: edmplugin::PluginFactory<HGCDigitizerBase* (edm::ParameterSet const&)>::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, edm::ParameterSet const&) const (PluginFactory.h:65)
==1839746==    by 0xBB13A825: HGCDigitizer::HGCDigitizer(edm::ParameterSet const&, edm::ConsumesCollector&) (HGCDigitizer.cc:261)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Invalid read of size 4
==1839746==    at 0xBB15CE94: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:180)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

==1839746==  Address 0x8c2fd25c is 0 bytes after a block of size 12 alloc'd
==1839746==    at 0x403BEE1: operator new(unsigned long) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/external/valgrind/3.24.0-f85a1303334507f502ed8242a93c05bd/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1839746==    by 0xBB12718D: std::__new_allocator<HGCalSiNoiseMap<HGCSiliconDetId>::GainRange_t>::allocate(unsigned long, void const*) (new_allocator.h:137)
==1839746==    by 0xBB125EF0: UnknownInlinedFun (allocator.h:188)
==1839746==    by 0xBB125EF0: std::allocator_traits<std::allocator<HGCalSiNoiseMap<HGCSiliconDetId>::GainRange_t> >::allocate(std::allocator<HGCalSiNoiseMap<HGCSiliconDetId>::GainRange_t>&, unsigned long) (alloc_traits.h:464)
==1839746==    by 0xBB125E63: std::_Vector_base<float, std::allocator<float> >::_M_allocate(unsigned long) (stl_vector.h:378)
==1839746==    by 0xBB15E916: void std::vector<float, std::allocator<float> >::_M_range_initialize<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > > >(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, std::forward_iterator_tag) (stl_vector.h:1687)
==1839746==    by 0xBB15DB1A: std::vector<float, std::allocator<float> >::vector<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, void>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, std::allocator<float> const&) (stl_vector.h:706)
==1839746==    by 0xBB15B7FB: HGCDigitizerBase::HGCDigitizerBase(edm::ParameterSet const&) (HGCDigitizerBase.cc:39)
==1839746==    by 0xBB162262: HGCEEDigitizer::HGCEEDigitizer(edm::ParameterSet const&) (HGCEEDigitizer.cc:22)
==1839746==    by 0xBB1626AA: std::__detail::_MakeUniq<HGCEEDigitizer>::__single_object std::make_unique<HGCEEDigitizer, edm::ParameterSet const&>(edm::ParameterSet const&) (unique_ptr.h:1065)
==1839746==    by 0xBB162631: edmplugin::PluginFactory<HGCDigitizerBase* (edm::ParameterSet const&)>::PMaker<HGCEEDigitizer>::create(edm::ParameterSet const&) const (PluginFactory.h:57)
==1839746==    by 0xBB140E06: edmplugin::PluginFactory<HGCDigitizerBase* (edm::ParameterSet const&)>::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, edm::ParameterSet const&) const (PluginFactory.h:65)
==1839746==    by 0xBB13A825: HGCDigitizer::HGCDigitizer(edm::ParameterSet const&, edm::ConsumesCollector&) (HGCDigitizer.cc:261)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB15D1AF: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:203)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB197687: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:284)
==1839746==    by 0xBB15E044: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaper(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, std::array<float, 6ul> const&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float) (HGCFEElectronics.h:55)
==1839746==    by 0xBB15D2B4: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:211)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB194EC0: float const& std::min<float>(float const&, float const&) (stl_algobase.h:235)
==1839746==    by 0xBB198816: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:472)
==1839746==    by 0xBB15E044: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaper(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, std::array<float, 6ul> const&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float) (HGCFEElectronics.h:55)
==1839746==    by 0xBB15D2B4: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:211)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB15D3C8: HGCDigitizerBase::updateOutput(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, HGCDataFrame<DetId, HGCSample> const&) (HGCDigitizerBase.cc:240)
==1839746==    by 0xBB15D2D8: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:225)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB1973C6: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:256)
==1839746==    by 0xBB15E044: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaper(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, std::array<float, 6ul> const&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float) (HGCFEElectronics.h:55)
==1839746==    by 0xBB15D2B4: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:211)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

assign simulation

makortel avatar Apr 28 '25 19:04 makortel

FYI @cms-sw/hgcal-dpg-l2

makortel avatar Apr 28 '25 19:04 makortel

New categories assigned: simulation

@civanch,@kpedro88,@mdhildreth you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild avatar Apr 28 '25 19:04 cmsbuild

==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB19749D: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:260)
==1839746==    by 0xBB15E044: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaper(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, std::array<float, 6ul> const&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float) (HGCFEElectronics.h:55)
==1839746==    by 0xBB15D2B4: HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:211)
==1839746==    by 0xBB15C690: HGCDigitizerBase::run(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, unsigned int, CLHEP::HepRandomEngine*) (HGCDigitizerBase.cc:108)
==1839746==    by 0xBB13B4B5: HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (HGCDigitizer.cc:326)
==1839746==    by 0xBB12F4E0: HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) (HGCDigiProducer.cc:35)
==1839746==    by 0xB606C76A: edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so)
==1839746==    by 0xB60D7340: edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (in /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02886/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-04-25-2300/lib/el8_amd64_gcc12/libMixingBase.so)
==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB1974D8: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:261)
...
==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB1974E3: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:261)
...
==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB1975B2: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:266)
...
==1839746== Conditional jump or move depends on uninitialised value(s)
==1839746==    at 0xBB1975C1: HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) (HGCFEElectronics.cc:266)
...

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

So for the two read errors, the code is https://github.com/cms-sw/cmssw/blob/c83ecb127f70592a95b0544c2e7e2404c9d7e440/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizerBase.cc#L175

https://github.com/cms-sw/cmssw/blob/c83ecb127f70592a95b0544c2e7e2404c9d7e440/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizerBase.cc#L180

So in both cases cell.thickness - 1 must be equal to noise_fC_.size(), since in both cases we are reading one element beyond the end.

The value of noise_fC_ seems to come from the configuration https://github.com/cms-sw/cmssw/blob/c83ecb127f70592a95b0544c2e7e2404c9d7e440/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizerBase.cc#L30-L39

but it is also later passed to a function https://github.com/cms-sw/cmssw/blob/c83ecb127f70592a95b0544c2e7e2404c9d7e440/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizerBase.cc#L71

Looking at the mixing module configuration, there appear to be 3 HGCal plugins doing the digitization, of which only 2 have noise_fC set in the configuration.

hgceeDigitizer and hgchefrontDigitizer are set with

noise_fC = cms.PSet(
    refToPSet_ = cms.string('HGCAL_noise_fC')
),

while hgchebackDigitizer has no setting. The value in the referenced PSet is

>>> print(process.HGCAL_noise_fC)
cms.PSet(
    doseMap = cms.string(''),
    scaleByDose = cms.bool(False),
    scaleByDoseAlgo = cms.uint32(0),
    scaleByDoseFactor = cms.double(1),
    values = cms.vdouble(0.32041011999999996, 0.384492144, 0.32041011999999996)
)

so there are 3 values. Given that the type of the variable is

https://github.com/cms-sw/cmssw/blob/c83ecb127f70592a95b0544c2e7e2404c9d7e440/SimCalorimetry/HGCalSimProducers/interface/HGCDigitizerBase.h#L122

and the allocation size in the valgrind report is 12 bytes (sizeof(float)*3 == 12), it seems reasonable that the value of cell.thickness - 1 is 3 (i.e. cell.thickness == 4), which would be 1 beyond the last valid index of a std::vector<float> holding 3 values.
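The arithmetic in this argument can be checked in isolation. The following is only a sketch, not the digitizer code: `noiseIndexInRange` and `floatsInBlock` are hypothetical helpers, and the three values used to exercise them are simply the ones printed from HGCAL_noise_fC above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CMSSW): true if `thickness - 1`
// is a valid index into `noise`.
bool noiseIndexInRange(const std::vector<float>& noise, int thickness) {
  return thickness >= 1 && static_cast<std::size_t>(thickness - 1) < noise.size();
}

// Number of floats that fit in a heap block of `bytes` bytes.
constexpr std::size_t floatsInBlock(std::size_t bytes) { return bytes / sizeof(float); }
```

With a 12-byte block holding 3 floats, a thickness of 3 is the largest value that indexes inside the vector; a thickness of 4 reads index 3, which is the element just past the end.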

Dr15Jones avatar Apr 28 '25 19:04 Dr15Jones

I took a look at the possible origin of the first group of reports, the accesses to uninitialized data. Each reported line has code containing

tdcForToAOnset[waferThickness - 1]

where

int waferThickness = getCellThickness(geom, id); 

and

auto tdcForToAOnset = theDigitizer_->tdcForToAOnset(); 

and tdcForToAOnset is a std::array<float,3> that lives on the stack. I added the following assert in strategic places in the code

assert(waferThickness < static_cast<int>(tdcForToAOnset.size()+1));

and that assert failed. So it looks like this is also a problem stemming from the thickness being larger than the size of the structure.

Dr15Jones avatar Apr 29 '25 15:04 Dr15Jones

assign SimCalorimetry/HGCalSimProducers

Dr15Jones avatar Apr 29 '25 15:04 Dr15Jones

New categories assigned: upgrade

@Moanwar,@srimanob,@subirsarkar you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild avatar Apr 29 '25 15:04 cmsbuild

To aid in further debugging, I added a printout after the call to

int waferThickness = getCellThickness(geom, id); 

that fires when waferThickness > 3 and prints the id. The one I got was

waferThickness 4 id 2349896814

Dr15Jones avatar Apr 29 '25 15:04 Dr15Jones

Thanks for this. I'm investigating, but it's taking a bit long. I hope to have news within the next week.

pfs avatar May 07 '25 06:05 pfs

tagging @jbsauvan as it may also impact L1 emulation

pfs avatar May 07 '25 06:05 pfs

Related to the first group, we are getting hits in both ASAN and UBSAN for wf 34034.0:

ASAN:

==2248829==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x14556f47910c at pc 0x145511e74614 bp 0x14556f478050 sp 0x14556f478048
READ of size 4 at 0x14556f47910c thread T3
    #0 0x145511e74613 in HGCDigitizer::accumulate(edm::Handle<std::vector<PCaloHit, std::allocator<PCaloHit> > > const&, int, HGCalGeometry const*, CLHEP::HepRandomEngine*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so+0x8d613)
    #1 0x145511e74f2a in HGCDigitizer::accumulate(edm::Event const&, edm::EventSetup const&, CLHEP::HepRandomEngine*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so+0x8df2a)
    #2 0x14551332f956 in edm::MixingModule::accumulateEvent(edm::Event const&, edm::EventSetup const&) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so+0x12f956)
    #3 0x14551332faed in edm::MixingModule::addSignals(edm::Event const&, edm::EventSetup const&) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so+0x12faed)
    #4 0x1455130b9b02 in edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/libMixingBase.so+0x58b02)
    #5 0x1455cec64055 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/libFWCoreFramework.so+0xa64055)

Address 0x14556f47910c is located in stack of thread T3 at offset 3964 in frame
    #0 0x145511e6da6f in HGCDigitizer::accumulate(edm::Handle<std::vector<PCaloHit, std::allocator<PCaloHit> > > const&, int, HGCalGeometry const*, CLHEP::HepRandomEngine*) (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so+0x86a6f)

  This frame has 139 object(s):
    [48, 49) '<unknown>'
    [64, 65) '<unknown>'
    [80, 81) '<unknown>'
    [96, 97) '<unknown>'
    [112, 113) '<unknown>'
    [128, 129) '<unknown>'
    [144, 148) 'id' (line 560)
    [160, 164) '<unknown>'
    [176, 180) '<unknown>'
    [192, 196) 'id' (line 577)
    [208, 212) '<unknown>'
    [224, 228) '<unknown>'
    [240, 248) 'simHitIt' (line 583)
    [272, 280) 'findPos' (line 621)
    [304, 312) '<unknown>'
    [336, 344) 'insertedPos' (line 627)
    [368, 376) '<unknown>'
    [400, 408) '<unknown>'
    [432, 440) '<unknown>'
    [464, 472) '<unknown>'
    [496, 504) 'step' (line 641)
    [528, 536) '<unknown>'
    [560, 568) '<unknown>'
    [592, 600) '<unknown>'
    [624, 632) '<unknown>'
    [656, 664) 'stepEnd' (line 651)
    [688, 696) '<unknown>'
    [720, 728) '<unknown>'
    [752, 760) '<unknown>'
    [784, 792) '<unknown>'
    [816, 824) '<unknown>'
    [848, 856) '<unknown>'
    [880, 888) '<unknown>'
    [912, 920) '<unknown>'
    [944, 952) '__it'
    [976, 984) '<unknown>'
    [1008, 1016) '<unknown>'
    [1040, 1048) '<unknown>'
    [1072, 1080) '<unknown>'
    [1104, 1112) '<unknown>'
    [1136, 1144) '<unknown>'
    [1168, 1176) '<unknown>'
    [1200, 1208) '<unknown>'
    [1232, 1240) '<unknown>'
    [1264, 1272) '<unknown>'
    [1296, 1304) '<unknown>'
    [1328, 1336) '<unknown>'
    [1360, 1368) '<unknown>'
    [1392, 1400) '<unknown>'
    [1424, 1432) '<unknown>'
    [1456, 1464) '<unknown>'
    [1488, 1496) '<unknown>'
    [1520, 1528) '<unknown>'
    [1552, 1560) '<unknown>'
    [1584, 1592) '<unknown>'
    [1616, 1624) '<unknown>'
    [1648, 1656) '<unknown>'
    [1680, 1688) '<unknown>'
    [1712, 1720) '<unknown>'
    [1744, 1752) '<unknown>'
    [1776, 1784) '<unknown>'
    [1808, 1816) '<unknown>'
    [1840, 1848) '<unknown>'
    [1872, 1880) '<unknown>'
    [1904, 1912) '<unknown>'
    [1936, 1944) '<unknown>'
    [1968, 1976) '<unknown>'
    [2000, 2008) '<unknown>'
    [2032, 2040) '<unknown>'
    [2064, 2072) '<unknown>'
    [2096, 2104) '<unknown>'
    [2128, 2136) '<unknown>'
    [2160, 2168) '<unknown>'
    [2192, 2200) '<unknown>'
    [2224, 2232) '<unknown>'
    [2256, 2264) '<unknown>'
    [2288, 2296) '<unknown>'
    [2320, 2328) '<unknown>'
    [2352, 2360) '<unknown>'
    [2384, 2392) '<unknown>'
    [2416, 2424) '<unknown>'
    [2448, 2456) '<unknown>'
    [2480, 2488) '<unknown>'
    [2512, 2520) '<unknown>'
    [2544, 2552) '<unknown>'
    [2576, 2584) '<unknown>'
    [2608, 2616) '<unknown>'
    [2640, 2648) '<unknown>'
    [2672, 2680) '<unknown>'
    [2704, 2712) '<unknown>'
    [2736, 2744) '__i'
    [2768, 2776) '<unknown>'
    [2800, 2808) '__next'
    [2832, 2840) '__pos'
    [2864, 2872) '__it'
    [2896, 2904) '<unknown>'
    [2928, 2936) '<unknown>'
    [2960, 2968) '<unknown>'
    [2992, 3000) '<unknown>'
    [3024, 3032) '<unknown>'
    [3056, 3064) '<unknown>'
    [3088, 3096) '<unknown>'
    [3120, 3128) '<unknown>'
    [3152, 3160) '<unknown>'
    [3184, 3192) '<unknown>'
    [3216, 3224) '<unknown>'
    [3248, 3256) '__middle'
    [3280, 3288) '<unknown>'
    [3312, 3320) '<unknown>'
    [3344, 3352) '<unknown>'
    [3376, 3384) '<unknown>'
    [3408, 3416) '<unknown>'
    [3440, 3448) '<unknown>'
    [3472, 3480) '<unknown>'
    [3504, 3512) '<unknown>'
    [3536, 3544) '<unknown>'
    [3568, 3576) '<unknown>'
    [3600, 3608) '<unknown>'
    [3632, 3640) '<unknown>'
    [3664, 3672) '<unknown>'
    [3696, 3704) '__middle'
    [3728, 3736) '<unknown>'
    [3760, 3768) '<unknown>'
    [3792, 3800) '<unknown>'
    [3824, 3832) '<unknown>'
    [3856, 3864) '<unknown>'
    [3888, 3896) '<unknown>'
    [3920, 3928) '<unknown>'
    [3952, 3964) 'tdcForToAOnset' (line 550) <== Memory access at offset 3964 overflows this variable
    [3984, 3996) '__val'
    [4016, 4032) '<unknown>'
    [4048, 4064) '<unknown>'
    [4080, 4096) '<unknown>'
    [4112, 4128) '__node'
    [4144, 4160) '<unknown>'
    [4176, 4192) '<unknown>'
    [4208, 4224) '<unknown>'
    [4240, 4264) 'hitRefs' (line 556)
    [4304, 4440) '<unknown>'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T3 created by T0 here:
    #0 0x1455cf04a136 in __interceptor_pthread_create ../../../../libsanitizer/asan/asan_interceptors.cpp:207
    #1 0x1455cf6e75ff in tbb::detail::r1::rml::internal::thread_monitor::launch(void* (*)(void*), void*, unsigned long) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2022.0.0-30a7b18d020d40d676b453f939c55d06/tbb-v2022.0.0/src/tbb/rml_thread_monitor.h:208
    #2 0x1455cf6e75ff in tbb::detail::r1::rml::private_worker::wake_or_launch() /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2022.0.0-30a7b18d020d40d676b453f939c55d06/tbb-v2022.0.0/src/tbb/private_server.cpp:305
    #3 0x1455cf6e75ff in tbb::detail::r1::rml::private_server::wake_some(int) /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/tbb/v2022.0.0-30a7b18d020d40d676b453f939c55d06/tbb-v2022.0.0/src/tbb/private_server.cpp:412

SUMMARY: AddressSanitizer: stack-buffer-overflow (/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_ASAN_X_2025-05-09-2300/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so+0x8d613) in HGCDigitizer::accumulate(edm::Handle<std::vector<PCaloHit, std::allocator<PCaloHit> > > const&, int, HGCalGeometry const*, CLHEP::HepRandomEngine*)
Shadow bytes around the buggy address:
  0x028b2de871d0: 00 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2
  0x028b2de871e0: 00 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2
  0x028b2de871f0: 00 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2 00 f2 f2 f2
  0x028b2de87200: 00 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2
  0x028b2de87210: f8 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2
=>0x028b2de87220: 00[04]f2 f2 f8 f8 f2 f2 00 00 f2 f2 00 00 f2 f2
  0x028b2de87230: f8 f8 f2 f2 f8 f8 f2 f2 00 00 f2 f2 f8 f8 f2 f2
  0x028b2de87240: f8 f8 f2 f2 00 00 00 f2 f2 f2 f2 f2 f8 f8 f8 f8
  0x028b2de87250: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f3 f3 f3
  0x028b2de87260: f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x028b2de87270: 00 00 f1 f1 f1 f1 00 00 f2 f2 00 00 f2 f2 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==2248829==ABORTING
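
For anyone reading the shadow dump, here is a sketch of how the faulting address maps to the bracketed shadow byte. It assumes the default ASan mapping on x86_64 Linux, shadow = (addr >> 3) + 0x7fff8000; the helper names are made up for illustration and are not an ASan API.

```cpp
#include <cassert>
#include <cstdint>

// Default ASan shadow offset on x86_64 Linux (assumption: default mapping in use).
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

// Address of the shadow byte describing the 8-byte granule that contains `addr`.
constexpr std::uint64_t shadowAddress(std::uint64_t addr) { return (addr >> 3) + kShadowOffset; }

// Position of `addr` inside its 8-byte granule.
constexpr unsigned granuleOffset(std::uint64_t addr) { return static_cast<unsigned>(addr & 7); }

// Shadow value 0: whole granule addressable. Value k in 1..7: only the first
// k bytes are addressable. Anything else (f1, f2, f8, ...) is poisoned.
constexpr bool byteAddressable(unsigned shadowValue, unsigned offsetInGranule) {
  return shadowValue == 0 || (shadowValue < 8 && offsetInGranule < shadowValue);
}
```

Under that mapping 0x14556f47910c lands on shadow byte 0x028b2de87221, the `[04]` in the marked row. The `00` before it and this `04` together cover the 12 bytes of tdcForToAOnset, and the read starts at offset 4 in the second granule, i.e. the first byte past the array.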

UBSAN:

/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/include/c++/12.3.1/bits/stl_iterator.h:1154:45: runtime error: applying non-zero offset 18446744073709551608 to null pointer
    #0 0x1475d75809ec in __gnu_cxx::__normal_iterator<std::pair<float, float>*, std::vector<std::pair<float, float>, std::allocator<std::pair<float, float> > > >::operator-(long) const /data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/include/c++/12.3.1/bits/stl_iterator.h:1154
    #1 0x1475d75809ec in std::vector<std::pair<float, float>, std::allocator<std::pair<float, float> > >::back() /data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/include/c++/12.3.1/bits/stl_vector.h:1231
    #2 0x1475d75809ec in HGCDigitizer::accumulate(edm::Handle<std::vector<PCaloHit, std::allocator<PCaloHit> > > const&, int, HGCalGeometry const*, CLHEP::HepRandomEngine*) src/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc:671
    #3 0x1475d758530f in HGCDigitizer::accumulate(edm::Event const&, edm::EventSetup const&, CLHEP::HepRandomEngine*) src/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc:371
    #4 0x1475dade88ae in edm::MixingModule::accumulateEvent(edm::Event const&, edm::EventSetup const&) src/SimGeneral/MixingModule/plugins/MixingModule.cc:686
    #5 0x1475dade8bb4 in edm::MixingModule::addSignals(edm::Event const&, edm::EventSetup const&) src/SimGeneral/MixingModule/plugins/MixingModule.cc:367
    #6 0x1475da0eff2b in edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) src/Mixing/Base/src/BMixingModule.cc:295
    #7 0x1477a545b9cf in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) src/FWCore/Framework/src/stream/EDProducerAdaptorBase.cc:83

A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Sat May 10 11:39:42 CEST 2025
Thread 1 (Thread 0x1477a09fe580 (LWP 917425) "cmsRun"):
#3  0x00001477938149c7 in (anonymous namespace)::sig_dostack_then_abort (sig=<optimized out>) at src/FWCore/Services/plugins/InitRootHandlers.cc:548
#4  <signal handler called>
#5  0x00001475d757738c in HGCDigitizer::accumulate (this=this@entry=0x1475f2d77510, hits=..., bxCrossing=bxCrossing@entry=0, geom=0x14751fea87c0, hre=hre@entry=0x147565c40a00) at src/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc:671
#6  0x00001475d7585310 in HGCDigitizer::accumulate (this=0x1475f2d77510, e=..., eventSetup=..., hre=0x147565c40a00) at src/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc:371
#7  0x00001475dade88af in edm::MixingModule::accumulateEvent (this=this@entry=0x1475f1e9f200, event=..., setup=...) at src/SimGeneral/MixingModule/plugins/MixingModule.cc:686
#8  0x00001475dade8bb5 in edm::MixingModule::addSignals (this=0x1475f1e9f200, e=..., setup=...) at src/SimGeneral/MixingModule/plugins/MixingModule.cc:367
#9  0x00001475da0eff2c in edm::BMixingModule::produce (this=0x1475f1e9f200, e=..., setup=...) at src/Mixing/Base/src/BMixingModule.cc:295
#10 0x00001477a545b9d0 in edm::stream::EDProducerAdaptorBase::doEvent (this=this@entry=0x1475f1eb1840, info=..., act=0x147797431e10, mcc=mcc@entry=0x14757a084fe8) at src/FWCore/Framework/src/stream/EDProducerAdaptorBase.cc:83
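
The UBSAN message is what one would expect from calling back() on an empty std::vector<std::pair<float, float>>: end() is a null pointer and back() steps it back by one 8-byte element, which as an unsigned offset is 2^64 - 8 = 18446744073709551608. A guarded access could look like the sketch below. It assumes hitRefs_bx0 behaves like a map whose operator[] default-inserts an empty vector for an unknown id (not verified against the source), and `lastFireTDC` and `HitRefs` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

using HitRefs = std::unordered_map<std::uint32_t, std::vector<std::pair<float, float>>>;

// Hypothetical helper: returns the `second` of the last entry for `id`,
// or std::nullopt when there is no entry or the stored vector is empty.
// Unlike operator[], find() does not insert a default element.
std::optional<float> lastFireTDC(const HitRefs& refs, std::uint32_t id) {
  auto it = refs.find(id);
  if (it == refs.end() || it->second.empty())
    return std::nullopt;
  return it->second.back().second;
}
```

A caller would then skip the ToA handling when no value comes back, instead of dereferencing back() unconditionally.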

dan131riley avatar May 12 '25 11:05 dan131riley

@pfs how goes the fix?

Dr15Jones avatar May 28 '25 22:05 Dr15Jones

I have already identified a few things upstream in the DIGI producer. So far it is all related to hard-coded reliance on vectors / arrays of size 3, whereas the final geometry introduces a 4th sensor type, and that leads to memory corruption. I hope to conclude soon. The current snapshot is here

https://github.com/cms-sw/cmssw/compare/master...CMS-HGCAL:cmssw:dev/fix_v19_digistep?expand=1
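
One way to make this class of bug fail loudly instead of corrupting memory is to validate per-thickness tables against the number of sensor types when the configuration is read. This is only a sketch under assumptions and not the change in the linked branch: `kNumSensorTypes` and `checkedPerThickness` are hypothetical names, and the value 4 is taken from the "4th sensor type" mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption for illustration: the geometry defines four sensor types.
constexpr std::size_t kNumSensorTypes = 4;

// Hypothetical helper: returns `values` if it has at least one entry per
// sensor type, otherwise throws with a message naming the parameter.
std::vector<float> checkedPerThickness(const std::string& name, std::vector<float> values) {
  if (values.size() < kNumSensorTypes) {
    throw std::invalid_argument(name + " has " + std::to_string(values.size()) +
                                " entries, expected at least " + std::to_string(kNumSensorTypes));
  }
  return values;
}
```

With a check of this kind, a 3-entry noise_fC would be rejected at construction time rather than being indexed past its end during event processing.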

pfs avatar May 29 '25 17:05 pfs

Crash observed, probably from one of the uninitialized data instances:

A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Wed Jun  4 12:49:13 CEST 2025
Thread 7 (Thread 0x151cd291b700 (LWP 2219283) "cmsRun"):
#2  0x0000151d1fba2d70 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x0000151d244c9204 in __ieee754_exp_fma () from /lib64/libm.so.6
#5  0x0000151d24461b53 in expf64 () from /lib64/libm.so.6
#6  0x0000151d1d974bfc in CLHEP::RandPoissonQ::poissonDeviateSmall (mean=0.00018749999580904841, e=0x151bfb901980) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/clhep/2.4.7.1-d3a3e353d370e701238f7949a0d7909f/clhep-2.4.7.1/Random/src/RandPoissonQ.cc:308
#7  CLHEP::RandPoissonQ::poissonDeviateSmall (e=0x151bfb901980, mean=0.00018749999580904841) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/clhep/2.4.7.1-d3a3e353d370e701238f7949a0d7909f/clhep-2.4.7.1/Random/src/RandPoissonQ.cc:266
#8  0x0000151c544e5a24 in HcalSiPMHitResponse::addPEnoise(CLHEP::HepRandomEngine*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libSimCalorimetryHcalSimAlgos.so
#9  0x0000151c544e8445 in HcalSiPMHitResponse::finalizeHits(CLHEP::HepRandomEngine*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libSimCalorimetryHcalSimAlgos.so
#10 0x0000151c5450f16a in CaloTDigitizer<HcalQIE11DigitizerTraits, CaloTDigitizerQIE1011Run>::run(HcalDataFrameContainer<QIE11DataFrame>&, CLHEP::HepRandomEngine*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libSimCalorimetryHcalSimProducers.so
#11 0x0000151c54516cee in HcalDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libSimCalorimetryHcalSimProducers.so
#12 0x0000151c54508964 in HcalDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libSimCalorimetryHcalSimProducers.so
#13 0x0000151c5476976b in edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so
#14 0x0000151c546ce341 in edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libMixingBase.so
#15 0x0000151d2664f235 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so

Thread 6 (Thread 0x151cd1f1a700 (LWP 2219284) "cmsRun"):
#2  0x0000151d1fba2d70 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
[...]
#26 je_malloc_default (size=<optimized out>) at src/jemalloc.c:2722
#27 0x0000151d25a93339 in fallback_impl<false> (size=16) at src/jemalloc_cpp.cpp:98
#28 0x0000151c544b2a0c in HGCDigitizer::initializeEvent(edm::Event const&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-06-04-1100/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so
#29 0x0000151c5476971b in edm::MixingModule::initializeEvent(edm::Event const&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so
#30 0x0000151c546ce2fb in edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libMixingBase.so
#31 0x0000151d2664f235 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so

Thread 5 (Thread 0x151cd0dff700 (LWP 2219285) "cmsRun"):
#2  0x0000151d1fba6144 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x0000151c54494ad9 in HGCFEElectronics<HGCDataFrame<DetId, HGCSample> >::runShaperWithToT(HGCDataFrame<DetId, HGCSample>&, std::array<float, 15ul>&, std::array<float, 15ul>&, CLHEP::HepRandomEngine*, unsigned int, float, unsigned int, float, int, float, float, std::array<float, 6ul> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-06-04-1100/lib/el8_amd64_gcc12/libSimCalorimetryHGCalSimProducers.so
#5  0x0000151c544c2397 in HGCDigitizerBase::runSimple(std::unique_ptr<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > >, std::default_delete<edm::SortedCollection<HGCDataFrame<DetId, HGCSample>, edm::StrictWeakOrdering<HGCDataFrame<DetId, HGCSample> > > > >&, std::unordered_map<unsigned int, hgc_digi::HGCCellInfo, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, hgc_digi::HGCCellInfo> > >&, CaloSubdetectorGeometry const*, std::unordered_set<DetId, std::hash<DetId>, std::equal_to<DetId>, std::allocator<DetId> > const&, CLHEP::HepRandomEngine*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-06-04-1100/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so
#6  0x0000151c544b3e93 in HGCDigitizer::finalizeEvent(edm::Event&, edm::EventSetup const&, CLHEP::HepRandomEngine*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-06-04-1100/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so
#7  0x0000151c544b55c4 in HGCDigiProducer::finalizeEvent(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-06-04-1100/lib/el8_amd64_gcc12/pluginSimCalorimetryHGCalSimProducersPlugins.so
#8  0x0000151c5476976b in edm::MixingModule::finalizeEvent(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginSimGeneralMixingModulePlugins.so
#9  0x0000151c546ce341 in edm::BMixingModule::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libMixingBase.so
#10 0x0000151d2664f235 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libFWCoreFramework.so

Thread 1 (Thread 0x151d25ce1580 (LWP 2218849) "cmsRun"):
#2  0x0000151d1fba2d70 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x0000151cf7576c6d in HcalPulseContainmentCorrection::getCorrection(double) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libCalibCalorimetryHcalAlgos.so
#5  0x0000151cf7a3b9bc in HcaluLUTTPGCoder::update(HcalDbService const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/libCalibCalorimetryHcalTPGAlgos.so
#6  0x0000151cd96542ae in HcalTPGCoderULUT::produce(HcalTPGRecord const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginCalibCalorimetryHcalTPGEventSetup.so
#7  0x0000151cd96533f0 in void edm::SerialTaskQueueChain::actionToRun<edm::eventsetup::CallbackBase<edm::ESProducer, edm::ESProducer::setWhatProduced<HcalTPGCoderULUT, std::shared_ptr<HcalTPGCoder>, HcalTPGRecord, edm::eventsetup::CallbackSimpleDecorator<HcalTPGRecord> >(HcalTPGCoderULUT*, std::shared_ptr<HcalTPGCoder> (HcalTPGCoderULUT::*)(HcalTPGRecord const&), edm::eventsetup::CallbackSimpleDecorator<HcalTPGRecord> const&, edm::es::Label const&)::{lambda(HcalTPGRecord const&)#1}, std::shared_ptr<HcalTPGCoder>, HcalTPGRecord, edm::eventsetup::CallbackSimpleDecorator<HcalTPGRecord> >::makeProduceTask<edm::eventsetup::Callback<edm::ESProducer, edm::ESProducer::setWhatProduced<HcalTPGCoderULUT, std::shared_ptr<HcalTPGCoder>, HcalTPGRecord, edm::eventsetup::CallbackSimpleDecorator<HcalTPGRecord> >(HcalTPGCoderULUT*, std::shared_ptr<HcalTPGCoder> (HcalTPGCoderULUT::*)(HcalTPGRecord const&), edm::eventsetup::CallbackSimpleDecorator<HcalTPGRecord> const&, edm::es::Label const&)::{lambda(HcalTPGRecord const&)#1}, std::shared_ptr<HcalTPGCoder>, HcalTPGRecord, edm::eventsetup::CallbackSimpleDecorator<HcalTPGRecord> >::prefetchAsync(edm::WaitingTaskHolder, edm::eventsetup::EventSetupRecordImpl const*, edm::EventSetupImpl const*, edm::ServiceToken const&, edm::ESParentContext const&)::{lambda(auto:1&&, auto:2&&, auto:3&&, auto:4&&)#1}::operator()<tbb::detail::d2::task_group*&, edm::ServiceWeakToken&, edm::eventsetup::EventSetupRecordImpl const*&, edm::EventSetupImpl const*&>(tbb::detail::d2::task_group*&, edm::ServiceWeakToken&, edm::eventsetup::EventSetupRecordImpl const*&, edm::EventSetupImpl const*&) const::{lambda(HcalTPGRecord const&)#1}>(tbb::detail::d2::task_group*, edm::ServiceWeakToken const&, edm::eventsetup::EventSetupRecordImpl const*, edm::EventSetupImpl const*, bool, tbb::detail::d2::task_group*&)::{lambda(std::__exception_ptr::exception_ptr const*)#1}::operator()(std::__exception_ptr::exception_ptr const*) const::{lambda()#2}&>(tbb::detail::d2::task_group*&) 
() from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02892/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-06-02-1100/lib/el8_amd64_gcc12/pluginCalibCalorimetryHcalTPGEventSetup.so

Current Modules:

Module: MixingModule:mix (crashed)
Module: MixingModule:mix
Module: none
Module: MixingModule:mix

A fatal system signal has occurred: segmentation violation
timeout: the monitored command dumped core

dan131riley avatar Jun 04 '25 14:06 dan131riley

Just to update: we are getting closer to having the PR ready. It is being prepared together with @jbsauvan and @felicepantaleo because it's not only the digitizer that needed fixes but also L1T and Reconstruction.

pfs avatar Jun 04 '25 14:06 pfs

I happened to be running the job under the debugger after building with -g, and I hit the crash in accumulate

HGCDigitizer::accumulate (this=0x7fff1ef67b90, hits=..., bxCrossing=0, geom=0x7ffde61b9540, hre=0x7ffe3dbb9c80) at src/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc:671
671	      float fireTDC = hitRefs_bx0[id].back().second;
(gdb) print id
$1 = 2350910856

so it looks like the id is bad.

Dr15Jones avatar Jun 04 '25 15:06 Dr15Jones

On further inspection, the problem is thickness again as the condition just before that line is

https://github.com/cms-sw/cmssw/blob/6d1c5f898dfb2a195b1df0c20569a38a7848341c/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc#L669

Dr15Jones avatar Jun 04 '25 15:06 Dr15Jones

HGCDigitizer::accumulate (this=0x7fff1ef67b90, hits=..., bxCrossing=0, geom=0x7ffde61b9540, hre=0x7ffe3dbb9c80) at src/SimCalorimetry/HGCalSimProducers/plugins/HGCDigitizer.cc:671
671	      float fireTDC = hitRefs_bx0[id].back().second;
(gdb) print id
$1 = 2350910856

2350910856 == 0x8c200d88, which looks like an address you might get from a dynamic heap allocation.
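
Whether this value is garbage or a well-formed id can be checked by unpacking its bits. The sketch below assumes the detector code occupies the top 4 bits of the 32-bit id and a 2-bit sensor type sits at bits 26-27. These field positions are recalled from DetId / HGCSiliconDetId and have not been checked against the headers, so treat the constants as assumptions; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed field layout (to be checked against DetId.h / HGCSiliconDetId.h).
constexpr unsigned kDetOffset = 28;   // 4-bit detector code
constexpr unsigned kDetMask = 0xF;
constexpr unsigned kTypeOffset = 26;  // 2-bit sensor type
constexpr unsigned kTypeMask = 0x3;

constexpr unsigned detectorOf(std::uint32_t rawId) { return (rawId >> kDetOffset) & kDetMask; }
constexpr unsigned sensorTypeOf(std::uint32_t rawId) { return (rawId >> kTypeOffset) & kTypeMask; }
```

Under those assumptions both ids seen in this thread (2349896814 and 2350910856) unpack to detector code 8 with type 3, i.e. a fourth type index. That would be consistent with the waferThickness of 4 printed earlier, rather than with a stray heap address.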

dan131riley avatar Jun 04 '25 15:06 dan131riley