[amd_w7900] RelVals 29634.x timing out
First occurrence: CMSSW_16_0_X_2025-12-03-1100, where RelVals 29634.402 and 29634.4021 timed out.
Latest occurrence: CMSSW_16_0_X_2025-12-08-1100, with the same RelVals (29634.402 and 29634.4021) timing out.
Stack trace from CMSSW_16_0_X_2025-12-08-1100 (not sure how useful it is):
Thread 32 (Thread 0x153796000700 (LWP 4142639) "cmsRun"):
#0 0x00001537ca907f22 in waitpid () from /lib64/libc.so.6
#1 0x00001537c3e469e8 in edm::service::cmssw_stacktrace_fork() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#2 0x00001537c3e48752 in edm::service::InitRootHandlers::stacktraceHelperThread() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#3 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x153799a513f0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 31 (Thread 0x153795800700 (LWP 4142645) "cmsRun"):
#0 0x00001537ca83922b in ioctl () from /lib64/libc.so.6
#1 0x00001537b5328730 in hsakmt_ioctl () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#2 0x00001537b531f4e3 in hsaKmtWaitOnMultipleEvents_Ext () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#3 0x00001537b52adeb9 in rocr::core::Signal::WaitAnyExceptions(unsigned int, hsa_signal_s const*, hsa_signal_condition_t const*, long const*, long*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#4 0x00001537b528c078 in rocr::core::Runtime::AsyncEventsLoop(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#5 0x00001537b52332cd in rocr::os::ThreadTrampoline(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#6 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#7 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 30 (Thread 0x153691400700 (LWP 4142647) "cmsRun"):
#0 0x00001537ca83922b in ioctl () from /lib64/libc.so.6
#1 0x00001537b5328730 in hsakmt_ioctl () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#2 0x00001537b531f4e3 in hsaKmtWaitOnMultipleEvents_Ext () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#3 0x00001537b528bc73 in rocr::core::Runtime::AsyncEventsLoop(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#4 0x00001537b52332cd in rocr::os::ThreadTrampoline(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libhsa-runtime64.so.1
#5 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#6 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 29 (Thread 0x153664a00700 (LWP 4142648) "cmsRun"):
#0 0x00001537ca908178 in nanosleep () from /lib64/libc.so.6
#1 0x00001537ca90807e in sleep () from /lib64/libc.so.6
#2 0x00001537c3e4687c in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#3 <signal handler called>
#4 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#5 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#6 0x00001537bf66139d in hip::MemoryPool::AllocateMemory(unsigned long, hip::Stream*, void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#7 0x00001537bf64c3ed in hip::hipMallocAsync(void**, unsigned long, ihipStream_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#8 0x000015364be2aad2 in alpaka::BufUniformCudaHipRt<alpaka::ApiHipRt, TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int> alpaka::trait::AsyncBufAlloc<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int, alpaka::DevUniformCudaHipRt<alpaka::ApiHipRt>, void>::allocAsyncBuf<alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#9 0x000015364be2a7c8 in auto alpaka::allocAsyncBuf<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, unsigned int, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#10 0x000015364be2849c in CLUEAlgoAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, HGCalSiliconTilesConstants, 96>::makeClustersCMSSW(unsigned int, float const*, float const*, int const*, float const*, float const*, unsigned int const*, float*, float*, unsigned int*, int*, unsigned char*, unsigned int*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#11 0x000015364be2808b in alpaka_rocm_async::HGCalLayerClustersAlgoWrapper::run(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>&, unsigned int, float, float, float, HGCalSoARecHitsLayout<128ul, false>::ConstViewTemplateFreeParams<128ul, false, true, true>, HGCalSoARecHitsExtraLayout<128ul, false>::ViewTemplateFreeParams<128ul, false, true, true>) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#12 0x000015364be4fb7a in alpaka_rocm_async::HGCalSoARecHitsLayerClustersProducer::produce(alpaka_rocm_async::device::Event&, alpaka_rocm_async::device::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#13 0x000015364be4d698 in alpaka_rocm_async::stream::EDProducer<>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#14 0x00001537cb85edf2 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#15 0x00001537cb845cc7 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#16 0x00001537cb7d09b2 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#17 0x00001537cb7d0ec7 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >::execute() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#18 0x00001537cbc3fe64 in tbb::detail::d2::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#19 0x00001537cb9d91fb in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=<optimized out>, waiter=..., this=0x1537c87b7a80) at src/tbb/task_dispatcher.h:344
#20 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x1537c87b7a80) at src/tbb/task_dispatcher.h:487
#21 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at src/tbb/arena.cpp:216
#22 tbb::detail::r1::thread_dispatcher_client::process (td=..., this=<optimized out>) at src/tbb/thread_dispatcher_client.h:41
#23 tbb::detail::r1::thread_dispatcher::process (this=<optimized out>, j=...) at src/tbb/thread_dispatcher.cpp:195
#24 0x00001537cb9ccd25 in tbb::detail::r1::rml::private_worker::run (this=0x1537c5a2d080) at src/tbb/private_server.cpp:271
#25 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x1537c5a2d080) at src/tbb/private_server.cpp:221
#26 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#27 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 28 (Thread 0x153663e00700 (LWP 4142649) "cmsRun"):
#0 0x00001537ca908178 in nanosleep () from /lib64/libc.so.6
#1 0x00001537ca90807e in sleep () from /lib64/libc.so.6
#2 0x00001537c3e4687c in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#3 <signal handler called>
#4 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#5 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#6 0x00001537bf66139d in hip::MemoryPool::AllocateMemory(unsigned long, hip::Stream*, void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#7 0x00001537bf64c3ed in hip::hipMallocAsync(void**, unsigned long, ihipStream_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#8 0x000015364be2aad2 in alpaka::BufUniformCudaHipRt<alpaka::ApiHipRt, TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int> alpaka::trait::AsyncBufAlloc<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int, alpaka::DevUniformCudaHipRt<alpaka::ApiHipRt>, void>::allocAsyncBuf<alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#9 0x000015364be2a7c8 in auto alpaka::allocAsyncBuf<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, unsigned int, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#10 0x000015364be2849c in CLUEAlgoAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, HGCalSiliconTilesConstants, 96>::makeClustersCMSSW(unsigned int, float const*, float const*, int const*, float const*, float const*, unsigned int const*, float*, float*, unsigned int*, int*, unsigned char*, unsigned int*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#11 0x000015364be2808b in alpaka_rocm_async::HGCalLayerClustersAlgoWrapper::run(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>&, unsigned int, float, float, float, HGCalSoARecHitsLayout<128ul, false>::ConstViewTemplateFreeParams<128ul, false, true, true>, HGCalSoARecHitsExtraLayout<128ul, false>::ViewTemplateFreeParams<128ul, false, true, true>) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#12 0x000015364be4fb7a in alpaka_rocm_async::HGCalSoARecHitsLayerClustersProducer::produce(alpaka_rocm_async::device::Event&, alpaka_rocm_async::device::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#13 0x000015364be4d698 in alpaka_rocm_async::stream::EDProducer<>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#14 0x00001537cb85edf2 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#15 0x00001537cb845cc7 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#16 0x00001537cb7d09b2 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#17 0x00001537cb7d0ec7 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >::execute() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#18 0x00001537cbc3fe64 in tbb::detail::d2::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#19 0x00001537cb9d91fb in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=<optimized out>, waiter=..., this=0x1537c87b7980) at src/tbb/task_dispatcher.h:344
#20 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x1537c87b7980) at src/tbb/task_dispatcher.h:487
#21 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at src/tbb/arena.cpp:216
#22 tbb::detail::r1::thread_dispatcher_client::process (td=..., this=<optimized out>) at src/tbb/thread_dispatcher_client.h:41
#23 tbb::detail::r1::thread_dispatcher::process (this=<optimized out>, j=...) at src/tbb/thread_dispatcher.cpp:195
#24 0x00001537cb9ccd25 in tbb::detail::r1::rml::private_worker::run (this=0x1537c5a2d100) at src/tbb/private_server.cpp:271
#25 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x1537c5a2d100) at src/tbb/private_server.cpp:221
#26 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#27 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 27 (Thread 0x153662e00700 (LWP 4142650) "cmsRun"):
#0 0x00001537ca908178 in nanosleep () from /lib64/libc.so.6
#1 0x00001537ca90807e in sleep () from /lib64/libc.so.6
#2 0x00001537c3e4687c in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#3 <signal handler called>
#4 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#5 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#6 0x00001537bf48bcba in hip::Device::NullStream(bool) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#7 0x00001537bf491128 in std::_Function_handler<amd::HostQueue& (), hip::MemoryPool::MemoryPool(hip::Device*, hipMemPoolProps const*, bool)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#8 0x00001537bf7bb021 in amd::VmHeap::CommitMemory(void*, unsigned long) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#9 0x00001537bf7bb289 in amd::VmHeap::MapPhysMemory(unsigned long, unsigned long) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#10 0x00001537bf7bb45f in amd::VmHeap::AllocBlock(unsigned long) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#11 0x00001537bf7bbff1 in amd::VmHeap::Alloc(unsigned long) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#12 0x00001537bf7bc2a8 in amd::VmHeapArray::Alloc(unsigned long) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#13 0x00001537bf661799 in hip::MemoryPool::AllocateMemory(unsigned long, hip::Stream*, void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#14 0x00001537bf64c3ed in hip::hipMallocAsync(void**, unsigned long, ihipStream_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#15 0x000015364be2aad2 in alpaka::BufUniformCudaHipRt<alpaka::ApiHipRt, TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int> alpaka::trait::AsyncBufAlloc<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int, alpaka::DevUniformCudaHipRt<alpaka::ApiHipRt>, void>::allocAsyncBuf<alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#16 0x000015364be2a7c8 in auto alpaka::allocAsyncBuf<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, unsigned int, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#17 0x000015364be2849c in CLUEAlgoAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, HGCalSiliconTilesConstants, 96>::makeClustersCMSSW(unsigned int, float const*, float const*, int const*, float const*, float const*, unsigned int const*, float*, float*, unsigned int*, int*, unsigned char*, unsigned int*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#18 0x000015364be2808b in alpaka_rocm_async::HGCalLayerClustersAlgoWrapper::run(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>&, unsigned int, float, float, float, HGCalSoARecHitsLayout<128ul, false>::ConstViewTemplateFreeParams<128ul, false, true, true>, HGCalSoARecHitsExtraLayout<128ul, false>::ViewTemplateFreeParams<128ul, false, true, true>) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#19 0x000015364be4fb7a in alpaka_rocm_async::HGCalSoARecHitsLayerClustersProducer::produce(alpaka_rocm_async::device::Event&, alpaka_rocm_async::device::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#20 0x000015364be4d698 in alpaka_rocm_async::stream::EDProducer<>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#21 0x00001537cb85edf2 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#22 0x00001537cb845cc7 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#23 0x00001537cb7d09b2 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#24 0x00001537cb7d0ec7 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >::execute() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#25 0x00001537cbc3fe64 in tbb::detail::d2::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#26 0x00001537cb9d91fb in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=<optimized out>, waiter=..., this=0x1537c87b7b80) at src/tbb/task_dispatcher.h:344
#27 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x1537c87b7b80) at src/tbb/task_dispatcher.h:487
#28 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at src/tbb/arena.cpp:216
#29 tbb::detail::r1::thread_dispatcher_client::process (td=..., this=<optimized out>) at src/tbb/thread_dispatcher_client.h:41
#30 tbb::detail::r1::thread_dispatcher::process (this=<optimized out>, j=...) at src/tbb/thread_dispatcher.cpp:195
#31 0x00001537cb9ccd25 in tbb::detail::r1::rml::private_worker::run (this=0x1537c5a2d000) at src/tbb/private_server.cpp:271
#32 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x1537c5a2d000) at src/tbb/private_server.cpp:221
#33 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#34 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 26 (Thread 0x153661a00700 (LWP 4142651) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 25 (Thread 0x153661600700 (LWP 4142652) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 24 (Thread 0x153661200700 (LWP 4142653) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 23 (Thread 0x153660e00700 (LWP 4142654) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 22 (Thread 0x153660a00700 (LWP 4142655) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 21 (Thread 0x153660600700 (LWP 4142656) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 20 (Thread 0x153660200700 (LWP 4142657) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 19 (Thread 0x15365fe00700 (LWP 4142658) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 18 (Thread 0x15365fa00700 (LWP 4142659) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 17 (Thread 0x15365f600700 (LWP 4142660) "cmsRun"):
#0 0x00001537ca93f487 in epoll_wait () from /lib64/libc.so.6
#1 0x00001537c3d29027 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c3d24ed5 in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#3 0x00001537c3d2e7c8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 16 (Thread 0x15365f200700 (LWP 4142661) "cmsRun"):
#0 0x00001537ca908178 in nanosleep () from /lib64/libc.so.6
#1 0x00001537c3d2f10d in XrdSysTimer::Wait(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdUtils.so.3
#2 0x00001537c2b448da in XrdCl::TaskManager::RunTasks() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#3 0x00001537c2b44a69 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x15365ee00700 (LWP 4142662) "cmsRun"):
#0 0x00001537c9c10a46 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1 0x00001537c9c10b38 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2 0x00001537c2bd7396 in XrdCl::JobManager::RunJobs() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#3 0x00001537c2bd7449 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 14 (Thread 0x15365ea00700 (LWP 4142663) "cmsRun"):
#0 0x00001537c9c10a46 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1 0x00001537c9c10b38 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2 0x00001537c2bd7396 in XrdCl::JobManager::RunJobs() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#3 0x00001537c2bd7449 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 13 (Thread 0x15365e600700 (LWP 4142664) "cmsRun"):
#0 0x00001537c9c10a46 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1 0x00001537c9c10b38 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2 0x00001537c2bd7396 in XrdCl::JobManager::RunJobs() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#3 0x00001537c2bd7449 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libXrdCl.so.3
#4 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#5 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x1535e2e00700 (LWP 4142665) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015367fb71a96 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tsl::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#2 0x000015367fb720c3 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#3 0x000015367fb6f708 in std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#4 0x000015366ecbc2f1 in tsl::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_framework.so.2
#5 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#6 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x1535e2a00700 (LWP 4142666) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015367fb71a96 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tsl::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#2 0x000015367fb720c3 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#3 0x000015367fb6f708 in std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#4 0x000015366ecbc2f1 in tsl::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_framework.so.2
#5 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#6 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x1535e1e00700 (LWP 4142667) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015367fb71a96 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tsl::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#2 0x000015367fb720c3 in Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#3 0x000015367fb6f708 in std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_cc.so.2
#4 0x000015366ecbc2f1 in tsl::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libtensorflow_framework.so.2
#5 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#6 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x1535e1600700 (LWP 4142670) "edm async pool"):
#0 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x00001537bf481b0d in hip::Device::ReleaseFreedMemory() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#3 0x00001537bf49588b in hip::hipEventSynchronize(ihipEvent_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#4 0x000015366bbf1a76 in std::_Function_handler<void (), edm::impl::WaitingThread::run<alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}>(edm::WaitingTaskWithArenaHolder, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}&&, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}&&, std::shared_ptr<edm::impl::WaitingThread>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libHeterogeneousCoreAlpakaCoreROCmAsync.so
#5 0x00001537cbc41335 in edm::impl::WaitingThread::threadLoop() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#6 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x15352e76ce00) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#7 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#8 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x153522600700 (LWP 4142671) "edm async pool"):
#0 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x00001537bf65efda in hip::MemoryPool::ReleaseFreedMemory() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#3 0x00001537bf481b39 in hip::Device::ReleaseFreedMemory() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#4 0x00001537bf49588b in hip::hipEventSynchronize(ihipEvent_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#5 0x000015366bbf1a76 in std::_Function_handler<void (), edm::impl::WaitingThread::run<alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}>(edm::WaitingTaskWithArenaHolder, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}&&, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}&&, std::shared_ptr<edm::impl::WaitingThread>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libHeterogeneousCoreAlpakaCoreROCmAsync.so
#6 0x00001537cbc41335 in edm::impl::WaitingThread::threadLoop() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#7 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x1535500a8200) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#8 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#9 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x1534d7600700 (LWP 4142672) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015366acf7e20 in alpaka::core::CallbackThread::startWorkerThread()::{lambda()#1}::operator()() const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalTrackerSiPixelRecHitsPluginsPortableROCmAsync.so
#2 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x153690a01020) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#3 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#4 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x153356200700 (LWP 4142673) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015366acf7e20 in alpaka::core::CallbackThread::startWorkerThread()::{lambda()#1}::operator()() const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalTrackerSiPixelRecHitsPluginsPortableROCmAsync.so
#2 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x153690a01010) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#3 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#4 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x15334a400700 (LWP 4142674) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015366acf7e20 in alpaka::core::CallbackThread::startWorkerThread()::{lambda()#1}::operator()() const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalTrackerSiPixelRecHitsPluginsPortableROCmAsync.so
#2 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x153690a01040) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#3 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#4 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x153349c00700 (LWP 4142675) "edm async pool"):
#0 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x00001537bf481b0d in hip::Device::ReleaseFreedMemory() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#3 0x00001537bf49588b in hip::hipEventSynchronize(ihipEvent_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#4 0x000015366bbf1a76 in std::_Function_handler<void (), edm::impl::WaitingThread::run<alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}>(edm::WaitingTaskWithArenaHolder, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}&&, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}&&, std::shared_ptr<edm::impl::WaitingThread>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libHeterogeneousCoreAlpakaCoreROCmAsync.so
#5 0x00001537cbc41335 in edm::impl::WaitingThread::threadLoop() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#6 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x15357756b360) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#7 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#8 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x153348a00700 (LWP 4142676) "cmsRun"):
#0 0x00001537c9c0e371 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000015366acf7e20 in alpaka::core::CallbackThread::startWorkerThread()::{lambda()#1}::operator()() const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalTrackerSiPixelRecHitsPluginsPortableROCmAsync.so
#2 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x153690a01050) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#3 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#4 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x153347c00700 (LWP 4142677) "edm async pool"):
#0 0x00001537c9c114cd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x00001537bf481b0d in hip::Device::ReleaseFreedMemory() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#3 0x00001537bf49588b in hip::hipEventSynchronize(ihipEvent_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#4 0x000015366bbf1a76 in std::_Function_handler<void (), edm::impl::WaitingThread::run<alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}>(edm::WaitingTaskWithArenaHolder, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#1}&&, alpaka_rocm_async::EDMetadata::enqueueCallback(edm::WaitingTaskWithArenaHolder)::{lambda()#2}&&, std::shared_ptr<edm::impl::WaitingThread>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libHeterogeneousCoreAlpakaCoreROCmAsync.so
#5 0x00001537cbc41335 in edm::impl::WaitingThread::threadLoop() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#6 0x00001537cace2204 in std::execute_native_thread_routine (__p=0x1534df2cd620) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#7 0x00001537c9c081ca in start_thread () from /lib64/libpthread.so.0
#8 0x00001537ca8398d3 in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x1537cb943580 (LWP 4142623) "cmsRun"):
#0 0x00001537ca932bb1 in poll () from /lib64/libc.so.6
#1 0x00001537c3e4897e in edm::service::InitRootHandlers::stacktraceFromThread() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#2 0x00001537c3e48b83 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginFWCoreServicesPlugins.so
#3 <signal handler called>
#4 0x00001537c9c114cb in __lll_lock_wait () from /lib64/libpthread.so.0
#5 0x00001537c9c0ab94 in pthread_mutex_lock () from /lib64/libpthread.so.0
#6 0x00001537bf66139d in hip::MemoryPool::AllocateMemory(unsigned long, hip::Stream*, void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#7 0x00001537bf64c3ed in hip::hipMallocAsync(void**, unsigned long, ihipStream_t*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/external/el8_amd64_gcc13/lib/libamdhip64.so.7
#8 0x000015364be2aad2 in alpaka::BufUniformCudaHipRt<alpaka::ApiHipRt, TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int> alpaka::trait::AsyncBufAlloc<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, std::integral_constant<unsigned long, 1ul>, unsigned int, alpaka::DevUniformCudaHipRt<alpaka::ApiHipRt>, void>::allocAsyncBuf<alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#9 0x000015364be2a7c8 in auto alpaka::allocAsyncBuf<TilesAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, HGCalSiliconTilesConstants>, unsigned int, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false> >(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, alpaka::Vec<std::integral_constant<unsigned long, 1ul>, unsigned int> const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#10 0x000015364be2849c in CLUEAlgoAlpaka<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiHipRt, std::integral_constant<unsigned long, 1ul>, unsigned int>, alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>, HGCalSiliconTilesConstants, 96>::makeClustersCMSSW(unsigned int, float const*, float const*, int const*, float const*, float const*, unsigned int const*, float*, float*, unsigned int*, int*, unsigned char*, unsigned int*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#11 0x000015364be2808b in alpaka_rocm_async::HGCalLayerClustersAlgoWrapper::run(alpaka::uniform_cuda_hip::detail::QueueUniformCudaHipRt<alpaka::ApiHipRt, false>&, unsigned int, float, float, float, HGCalSoARecHitsLayout<128ul, false>::ConstViewTemplateFreeParams<128ul, false, true, true>, HGCalSoARecHitsExtraLayout<128ul, false>::ViewTemplateFreeParams<128ul, false, true, true>) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#12 0x000015364be4fb7a in alpaka_rocm_async::HGCalSoARecHitsLayerClustersProducer::produce(alpaka_rocm_async::device::Event&, alpaka_rocm_async::device::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#13 0x000015364be4d698 in alpaka_rocm_async::stream::EDProducer<>::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/pluginRecoLocalCaloHGCalRecProducersPluginsPortableROCmAsync.so
#14 0x00001537cb85edf2 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#15 0x00001537cb845cc7 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#16 0x00001537cb7d09b2 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#17 0x00001537cb7d0ec7 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::TransitionActionType)1> >::execute() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#18 0x00001537cbc3fe64 in tbb::detail::d2::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreConcurrency.so
#19 0x00001537cb9d12f3 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter> (waiter=..., t=0x1535f2b20900, this=<optimized out>) at src/tbb/task_dispatcher.h:344
#20 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=<optimized out>) at src/tbb/task_dispatcher.h:487
#21 tbb::detail::r1::task_dispatcher::execute_and_wait (t=<optimized out>, wait_ctx=..., w_ctx=...) at src/tbb/task_dispatcher.cpp:169
#22 0x00001537cb74a3df in edm::FinalWaitingTask::wait() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#23 0x00001537cb75f989 in edm::EventProcessor::processRuns() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#24 0x00001537cb759761 in edm::EventProcessor::runToCompletion() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc13/cms/cmssw/CMSSW_16_0_X_2025-12-08-1100/lib/el8_amd64_gcc13/libFWCoreFramework.so
#25 0x000000000040916d in tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#26 0x00001537cb9c08c2 in tbb::detail::r1::task_arena_impl::execute (ta=..., d=...) at src/tbb/arena.cpp:860
#27 0x000000000040abfe in main::{lambda()#1}::operator()() const ()
#28 0x0000000000405268 in main ()
Thread statuses
1️⃣ CMSSW Stacktrace Thread
Thread 32
- Running InitRootHandlers::stacktraceHelperThread()
- Idle, waiting to collect stack traces when signaled.
✔ Normal, not related to the stall.
2️⃣ ROCm Runtime Event Threads
Threads 31 & 30
- In hsaKmtWaitOnMultipleEvents_Ext()
- Part of ROCr asynchronous event dispatcher.
✔ Normal GPU event-loop threads waiting for hardware signals.
3️⃣ TBB Worker Threads Running HGCal GPU Reconstruction (Blocked)
Threads 29, 28, 27
All show:
- Interrupted by CMSSW's stacktrace signal
- Blocked in pthread_mutex_lock() inside hip::MemoryPool::AllocateMemory() → hipMallocAsync()
- Called from HGCal GPU code (CLUEAlgoAlpaka, HGCalLayerClustersAlgoWrapper, etc.)
❗ These workers are stuck trying to allocate GPU memory asynchronously.
4️⃣ EDM Async Pool Threads Waiting on ROCm/HIP (Also Blocked)
Threads 9, 8, 4, 2
All show:
- Blocking in HIP memory or event routines (hipEventSynchronize, Device::ReleaseFreedMemory)
- Running inside edm::impl::WaitingThread (async GPU-task waiters)
❗ These threads are waiting for GPU events or memory release, but the HIP allocator is blocked—so they never progress.
5️⃣ Alpaka Callback Worker Threads
Threads 7, 6, 5, 3
- In pthread_cond_wait()
- Waiting for callbacks from GPU work queues.
✔ Idle; not directly part of the problem.
6️⃣ XRootD I/O Worker Threads
Threads 26–21, 20–17, plus the timer and job threads (16, 15, 14, 13)
- All blocked in either epoll_wait, sem_wait, or nanosleep
- Part of XRootD client network/event handling.
✔ Normal idle behavior.
7️⃣ TensorFlow / Eigen ThreadPool Workers
Threads 12, 11, 10
- All waiting in pthread_cond_wait() inside the Eigen thread pool.
✔ Idle; unrelated.
8️⃣ Main Thread (Critical)
Thread 1
Main event loop is inside:
- HGCal GPU clustering → hipMallocAsync()
- Blocked in pthread_mutex_lock() inside hip::MemoryPool::AllocateMemory()
❗ The main thread is waiting on the same HIP memory-pool mutex as the blocked worker threads.
A new Issue was created by @iarspider.
@Dr15Jones, @ftenchini, @makortel, @mandrenguyen, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.
assign from RecoLocalCalo/HGCalRecProducers
assign heterogeneous
New categories assigned: reconstruction,heterogeneous
@fwyzard,@jfernan2,@makortel,@mandrenguyen,@srimanob you have been requested to review this Pull request/Issue and eventually sign? Thanks
> (no idea how useful it is)
It is useful.
- Threads 30, 31 are in ioctl() within ROCr
- Threads 1, 28, 29 are waiting to lock a mutex via hip::MemoryPool::AllocateMemory()
- Thread 27 is waiting to lock a mutex via hip::Device::NullStream() that is called via hip::MemoryPool::AllocateMemory()
- Threads 2, 4, 8, 9 are waiting to lock a mutex via hip::MemoryPool::ReleaseFreedMemory()
First hypothesis: the threads in ioctl() are holding the lock(s) that the other threads are waiting to acquire. However, other issues (https://github.com/cms-sw/cmssw/issues/49288, https://github.com/cms-sw/cmssw/issues/49464) show threads in ioctl() while the other threads are not stuck on locks, so I'd guess this hypothesis is not correct.
Second hypothesis: we are seeing a (rare?) deadlock in the ROCm runtime. The behavior of thread 27 looks suspicious, but without looking at how the locks are used in hip::MemoryPool() it is hard to say more.
Based on the stack traces the problem does not seem to be in our code. Code allocating device memory in parallel in 4 threads should be ~fine.
- Threads 1, 28, and 29 are waiting on the MemoryPool lock.
- Thread 27 is holding the MemoryPool lock and waiting on the device lock.
- Threads 2, 4, and 9 are waiting on the device lock.
- Thread 8 has the device lock and is waiting on the MemoryPool lock.
Looks like a classic deadlock from inconsistent lock ordering.
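The hold-and-wait reading above can be checked mechanically with a small wait-for graph. The following is only a sketch: `ThreadState` and `findDeadlockCycle` are hypothetical names invented for illustration, the lock labels are plain strings, and the input has to be filled in by hand from the stack traces. It is not a tool that exists in CMSSW or ROCm.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One thread's lock state as read off a stack trace (hypothetical type).
// An empty string means "none".
struct ThreadState {
  int thread;
  std::string holds;     // lock this thread currently owns
  std::string waitsFor;  // lock this thread is blocked on
};

// Returns the threads that form a hold-and-wait cycle, or an empty vector.
std::vector<int> findDeadlockCycle(const std::vector<ThreadState>& states) {
  std::map<std::string, int> owner;    // lock -> thread holding it
  std::map<int, std::string> waiting;  // thread -> lock it is blocked on
  for (const auto& s : states) {
    if (!s.holds.empty()) owner[s.holds] = s.thread;
    if (!s.waitsFor.empty()) waiting[s.thread] = s.waitsFor;
  }
  for (const auto& s : states) {
    if (s.holds.empty()) continue;  // only a lock holder can be in a cycle
    std::vector<int> path;
    std::set<int> seen;
    int current = s.thread;
    while (true) {
      if (seen.count(current)) {
        if (current == s.thread) return path;  // chain closed on its start
        break;                                 // joined some other cycle
      }
      seen.insert(current);
      path.push_back(current);
      // Follow: thread -> lock it waits for -> owner of that lock.
      auto w = waiting.find(current);
      if (w == waiting.end()) break;
      auto o = owner.find(w->second);
      if (o == owner.end()) break;
      current = o->second;
    }
  }
  return {};
}
```

Feeding in the states listed above (thread 27 holds "MemoryPool" and waits for "Device", thread 8 holds "Device" and waits for "MemoryPool", the remaining threads only wait) returns the cycle 27, 8. The other blocked threads are victims of that cycle rather than part of it.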
Good catch Dan! AFAICT (from the stack traces) the problem is in AMD's libraries. I suppose we should report it to AMD, @fwyzard would you be able to do that?
I can report this, but we need some way to reproduce it.
Also, do we have the full logs of the jobs that were affected? The ROCmService might have some information about the AMD GPU driver in use.
OK, I checked the last one, and it's using the same (older) AMD GPU driver.
Which means I'm not sure what changed to cause this. The last update (to ROCm 7.1.0) was on November 28th. Are the hangs rare enough that this is the cause, but we went a few days without observing any?
OK, I was able to reproduce it quickly enough with CMSSW 16.0.0-pre3 and reusing the same input as the IB tests.
I will try to check if it still happens without our memory pool, and then report it to AMD.
By the way, workflow 29834.402/step2 also hangs on AMD MI300X, and the job was killed after 9000 s. Strangely, this happens only on some of the AMD MI300X hosts: e.g. the https://cmssdt.cern.ch/jenkins/job/ib-run-relvals/385962/ job took 2h17 to run all GPU tests on our NGT session ngt-amd-mi300x-01, while https://cmssdt.cern.ch/jenkins/job/ib-run-relvals/386003/console (which is running on ngt-amd-mi300x-04) is hanging. I have asked the CERN NGT team to look into it.
07:31:32 Running command: cd 29834.402_TTbar_14TeV+Run4D110PU_Patatrack_PixelOnlyAlpaka; HIP_VISIBLE_DEVICES=0 CUDA_VISIBLE_DEVICES= cmsDriver.py step2 -s DIGI:pdigi_valid,L1TrackTrigger,L1,L1P2GT,DIGI2RAW,HLT:@relvalRun4 --conditions auto:phase2_realistic_T33 --datatier GEN-SIM-DIGI-RAW -n 10 --eventcontent FEVTDEBUGHLT --geometry ExtendedRun4D110 --era Phase2C17I13M9 --procModifiers alpaka --customise HeterogeneousCore/AlpakaServices/customiseAlpakaServiceMemoryFilling.customiseAlpakaServiceMemoryFilling --pileup AVE_200_BX_25ns --pileup_input das:/RelValMinBias_14TeV/CMSSW_15_1_0_pre5-150X_mcRun4_realistic_v1_STD_RegeneratedGS_Run4D110_noPU-v1/GEN-SIM --customise Validation/Performance/TimeMemorySummary.customiseWithTimeMemorySummary --prefix 'python3 /scratch/cmsbuild/jenkins/workspace/ib-run-relvals/cms-bot/monitor_workflow.py timeout --signal SIGTERM 9000 ' --filein filelist:step1_dasquery.log --fileout file:step2.root --suffix "-j JobReport2.xml " --nThreads 4 > step2_TTbar_14TeV+Run4D110PU_Patatrack_PixelOnlyAlpaka.log 2>&1
10:01:47 ===> SLOW JOB: 9015 secs vs 211 secs. Diff: 8804
@smuzaffar is there a way to check the content of the logs, e.g. of step2_TTbar_14TeV+2026PU_Patatrack_PixelOnlyAlpaka_Validation.log, for the working and hanging jobs?
One machine has different drivers, I would check if that has an impact for the MI300X.
If I replace our memory pool with calls to the native ROCm asynchronous memory allocation, I don't observe any deadlock over more than 100 tests.
@fwyzard, I have sent you instructions on how to get into the NGT session to access the logs.
Thanks.
OK, I don't see any differences in the drivers and software setup between the two workers:
cmsbuild@ngt-amd-mi300x-01
AMD kernel driver: 6.12.12
ROCm driver API: 7.1.25424 (compiled with ROCm 7.1.0.0-20-4179531dcd)
ROCm runtime API: 7.1.25424 (compiled with HIP 7.1.25424)
ngt-amd-mi300x-04
AMD kernel driver: 6.12.12
ROCm driver API: 7.1.25424 (compiled with ROCm 7.1.0.0-20-4179531dcd)
ROCm runtime API: 7.1.25424 (compiled with HIP 7.1.25424)
@fwyzard, thanks for looking into it. I noticed that 29834.402/step2 and 29834.4021/step2 randomly hang (and are then killed by the timeout after 9000 s). The https://cmssdt.cern.ch/jenkins/job/ib-run-relvals/386049/consoleFull job ran on ngt-amd-mi300x-04 and both steps timed out, while for https://cmssdt.cern.ch/jenkins/job/ib-run-relvals/386126/consoleFull (which ran on ngt-amd-mi300x-03) only one of them hung. For https://cmssdt.cern.ch/jenkins/job/ib-run-relvals/386098/consoleFull everything worked fine.
I think the issue might be in CMSSW itself.
> randomly hang
The randomness is compatible with the deadlock diagnosis of https://github.com/cms-sw/cmssw/issues/49570#issuecomment-3632703723
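To illustrate why an inconsistent lock order hangs only occasionally, here is a minimal sketch with two code paths that need the same two mutexes in opposite order. `FakeRuntime`, `poolMutex`, `deviceMutex`, `allocate`, `release` and `runBothPaths` are hypothetical stand-ins, not the actual ROCm/HIP internals. The sketch uses `std::scoped_lock` (C++17), which acquires both mutexes with a deadlock-avoidance algorithm, so it always finishes.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks named in the diagnosis.
struct FakeRuntime {
  std::mutex poolMutex;
  std::mutex deviceMutex;
  long operations = 0;  // protected by holding both mutexes

  // Path A: conceptually "pool first, then device".
  void allocate() {
    // With two nested std::lock_guard objects here and the opposite
    // nesting in release(), a hang would occur only when each thread
    // takes its first lock before the other takes its second one.
    std::scoped_lock lock(poolMutex, deviceMutex);
    ++operations;
  }

  // Path B: conceptually "device first, then pool".
  void release() {
    std::scoped_lock lock(deviceMutex, poolMutex);
    ++operations;
  }
};

// Runs both paths concurrently and returns the number of completed calls.
long runBothPaths(int iterations) {
  FakeRuntime rt;
  std::thread a([&] { for (int i = 0; i < iterations; ++i) rt.allocate(); });
  std::thread b([&] { for (int i = 0; i < iterations; ++i) rt.release(); });
  a.join();
  b.join();
  return rt.operations;
}
```

With plain nested locks in opposite orders, most runs would still complete, because the unlucky interleaving is a narrow timing window. That matches jobs that hang on one run or host and pass on another with identical drivers.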