PromptReco failure PromptReco_Run381379_ParkingSingleMuon4
From https://cms-talk.web.cern.ch/t/paused-job-for-promptreco-run381379-parkingsinglemuon4/42082
----- Begin Fatal Exception 06-Jun-2024 16:58:22 CEST-----------------------
An exception of category 'FileReadError' occurred while
[0] Processing Event run: 381379 lumi: 819 event: 1742750619 stream: 2
[1] Running path 'write_AOD_step'
[2] Prefetching for module PoolOutputModule/'write_AOD'
[3] While reading from source GlobalObjectMapRecord hltGtStage2ObjectMap '' HLT
[4] Rethrowing an exception that happened on a different read request.
[5] Processing Event run: 381379 lumi: 819 event: 1742683577 stream: 4
[6] Running path 'dqmoffline_step'
[7] Prefetching for module DQMMessageLogger/'DQMMessageLogger'
[8] Prefetching for module LogErrorHarvester/'logErrorHarvester'
[9] Prefetching for module CSCRecHitDProducer/'csc2DRecHits'
[10] Prefetching for module CSCDCCUnpacker/'muonCSCDigis'
[11] While reading from source FEDRawDataCollection rawDataCollector '' LHC
[12] Reading branch FEDRawDataCollection_rawDataCollector__LHC.
Exception Message:
vector::_M_default_append
----- End Fatal Exception -------------------------------------------------
The tarball can be found here:
/afs/cern.ch/user/c/cmst0/public/PausedJobs/Run2024E/FileReadError/job/WMTaskSpace/cmsRun1
From the logs it seems to crash at event 1742503164. The error is reproducible locally.
A new Issue was created by @Dr15Jones.
@antoniovilela, @sextonkennedy, @smuzaffar, @makortel, @rappoccio, @Dr15Jones can you please review it and eventually sign/assign? Thanks.
The job can be run by setting up a CMSSW_14_0_7 area and downloading the tarball (which is at /afs/cern.ch/user/c/cmst0/public/PausedJobs/Run2024E/FileReadError/a406cf00-00a4-498e-b7e2-9ec39b964fac-216-3-logArchive.tar.gz).
Then, after untarring, go to the directory job/WMTaskSpace/cmsRun1 and run
cmsRun PSet.py
There appear to be lots of extraneous exceptions being thrown (and caught) in this job. The first one encountered is
%MSG-e SiStripMonitorTrack: SiStripMonitorTrack:HLTSiStripMonitorTrack 06-Jun-2024 17:43:09 CEST Run: 381379 Event: 1741662696
ClusterCollection is not valid!!
%MSG
[Switching to Thread 0x7fffa05fe640 (LWP 3001818)]
Thread 7 "cmsRun" hit Catchpoint 1 (exception thrown), 0x00007ffff5ead0f1 in __cxxabiv1::__cxa_throw (obj=0x7ffde5f68b00, tinfo=0x7ffff79a0650 <typeinfo for edm::Exception>,
dest=0x7ffff796a010 <edm::Exception::~Exception()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:81
81 ../../../../libstdc++-v3/libsupc++/eh_throw.cc: No such file or directory.
(gdb) where
#0 0x00007ffff5ead0f1 in __cxxabiv1::__cxa_throw (obj=0x7ffde5f68b00, tinfo=0x7ffff79a0650 <typeinfo for edm::Exception>, dest=0x7ffff796a010 <edm::Exception::~Exception()>)
at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:81
#1 0x00007ffff7b7e0b2 in throwInvalidRefFromNullOrInvalidRef(edm::TypeID const&) ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libDataFormatsCommon.so
#2 0x00007ffff7b7ed6f in edm::RefCore::tryToGetProductPtr(std::type_info const&, edm::EDProductGetter const*) const [clone .cold] ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libDataFormatsCommon.so
#3 0x00007fffa557aa1a in reco::Track::recHitsBegin() const ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/pluginRecoTrackerFinalTrackSelectorsPlugins.so
#4 0x00007fffa55bd779 in SingleLongTrackProducer::produce(edm::Event&, edm::EventSetup const&) ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/pluginRecoTrackerFinalTrackSelectorsPlugins.so
#5 0x00007ffff7e483c1 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libFWCoreFramework.so
#6 0x00007ffff7e2c04e in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libFWCoreFramework.so
#7 0x00007ffff7db9159 in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libFWCoreFramework.so
#8 0x00007ffff7db96c4 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libFWCoreFramework.so
#9 0x00007ffff7f3af28 in tbb::detail::d1::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) ()
from /cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el9_amd64_gcc12/libFWCoreConcurrency.so
#10 0x00007ffff6f1091b in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x7ffeafe74400, waiter=..., this=0x7ffff41c3b00)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_3-el9_amd64_gcc12/build/CMSSW_14_0_3-build/BUILD/el9_amd64_gcc12/external/tbb/v2021.9.0-d33db04d4520c6ff791eab900054e986/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
#11 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x7ffff41c3b00)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_3-el9_amd64_gcc12/build/CMSSW_14_0_3-build/BUILD/el9_amd64_gcc12/external/tbb/v2021.9.0-d33db04d4520c6ff791eab900054e986/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
#12 tbb::detail::r1::arena::process (tls=..., this=<optimized out>)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_3-el9_amd64_gcc12/build/CMSSW_14_0_3-build/BUILD/el9_amd64_gcc12/external/tbb/v2021.9.0-d33db04d4520c6ff791eab900054e986/tbb-v2021.9.0/src/tbb/arena.cpp:137
#13 tbb::detail::r1::market::process (this=<optimized out>, j=...)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_3-el9_amd64_gcc12/build/CMSSW_14_0_3-build/BUILD/el9_amd64_gcc12/external/tbb/v2021.9.0-d33db04d4520c6ff791eab900054e986/tbb-v2021.9.0/src/tbb/market.cpp:599
#14 0x00007ffff6f12ace in tbb::detail::r1::rml::private_worker::run (this=0x7ffff2486f00)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_3-el9_amd64_gcc12/build/CMSSW_14_0_3-build/BUILD/el9_amd64_gcc12/external/tbb/v2021.9.0-d33db04d4520c6ff791eab900054e986/tbb-v2021.9.0/src/tbb/private_server.cpp:271
#15 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7ffff2486f00)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_0_3-el9_amd64_gcc12/build/CMSSW_14_0_3-build/BUILD/el9_amd64_gcc12/external/tbb/v2021.9.0-d33db04d4520c6ff791eab900054e986/tbb-v2021.9.0/src/tbb/private_server.cpp:221
#16 0x00007ffff5a89c02 in start_thread () from /lib64/libc.so.6
#17 0x00007ffff5b0ec40 in clone3 () from /lib64/libc.so.6
Which is caught here https://github.com/cms-sw/cmssw/blob/dbbd44f6792e61b79f46b7f9974eec7cf8e3024b/RecoTracker/FinalTrackSelectors/plugins/SingleLongTrackProducer.cc#L158-L173
which is problematic as the tracks are the generalTracks which are being made in this job and SHOULD have accessible hits!
assign tracking
The next group of exceptions comes from
#0 0x00007ffff5b9d2f1 in __cxxabiv1::__cxa_throw (obj=0x7ffdca082400, tinfo=0x7ffff79a5628 <typeinfo for cms::Exception>, dest=0x7ffff796ee30 <cms::Exception::~Exception()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:81
#1 0x00007fffc37f8a8d in PerigeeConversions::ftsToPerigeeParameters(FreeTrajectoryState const&, Point3DBase<float, GlobalTag> const&, double&) [clone .cold] ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libTrackingToolsTrajectoryState.so
#2 0x00007fffc3806a5a in TrajectoryStateClosestToPoint::TrajectoryStateClosestToPoint(FreeTrajectoryState const&, Point3DBase<float, GlobalTag> const&) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libTrackingToolsTrajectoryState.so
#3 0x00007fffc38725a5 in TSCPBuilderNoMaterial::operator()(TrajectoryStateOnSurface const&, Point3DBase<float, GlobalTag> const&) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libTrackingToolsPatternTools.so
#4 0x00007fffbe679dd2 in PerigeeLinearizedTrackState::computeJacobians() const () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexVertexTools.so
#5 0x00007fffbe67a456 in PerigeeLinearizedTrackState::isValid() const () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexVertexTools.so
#6 0x00007fffbc5ac58f in KalmanVertexUpdator<5u>::positionUpdate(VertexState const&, ReferenceCountingPointer<LinearizedTrackState<5u> >, float, int) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexKalmanVertexFit.so
#7 0x00007fffbc5ae20d in KalmanVertexUpdator<5u>::update(CachingVertex<5u> const&, ReferenceCountingPointer<VertexTrack<5u> >, float, int) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexKalmanVertexFit.so
#8 0x00007fffbc5ae89a in KalmanVertexUpdator<5u>::add(CachingVertex<5u> const&, ReferenceCountingPointer<VertexTrack<5u> >) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexKalmanVertexFit.so
#9 0x00007fffbc5ae90d in KalmanVertexTrackCompatibilityEstimator<5u>::estimateNFittedTrack(CachingVertex<5u> const&, ReferenceCountingPointer<VertexTrack<5u> >) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexKalmanVertexFit.so
#10 0x00007fffbc5b023f in KalmanVertexTrackCompatibilityEstimator<5u>::estimate(CachingVertex<5u> const&, ReferenceCountingPointer<VertexTrack<5u> >, unsigned int) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexKalmanVertexFit.so
#11 0x00007fffbc5aa80e in KalmanVertexTrackCompatibilityEstimator<5u>::estimate(CachingVertex<5u> const&, ReferenceCountingPointer<LinearizedTrackState<5u> >, unsigned int) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexKalmanVertexFit.so
#12 0x00007fffbc5d101c in AdaptiveVertexFitter::reWeightTracks(std::vector<ReferenceCountingPointer<LinearizedTrackState<5u> >, std::allocator<ReferenceCountingPointer<LinearizedTrackState<5u> > > > const&, CachingVertex<5u> const&) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexAdaptiveVertexFit.so
#13 0x00007fffbc5d1e65 in AdaptiveVertexFitter::reWeightTracks(std::vector<ReferenceCountingPointer<VertexTrack<5u> >, std::allocator<ReferenceCountingPointer<VertexTrack<5u> > > > const&, CachingVertex<5u> const&) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexAdaptiveVertexFit.so
#14 0x00007fffbc5d32ed in AdaptiveVertexFitter::fit(std::vector<ReferenceCountingPointer<VertexTrack<5u> >, std::allocator<ReferenceCountingPointer<VertexTrack<5u> > > > const&, VertexState const&, bool) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexAdaptiveVertexFit.so
#15 0x00007fffbc5d46e1 in AdaptiveVertexFitter::vertex(std::vector<reco::TransientTrack, std::allocator<reco::TransientTrack> > const&, Point3DBase<float, GlobalTag> const&) const ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libRecoVertexAdaptiveVertexFit.so
#16 0x00007fff4035710a in TemplatedInclusiveVertexFinder<edm::View<reco::Candidate>, reco::VertexCompositePtrCandidate>::produce(edm::Event&, edm::EventSetup const&) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/pluginRecoVertexAdaptiveVertexFinderPlugins.so
#17 0x00007ffff7ce1e91 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libFWCoreFramework.so
the exception originates here
https://github.com/cms-sw/cmssw/blob/dbbd44f6792e61b79f46b7f9974eec7cf8e3024b/TrackingTools/TrajectoryState/src/PerigeeConversions.cc#L15-L16
and is caught here
https://github.com/cms-sw/cmssw/blob/dbbd44f6792e61b79f46b7f9974eec7cf8e3024b/TrackingTools/TrajectoryState/src/TrajectoryStateClosestToPoint.cc#L8-L23
assign reconstruction
New categories assigned: reconstruction
@jfernan2,@mandrenguyen you have been requested to review this Pull request/Issue and eventually sign? Thanks
By skipping the first events, I was able to get to the traceback for the exception which ultimately ended the job
#0 0x00007ffff5b9d2f1 in __cxxabiv1::__cxa_throw (obj=0x7ffe9579d1a0, tinfo=0x7ffff5d03190 <typeinfo for std::length_error>, dest=0x7ffff5bb2220 <std::length_error::~length_error()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:81
#1 0x00007ffff5b942d9 in std::__throw_length_error(char const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/lib64/libstdc++.so.6
#2 0x00007fffc38c8346 in ROOT::Detail::TCollectionProxyInfo::Pushback<std::vector<unsigned char, std::allocator<unsigned char> > >::resize(void*, unsigned long) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libDataFormatsStdDictionaries.so
#3 0x00007ffff7193701 in void TGenCollectionStreamer::ReadBufferVectorPrimitives<unsigned char>(TBuffer&, void*, TClass const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#4 0x00007ffff7110e09 in TBufferFile::ReadFastArray(void*, TClass const*, int, TMemberStreamer*, TClass const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#5 0x00007ffff735e073 in int TStreamerInfo::ReadBuffer<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#6 0x00007ffff7211e4c in TStreamerInfoActions::VectorLooper::GenericRead(TBuffer&, void*, void const*, TStreamerInfoActions::TLoopConfiguration const*, TStreamerInfoActions::TConfiguration const*) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#7 0x00007ffff710f5fc in TBufferFile::ApplySequence(TStreamerInfoActions::TActionSequence const&, void*, void*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#8 0x00007ffff725f38f in int TStreamerInfoActions::ReadSTL<&TStreamerInfoActions::ReadSTLMemberWiseSameClass, &TStreamerInfoActions::ReadSTLObjectWiseFastArray>(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#9 0x00007ffff7117eae in TBufferFile::ReadClassBuffer(TClass const*, void*, TClass const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#10 0x00007ffff735cdcc in int TStreamerInfo::ReadBuffer<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#11 0x00007ffff71de94d in TStreamerInfoActions::GenericReadAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#12 0x00007ffff710fbb5 in TBufferFile::ApplySequence(TStreamerInfoActions::TActionSequence const&, void*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libRIO.so
#13 0x00007ffff7873b87 in TBranchElement::ReadLeavesMember(TBuffer&) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libTree.so
#14 0x00007ffff786c429 in TBranch::GetEntry(long long, int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libTree.so
#15 0x00007ffff787ed44 in TBranchElement::GetEntry(long long, int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libTree.so
#16 0x00007ffff787ecfd in TBranchElement::GetEntry(long long, int) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/external/el8_amd64_gcc12/lib/libTree.so
#17 0x00007fff9d66585c in edm::RootTree::getEntry(TBranch*, long long) const () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/pluginIOPoolInput.so
#18 0x00007fff9d64639c in edm::RootDelayedReader::getProduct_(edm::BranchID const&, edm::EDProductGetter const*) () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/pluginIOPoolInput.so
#19 0x00007ffff7bc111f in edm::DelayedReader::getProduct(edm::BranchID const&, edm::EDProductGetter const*, edm::ModuleCallingContext const*) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libFWCoreFramework.so
#20 0x00007ffff7c6a35b in edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libFWCoreFramework.so
#21 0x00007ffff7c6b7cc in edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}::operator()() const () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libFWCoreFramework.so
#22 0x00007ffff7c6b918 in edm::SerialTaskQueue::QueuedTask<edm::SerialTaskQueueChain::push<edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}&>(tbb::detail::d1::task_group&, edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}&)::{lambda()#1}>::execute() () from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libFWCoreFramework.so
#23 0x00007ffff7e031d0 in tbb::detail::d1::function_task<edm::SerialTaskQueue::spawn(edm::SerialTaskQueue::TaskBase&)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) ()
from /cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_0_7/lib/el8_amd64_gcc12/libFWCoreConcurrency.so
#24 0x00007ffff63fe95b in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x7fff08c3ec00, waiter=..., this=0x7ffff3963b00)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.h:322
#25 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x7ffff3963b00)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/task_dispatcher.h:458
#26 tbb::detail::r1::arena::process (tls=..., this=<optimized out>)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/arena.cpp:137
#27 tbb::detail::r1::market::process (this=<optimized out>, j=...)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/market.cpp:599
#28 0x00007ffff6400b0e in tbb::detail::r1::rml::private_worker::run (this=0x7ffff17e9100)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/private_server.cpp:271
#29 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x7ffff17e9100)
at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_14_1_0_pre1-el8_amd64_gcc12/build/CMSSW_14_1_0_pre1-build/BUILD/el8_amd64_gcc12/external/tbb/v2021.9.0-c3903c50b52342174dbd3a52854a6e6d/tbb-v2021.9.0/src/tbb/private_server.cpp:221
#30 0x00007ffff55341ca in start_thread () from /lib64/libpthread.so.0
#31 0x00007ffff518f8d3 in clone () from /lib64/libc.so.6
assign root
@pcanal how can we understand better what happened during the read?
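A note on the exception text, for anyone reading the traceback above: `vector::_M_default_append` is the name of the libstdc++ internal helper behind `std::vector::resize`, and libstdc++ uses that string as the `what()` message of the `std::length_error` it throws when the requested size exceeds `max_size()`. The sketch below only demonstrates that mechanism on a plain `std::vector<unsigned char>`; the helper name `resizeThrowsLengthError` is made up for illustration, and why ROOT asked for such a size while reading the branch is not established in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if resizing an empty byte vector to `n`
// elements is rejected with std::length_error.
inline bool resizeThrowsLengthError(std::size_t n) {
  std::vector<unsigned char> buffer;
  try {
    buffer.resize(n);
  } catch (const std::length_error&) {
    // With libstdc++ the what() text here is "vector::_M_default_append";
    // other standard libraries use a different message.
    return true;
  }
  return false;
}

inline void exampleUsage() {
  std::vector<unsigned char> probe;
  // One element more than the container can ever hold.
  assert(resizeThrowsLengthError(probe.max_size() + 1));
  // A small, ordinary size is accepted.
  assert(!resizeThrowsLengthError(16));
}
```

A size that large would be consistent with a length field being misread from the input buffer, but that is an assumption, not something the traceback proves.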
type root
type tracking
Which is caught here
https://github.com/cms-sw/cmssw/blob/dbbd44f6792e61b79f46b7f9974eec7cf8e3024b/RecoTracker/FinalTrackSelectors/plugins/SingleLongTrackProducer.cc#L158-L173
that just looks like poorly written code, where try/catch is used instead of checking for the trackExtra to be present. Tracks are apparently not pure generalTracks, see https://github.com/cms-sw/cmssw/blob/dbbd44f6792e61b79f46b7f9974eec7cf8e3024b/RecoTracker/FinalTrackSelectors/plugins/SingleLongTrackProducer.cc#L133-L136
a proper copy is made conditionally, while the rest in selTracks is going to be default-constructed reco::Tracks
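To make the point about try/catch concrete, here is a minimal sketch contrasting the two patterns. It uses mock types only: `MockExtraRef`, `MockTrack` and the two `countHits...` functions are hypothetical stand-ins, not the CMSSW classes or the actual SingleLongTrackProducer code.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a reference to the track extra; a null pointer
// models a default-constructed track without a TrackExtra.
struct MockExtraRef {
  const std::vector<int>* hits = nullptr;
  bool isNonnull() const { return hits != nullptr; }
  const std::vector<int>& operator*() const {
    if (hits == nullptr) throw std::runtime_error("null or invalid reference");
    return *hits;
  }
};

struct MockTrack {
  MockExtraRef extra;
};

// Pattern criticised above: dereference blindly and let the exception
// signal that the extra is missing.
inline int countHitsWithTryCatch(const MockTrack& t) {
  try {
    return static_cast<int>((*t.extra).size());
  } catch (const std::exception&) {
    return -1;
  }
}

// Alternative: test validity first, so nothing is thrown on the normal path.
inline int countHitsWithCheck(const MockTrack& t) {
  if (!t.extra.isNonnull()) return -1;
  return static_cast<int>((*t.extra).size());
}

inline void exampleUsage() {
  std::vector<int> hits{1, 2, 3};
  MockTrack good;
  good.extra.hits = &hits;
  MockTrack empty;  // default-constructed, no extra
  assert(countHitsWithTryCatch(good) == countHitsWithCheck(good));
  assert(countHitsWithTryCatch(empty) == countHitsWithCheck(empty));
}
```

Both functions return the same answers; the second simply avoids throwing, which is what keeps such cases from showing up as spurious hits on an exception catchpoint in the debugger.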
@borzari
please check https://github.com/cms-sw/cmssw/issues/45162#issuecomment-2153549462
to possibly remove the try/catch pattern related to just access to track.extra in the track.recHitsBegin() call.
It should be a combination of validity checks for extra() and then extra()->recHitsProduct(), by checking isNonnull() && isAvailable() for each, sequentially.
This could even be packed into a new helper method, e.g. bool reco::Track::recHitsOk()
Please clarify if you are available to check this. Thank you.
Hi @slava77
I applied what you suggested in this commit, used the opportunity to remove some duplicated code, and tested it with RelValZMM and RelValTTbar events by comparing the results of the try/catch version with those of the validity-check version. Everything worked as intended and no changes to the output were observed, as expected.
Just to clarify two points:
- I added a method inside the SingleLongTrackProducer module to check the validity of the track. Thinking out loud about what you suggested, I think you meant that the method could be included in https://github.com/cms-sw/cmssw/blob/master/DataFormats/TrackReco/interface/Track.h. If this is what you meant, I can modify the branch to have the recHitsOk method there;
- I couldn't check the validity of the recHitsProduct(). There doesn't seem to be something similar to isNonnull() or isAvailable() for it. However, just checking track.extra() seemed enough. Was it supposed to be like this? Am I missing something about the recHitsProduct()?
I misread the TrackExtraBase; edm::RefCore m_hitCollection; is the one that has isNonnull() and isAvailable(), but it is not publicly exposed.
So, I would add this bool recHitsOk() const {return m_hitCollection.isNonnull() && m_hitCollection.isAvailable();} in TrackExtraBase.h
And then in Track.h bool recHitsOk() const {return extra_.isNonnull() && extra_.isAvailable() && extra_->recHitsOk();}
Even though in the current setup a track without an extra is enough, there can still be cases where SingleLongTrackProducer uses input tracks where hits got dropped.
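The layered check proposed above can be sketched with mock types as follows. This is only an illustration of the short-circuit order (extra reference first, then the hit collection); `MockRef`, `MockTrackExtra` and `MockTrack` are hypothetical stand-ins and do not reproduce the real TrackExtraBase.h or Track.h.

```cpp
#include <cassert>

// Hypothetical stand-in for the validity queries of edm::Ref / edm::RefCore.
struct MockRef {
  bool nonnull = false;
  bool available = false;
  bool isNonnull() const { return nonnull; }
  bool isAvailable() const { return available; }
};

// Stand-in for TrackExtraBase: holds the reference to the hit collection.
struct MockTrackExtra {
  MockRef hitCollection;  // plays the role of m_hitCollection
  bool recHitsOk() const { return hitCollection.isNonnull() && hitCollection.isAvailable(); }
};

// Stand-in for reco::Track: the extra is consulted only after its own
// reference has been found valid, thanks to && short-circuiting.
struct MockTrack {
  MockRef extraRef;      // plays the role of extra_
  MockTrackExtra extra;  // what extraRef would resolve to
  bool recHitsOk() const {
    return extraRef.isNonnull() && extraRef.isAvailable() && extra.recHitsOk();
  }
};

inline void exampleUsage() {
  MockTrack t;  // default-constructed: no extra, no hits
  assert(!t.recHitsOk());

  t.extraRef.nonnull = true;  // extra present, but hits were dropped
  t.extraRef.available = true;
  assert(!t.recHitsOk());

  t.extra.hitCollection.nonnull = true;  // hits present as well
  t.extra.hitCollection.available = true;
  assert(t.recHitsOk());
}
```

The middle case is the one mentioned above: a track whose extra exists but whose hits got dropped still fails the check.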
Tracks are apparently not pure generalTracks, see https://github.com/cms-sw/cmssw/blob/dbbd44f6792e61b79f46b7f9974eec7cf8e3024b/RecoTracker/FinalTrackSelectors/plugins/SingleLongTrackProducer.cc#L133-L136
a proper copy is made conditionally, while the rest in selTracks is going to be default-constructed reco::Tracks
Out of curiosity why is that? Can't the selTracks just contain the tracks we can actually refit?
Hi @mmusich
The selTracks collection will only have one track, the one with the smallest chiNdof. I also want to check if the rechits and hits from the hitpattern are valid to say that it is a goodTrack that can be used for the shortened tracks pT resolution. Especially because of what @slava77 mentioned here:
Even though in the current setup a track without an extra is enough, there can still be cases where SingleLongTrackProducer uses input tracks where hits got dropped.
The hit checks are to make sure that this track won't have missing layers with measurement, which is not 100% effective as I already showed during the presentations about this topic, but also doesn't impact a lot on the final result because it doesn't happen so often. I wouldn't think changing that part of the code for selTracks to only have tracks that can be refitted to have a large impact on what is going on in the SingleLongTrackProducer or after it, unless it is an extra "safety check" that can be included.
Here I added the suggestions from @slava77. Again, I tested with RelValZMM and RelValTTbar events, and things are working as expected. If you don't have other suggestions, I can open a PR with it and we can continue the discussion there
@borzari
also want to check if the rechits and hits from the hitpattern are valid to say that it is a goodTrack that can be used for the shortened tracks pT resolution.
Exactly, can't you do that before filling the vector? Default constructed tracks can't be used for refit.
Alright, so instead of only getting the track with the smallest chiNdof, I also want it to have recHitsOk(), right?
I also want it to have recHitsOk(), right?
Right, this is what I had in mind.
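As a sketch of what "checking before filling the vector" could look like: keep only the candidate with the smallest chiNdof among those whose hits are usable, so that nothing default-constructed ends up in the output. The types and the function `selectBestTrack` are hypothetical, and the threshold parameter only mimics the role of `fitProb` in the condition discussed here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal track: just the two properties used in the selection.
struct MockTrack {
  double chiNdof;
  bool hitsOk;
  bool recHitsOk() const { return hitsOk; }
};

// Returns a collection with at most one track: the one with the smallest
// chiNdof below maxChiNdof among the tracks that pass recHitsOk().
inline std::vector<MockTrack> selectBestTrack(const std::vector<MockTrack>& tracks, double maxChiNdof) {
  const MockTrack* best = nullptr;
  double bestChiNdof = maxChiNdof;
  for (const auto& track : tracks) {
    if (track.chiNdof < bestChiNdof && track.recHitsOk()) {
      bestChiNdof = track.chiNdof;
      best = &track;
    }
  }
  std::vector<MockTrack> selTracks;
  if (best != nullptr) {
    assert(best->recHitsOk());  // only usable tracks can be selected
    selTracks.push_back(*best);
  }
  return selTracks;
}

inline void exampleUsage() {
  // The track with the best chiNdof has no usable hits and must be skipped.
  const std::vector<MockTrack> tracks{{0.5, false}, {1.0, true}, {2.0, true}};
  const auto selTracks = selectBestTrack(tracks, 5.0);
  assert(selTracks.size() == 1);
  assert(selTracks.front().chiNdof == 1.0);
}
```

With this shape the output is either empty or holds a single track that is known to have passed the hit check at selection time.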
It didn't work. If I move the validity check from the rechits/hitpattern check to where I select tracks (I did if (chiNdof < fitProb && track.recHitsOk())), I get the message as if I was not checking the tracks:
----- Begin Fatal Exception 08-Jun-2024 19:16:37 CEST-----------------------
An exception of category 'InvalidReference' occurred while
[0] Processing Event run: 1 lumi: 76 event: 7503 stream: 6
[1] Running path 'dqmoffline_step'
[2] Calling method for module SingleLongTrackProducer/'SingleLongTrackProducer'
Exception Message:
BadRefCore RefCore: Request to resolve a null or invalid reference to a product of type 'std::vector<reco::TrackExtra>' has been detected.
Please modify the calling code to test validity before dereferencing.
----- End Fatal Exception -------------------------------------------------
I get the message as if I was not checking the tracks:
Isn't track.recHitsOk() checking that the TrackExtra is valid?
Should be. I implemented it like Slava suggested here
Could it be that, although I am adding only tracks with a valid TrackExtra to selTracks, the framework still needs me to check that I am looking at a valid track (one that has a TrackExtra) from it before checking whether it has valid hits/hitpattern? I am not sure how the "not valid TrackExtra" exception works, which is why I am asking
The check I used was
if (track.extra().isAvailable()) {
Alright @Dr15Jones, but does it happen every time I am using a reco::Track anywhere?
Well, in any case, I would suggest to open a PR with these changes. At least to remove the try/catch pattern.
get the message as if I was not checking the tracks:
maybe I am missing something, but with https://github.com/CMSTrackingPOG/cmssw/commit/53185493eae82d7fe8e807e9b266491ea51d06f8 on top of https://github.com/borzari/cmssw/commit/95ecc4bb4aa7e811f1f65025c8f08a23a72cf272 I can run this test:
https://github.com/cms-sw/cmssw/blob/4639c105a21c6934798c543c9a7cae72955c9369/DQM/TrackingMonitorSource/test/BuildFile.xml#L2
(even using the whole input file) without crashes.
@mmusich most probably I was missing something. The main differences I see (besides the better organization of the code in the way you wrote it) are that I included track.recHitsOk() here in the condition to select the best track, and that instead of using isNonnull() here, I would use bestTrack.recHitsOk(). Also, and maybe this was my mistake, I removed this condition, which you didn't. That is why I asked @Dr15Jones whether the check for the availability of the TrackExtra is done every time a reco::Track is used
@mmusich I started from your branch and tested what I mentioned above:
- Replaced if (bestTrack.extra().isNonnull()) with if (bestTrack.recHitsOk()): didn't have any effect, as expected, and should be "safer";
- Removed the extra check from here, and it also didn't fail, as I was thinking. I really don't know why that is the case and what is different from what I did, except for adding the check together with the chi2ndof condition to fill selTracks; I would also keep the extra check for safety reasons
May I start a PR to include your changes and the recHitsOk() method to CMSSW?