dragonfly
dragonfly copied to clipboard
Deadlock in ZSetFamilyTest.BlockingTimeout
https://github.com/dragonflydb/dragonfly/actions/runs/6861038546/job/18655923970
53: [ RUN ] ZSetFamilyTest.BlockingTimeout 53: E20231114 08:55:27.455155 25789 test_utils.cc:225] Deadlock detected!!!! 53: E20231114 08:55:27.455852 25789 test_utils.cc:230] Proactor 2: 53: I20231114 08:55:27.455909 25789 scheduler.cc:713] ------------ Fiber main (suspended) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac18436 util::fb2::detail::Scheduler::Preempt() 53: 0x559348ead9d1 util::fb2::detail::FiberInterface::Suspend() 53: 0x55934abcfc8a util::fb2::ProactorBase::Run() 53: I20231114 08:55:27.505266 25789 scheduler.cc:713] ------------ Fiber _dispatch (ready) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ad66712 util::fb2::UringProactor::MainLoop() 53: 0x55934abd4e92 util::fb2::ProactorDispatcher::Run() 53: 0x55934ac12d61 util::fb2::detail::(anonymous namespace)::DispatcherImpl::Run() 53: 0x55934ac11b15 util::fb2::detail::(anonymous namespace)::DispatcherImpl::DispatcherImpl()::{lambda()#1}::operator()() 53: 0x55934ac222f2 std::__invoke_impl<>() 53: I20231114 08:55:27.546581 25789 scheduler.cc:713] ------------ Fiber Watchdog (suspended) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac1870f util::fb2::detail::Scheduler::Preempt() 53: 0x559348ead9d1 util::fb2::detail::FiberInterface::Suspend() 53: 0x559348eaf720 util::fb2::EventCount::wait() 53: 0x559348fef901 util::fb2::EventCount::await<>() 53: 0x559348fe929d util::fb2::BlockingCounter::Impl::Wait() 53: 0x559348fe96fb util::fb2::BlockingCounter::Wait() 53: 0x559348fa96c6 util::ProactorPool::AwaitFiberOnAll<>() 53: 0x559348f9d23e dfly::BaseFamilyTest::ResetService()::{lambda()#2}::operator()() 53: 0x559348fd06d4 std::__invoke_impl<>() 53: 0x559348fcab7b std::__invoke<>() 53: 0x559348fc40d7 std::__apply_impl<>() 53: 0x559348fc414e std::apply<>() 53: I20231114 08:55:27.631217 25789 scheduler.cc:713] ------------ Fiber (active) ------------ 53: 0x55934ac1f7ce util::fb2::detail::Scheduler::ExecuteOnAllFiberStacks() 53: 0x55934ac207d1 util::fb2::detail::Scheduler::PrintAllFiberStackTraces() 53: 0x55934ac39506 util::fb2::detail::FiberInterface::PrintAllFiberStackTraces() 53: 0x559348f9c0c0 dfly::BaseFamilyTest::ResetService()::{lambda()#2}::operator()()::{lambda()#1}::operator()() 53: 0x559348fa92cc util::ProactorPool::AwaitFiberOnAll<>()::{lambda()#1}::operator()() 53: 0x559348fd3db9 std::__invoke_impl<>() 53: 0x559348fd0243 std::__invoke<>() 53: 0x559348fc9fa5 std::_apply_impl<>() 53: 0x559348fca030 std::apply<>() 53: 0x559348fca233 util::fb2::detail::WorkerFiberImpl<>::run() 53: 0x559348fc3913 util::fb2::detail::WorkerFiberImpl<>::WorkerFiberImpl<>()::{lambda()#1}::operator()() 53: 0x559348fe33f3 std::__invoke_impl<>() 53: E20231114 08:55:27.728854 25787 test_utils.cc:230] Proactor 0: 53: I20231114 08:55:27.728955 25787 scheduler.cc:700] Sleep queue size 1 53: I20231114 08:55:27.728987 25787 scheduler.cc:713] ------------ Fiber main (suspended) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac18436 util::fb2::detail::Scheduler::Preempt() 53: 0x559348ead9d1 util::fb2::detail::FiberInterface::Suspend() 53: 0x55934abcfc8a util::fb2::ProactorBase::Run() 53: I20231114 08:55:27.729032 25787 scheduler.cc:713] ------------ Fiber _dispatch (ready) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ad66712 util::fb2::UringProactor::MainLoop() 53: 0x55934abd4e92 util::fb2::ProactorDispatcher::Run() 53: 0x55934ac12d61 util::fb2::detail::(anonymous namespace)::DispatcherImpl::Run() 53: 0x55934ac11b15 util::fb2::detail::(anonymous namespace)::DispatcherImpl::DispatcherImpl()::{lambda()#1}::operator()() 53: 0x55934ac222f2 std::__invoke_impl<>() 53: I20231114 08:55:27.729084 25787 scheduler.cc:713] ------------ Fiber shard_queue0 (suspended) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac1870f util::fb2::detail::Scheduler::Preempt() 53: 0x559348ead9d1 util::fb2::detail::FiberInterface::Suspend() 53: 0x559348eaf720 util::fb2::EventCount::wait() 53: 0x55934ad419cf util::fb2::EventCount::await<>() 53: 0x55934ad40113 util::fb2::FiberQueue::Run() 53: 0x559349039cb8 dfly::EngineShard::EngineShard()::{lambda()#2}::operator()() 53: 0x55934905239e std::__invoke_impl<>() 53: 0x55934904fb62 std::__invoke<>() 53: 0x55934904db7e std::_apply_impl<>() 53: 0x55934904dbf5 std::apply<>() 53: 0x55934904dde5 util::fb2::detail::WorkerFiberImpl<>::run() 53: 0x55934904b285 util::fb2::detail::WorkerFiberImpl<>::WorkerFiberImpl<>()::{lambda()#1}::operator()() 53: I20231114 08:55:27.805414 25787 scheduler.cc:713] ------------ Fiber (sleeping until 1653020645721 now is 1652362959190) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac1870f util::fb2::detail::Scheduler::Preempt() 53: 0x55934ac1c69a util::fb2::detail::Scheduler::WaitUntil() 53: 0x559348fe87a1 util::fb2::detail::FiberInterface::WaitUntil() 53: 0x55934abf2214 util::fb2::EventCount::wait_until() 53: 0x55934a1232a2 util::fb2::EventCount::await_until<>() 53: 0x55934a1192df dfly::Transaction::WaitOnWatch() 53: 0x55934999a304 dfly::container_utils::RunCbOnFirstNonEmptyBlockingabi:cxx11 53: 0x5593498ebedd dfly::(anonymous namespace)::BZPopMinMax() 53: 0x5593498f64a7 dfly::ZSetFamily::BZPopMin() 53: 0x55934950fd71 fu2::abi_400::detail::invocation::invoke<>() 53: 0x5593494fb286 fu2::abi_400::detail::type_erasure::invocation_table::function_trait<>::internal_invoker<>::invoke() 53: 0x559349085c67 fu2::abi_400::detail::type_erasure::tables::vtable<>::invoke<>() 53: I20231114 08:55:27.896040 25787 scheduler.cc:713] ------------ Fiber (active) ------------ 53: 0x55934ac1f7ce util::fb2::detail::Scheduler::ExecuteOnAllFiberStacks() 53: 0x55934ac207d1 util::fb2::detail::Scheduler::PrintAllFiberStackTraces() 53: 0x55934ac39506 util::fb2::detail::FiberInterface::PrintAllFiberStackTraces() 53: 0x559348f9c0c0 dfly::BaseFamilyTest::ResetService()::{lambda()#2}::operator()()::{lambda()#1}::operator()() 53: 0x559348fa92cc util::ProactorPool::AwaitFiberOnAll<>()::{lambda()#1}::operator()() 53: 0x559348fd3db9 std::__invoke_impl<>() 53: 0x559348fd0243 std::__invoke<>() 53: 0x559348fc9fa5 std::_apply_impl<>() 53: 0x559348fca030 std::apply<>() 53: 0x559348fca233 util::fb2::detail::WorkerFiberImpl<>::run() 53: 0x559348fc3913 util::fb2::detail::WorkerFiberImpl<>::WorkerFiberImpl<>()::{lambda()#1}::operator()() 53: 0x559348fe33f3 std::__invoke_impl<>() 53: E20231114 08:55:27.896135 25787 test_utils.cc:250] TxLocks for shard 0 53: E20231114 08:55:27.896338 25788 test_utils.cc:230] Proactor 1: 53: I20231114 08:55:27.896440 25788 scheduler.cc:713] ------------ Fiber main (suspended) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac18436 util::fb2::detail::Scheduler::Preempt() 53: 0x559348ead9d1 util::fb2::detail::FiberInterface::Suspend() 53: 0x55934abcfc8a util::fb2::ProactorBase::Run() 53: I20231114 08:55:27.896490 25788 scheduler.cc:713] ------------ Fiber _dispatch (ready) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ad66712 util::fb2::UringProactor::MainLoop() 53: 0x55934abd4e92 util::fb2::ProactorDispatcher::Run() 53: 0x55934ac12d61 util::fb2::detail::(anonymous namespace)::DispatcherImpl::Run() 53: 0x55934ac11b15 util::fb2::detail::(anonymous namespace)::DispatcherImpl::DispatcherImpl()::{lambda()#1}::operator()() 53: 0x55934ac222f2 std::__invoke_impl<>() 53: I20231114 08:55:27.896548 25788 scheduler.cc:713] ------------ Fiber shard_queue1 (suspended) ------------ 53: 0x55934ac3a3f8 boost::context::detail::fiber_ontop<>() 53: 0x55934ac39c42 boost::context::fiber::resume_with<>() 53: 0x55934ac38f97 util::fb2::detail::FiberInterface::SwitchTo() 53: 0x55934ac1870f util::fb2::detail::Scheduler::Preempt() 53: 0x559348ead9d1 util::fb2::detail::FiberInterface::Suspend() 53: 0x559348eaf720 util::fb2::EventCount::wait() 53: 0x55934ad419cf util::fb2::EventCount::await<>() 53: 0x55934ad40113 util::fb2::FiberQueue::Run() 53: 0x559349039cb8 dfly::EngineShard::EngineShard()::{lambda()#2}::operator()() 53: 0x55934905239e std::__invoke_impl<>() 53: 0x55934904fb62 std::__invoke<>() 53: 0x55934904db7e std::_apply_impl<>() 53: 0x55934904dbf5 std::apply<>() 53: 0x55934904dde5 util::fb2::detail::WorkerFiberImpl<>::run() 53: 0x55934904b285 util::fb2::detail::WorkerFiberImpl<>::WorkerFiberImpl<>()::{lambda()#1}::operator()() 53: I20231114 08:55:27.896639 25788 scheduler.cc:713] ------------ Fiber (active) ------------ 53: 0x55934ac1f7ce util::fb2::detail::Scheduler::ExecuteOnAllFiberStacks() 53: 0x55934ac207d1 util::fb2::detail::Scheduler::PrintAllFiberStackTraces() 53: 0x55934ac39506 util::fb2::detail::FiberInterface::PrintAllFiberStackTraces() 53: 0x559348f9c0c0 dfly::BaseFamilyTest::ResetService()::{lambda()#2}::operator()()::{lambda()#1}::operator()() 53: 0x559348fa92cc util::ProactorPool::AwaitFiberOnAll<>()::{lambda()#1}::operator()() 53: 0x559348fd3db9 std::__invoke_impl<>() 53: 0x559348fd0243 std::__invoke<>() 53: 0x559348fc9fa5 std::_apply_impl<>() 53: 0x559348fca030 std::apply<>() 53: 0x559348fca233 util::fb2::detail::WorkerFiberImpl<>::run() 53: 0x559348fc3913 util::fb2::detail::WorkerFiberImpl<>::WorkerFiberImpl<>()::{lambda()#1}::operator()() 53: 0x559348fe33f3 std::__invoke_impl<>() 53: E20231114 08:55:27.896684 25788 test_utils.cc:250] TxLocks for shard 1 53: E20231114 08:55:27.896714 25788 test_utils.cc:252] Key zset1 {SHARED: 0, EXCLUSIVE: 1} 53: I20231114 08:55:28.463668 25787 zset_family_test.cc:669] BZPOPMIN 53: ../src/server/zset_family_test.cc:675: Failure 53: Expected: (AbsDuration(dur - absl::Milliseconds(1000))) < (absl::Milliseconds(300)), actual: 2m0.008912734s vs 300ms 53: I20231114 08:55:28.480441 25705 test_utils.cc:179] Finishing BlockingTimeout 53: [ FAILED ] ZSetFamilyTest.BlockingTimeout (121061 ms)
Have you updated helio in your branch? I submitted https://github.com/romange/helio/pull/168 which should have improved the stacktrace.
ah it is the new version! so this is what does not make sense:
53: I20231114 08:55:27.805414 25787 scheduler.cc:713] ------------ Fiber (sleeping until 1653020645721 now is 1652362959190)
we detect the deadlock after 120s:
08:53:27 GMT 53: [ RUN ] ZSetFamilyTest.BlockingTimeout
Tue, 14 Nov 2023 08:55:27 GMT 53: E20231114 08:55:27.455155 25789 test_utils.cc:225] Deadlock detected!!!!
but the fiber is scheduled to wake in 657686531 ns (until - now). It does not make sense since we should wait only 1 second.
and then later:
Expected: (AbsDuration(dur - absl::Milliseconds(1000))) < (absl::Milliseconds(300)), actual: 2m0.008912734s vs 300ms
which confirms that we waited for 120s.
I assume it's no longer relevant given we're rewritten transaction