test failed replication_test.py::test_take_over_counters
investigate https://github.com/dragonflydb/dragonfly/actions/runs/10993785520/job/30521012937
@adiholden do not assign this to @chakaz. It happens during shutdown and I think it's related to the one I am fixing
Whoops, it was a hard crash. I did not see the traces.
It's been a while since we opened this, perhaps it was already resolved. Let's optimistically close it, and reopen if needed.
@chakaz I am pretty sure it was not, I saw it again two weeks ago.
Indeed, also just now: https://github.com/dragonflydb/dragonfly/actions/runs/11763776680/job/32768184396#step:6:1067
Managed to reproduce it. Got this stack trace:
#0 __pthread_kill_implementation (threadid=279096101982240, signo=signo@entry=6, no_tid=no_tid@entry=0)
at ./nptl/pthread_kill.c:44
#1 0x0000fdd61f4c7690 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x0000fdd61f47cb3c in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#3 0x0000b8c7e19c3730 in absl::lts_20240116::RaiseToDefaultHandler (signo=6)
at /home/dev/projects/dragonfly/build-dbg/_deps/abseil_cpp-src/absl/debugging/failure_signal_handler.cc:77
#4 0x0000b8c7e19c4268 in absl::lts_20240116::AbslFailureSignalHandler (signo=6, ucontext=0xfdd61fe8ee20)
at /home/dev/projects/dragonfly/build-dbg/_deps/abseil_cpp-src/absl/debugging/failure_signal_handler.cc:393
#5 <signal handler called>
#6 __pthread_kill_implementation (threadid=279096101982240, signo=signo@entry=6, no_tid=no_tid@entry=0)
at ./nptl/pthread_kill.c:44
#7 0x0000fdd61f4c7690 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#8 0x0000fdd61f47cb3c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#9 0x0000fdd61f467e00 in __GI_abort () at ./stdlib/abort.c:79
#10 0x0000b8c7e19357f4 in glog_internal_namespace_::Mutex::Lock (this=0xb8c7e266fdd0 <google::log_mutex>)
at /home/dev/projects/dragonfly/build-dbg/_deps/glog-src/src/base/mutex.h:272
#11 0x0000b8c7e19358c4 in glog_internal_namespace_::MutexLock::MutexLock (this=0xffffeb8061c0,
mu=0xb8c7e266fdd0 <google::log_mutex>)
at /home/dev/projects/dragonfly/build-dbg/_deps/glog-src/src/base/mutex.h:290
#12 0x0000b8c7e1930b28 in google::LogMessage::Flush (this=0xffffeb806268)
at /home/dev/projects/dragonfly/build-dbg/_deps/glog-src/src/logging.cc:1779
#13 0x0000b8c7e19308d8 in google::LogMessage::~LogMessage (this=0xffffeb806268, __in_chrg=<optimized out>)
at /home/dev/projects/dragonfly/build-dbg/_deps/glog-src/src/logging.cc:1727
#14 0x0000b8c7e1821c0c in util::fb2::(anonymous namespace)::SigAction (signal=15)
at /home/dev/projects/dragonfly/helio/util/fibers/proactor_base.cc:53
#15 <signal handler called>
#16 0x0000fdd61f526fcc in __GI_madvise () at ../sysdeps/unix/syscall-template.S:120
#17 0x0000b8c7e177d808 in unix_madvise (advice=4, size=size@entry=281474632807847, addr=addr@entry=0x568f8170000)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/prim/unix/prim.c:188
#18 _mi_prim_decommit (start=start@entry=0x568f8170000, size=size@entry=32047104,
needs_recommit=needs_recommit@entry=0xffffeb8075a7)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/prim/unix/prim.c:401
#19 0x0000b8c7e1772f14 in mi_os_decommit_ex (addr=addr@entry=0x568f8170000, size=size@entry=32047104,
needs_recommit=needs_recommit@entry=0xffffeb8075a7, tld_stats=<optimized out>)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/os.c:426
#20 0x0000b8c7e1774548 in _mi_os_purge_ex (stats=<optimized out>, allow_reset=true, size=32047104, p=0x568f8170000)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/os.c:476
#21 _mi_os_purge_ex (stats=0xb8c7e26252c8 <tld_main+968>, allow_reset=true, size=32047104, p=0x568f8170000)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/os.c:466
#22 _mi_os_purge (p=0x568f8170000, size=size@entry=32047104, stats=stats@entry=0xb8c7e26252c8 <tld_main+968>)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/os.c:490
#23 0x0000b8c7e17789c8 in mi_segment_purge (segment=segment@entry=0x568f8000000, p=p@entry=0x568f8170000 "",
size=size@entry=32047104, stats=stats@entry=0xb8c7e26252c8 <tld_main+968>)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/segment.c:522
#24 0x0000b8c7e1779b50 in mi_segment_purge (stats=0xb8c7e26252c8 <tld_main+968>, size=32047104, p=0x568f8170000 "",
segment=0x568f8000000) at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/segment.c:509
#25 mi_segment_try_purge (segment=segment@entry=0x568f8000000, stats=0xb8c7e26252c8 <tld_main+968>, force=true)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/segment.c:592
#26 0x0000b8c7e177b1e8 in mi_segment_try_purge (stats=<optimized out>, force=true, segment=0x568f8000000)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/segment.c:603
#27 0x0000b8c7e176f74c in mi_heap_page_collect (arg2=0x0, arg_collect=<synthetic pointer>, page=0x568f8000948,
pq=0xb8c7e2624888 <_mi_heap_main+1408>, heap=0xb8c7e2624308 <_mi_heap_main>)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/heap.c:101
#28 mi_heap_visit_pages (arg2=0x0, arg1=<synthetic pointer>, fn=<optimized out>,
heap=0xb8c7e2624308 <_mi_heap_main>)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/heap.c:46
#29 mi_heap_collect_ex (heap=0xb8c7e2624308 <_mi_heap_main>, collect=collect@entry=MI_FORCE)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/heap.c:159
#30 0x0000b8c7e176f8a8 in mi_heap_collect (force=force@entry=true, heap=<optimized out>)
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/heap.c:180
#31 0x0000b8c7e17703f0 in mi_process_done ()
at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/init.c:638
#32 mi_process_done () at /home/dev/projects/dragonfly/build-dbg/third_party/mimalloc/src/init.c:622
#33 0x0000fdd61f47f228 in __run_exit_handlers (status=0, listp=0xfdd61f5f0670 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:108
#34 0x0000fdd61f47f30c in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:138
#35 0x0000fdd61f4684c8 in __libc_start_call_main (main=main@entry=0xb8c7e0d0e404 <main(int, char**)>,
argc=argc@entry=12, argv=argv@entry=0xffffeb8079d8) at ../sysdeps/nptl/libc_start_call_main.h:74
#36 0x0000fdd61f468598 in __libc_start_main_impl (main=0xb8c7e0d0e404 <main(int, char**)>, argc=12,
argv=0xffffeb8079d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=<optimized out>) at ../csu/libc-start.c:360
#37 0x0000b8c7e0d094b0 in _start ()
Opened https://github.com/romange/helio/pull/343/files though this is not the root cause.
https://github.com/dragonflydb/dragonfly/actions/runs/11977784310/job/33396431268#step:6:952