
Coredump during beam shutdown

Open itssundeep opened this issue 2 years ago • 10 comments

Describe the bug We are seeing a coredump during beam shutdown. We are hitting an assert in pthread_mutex_lock.

To Reproduce I do not have a way to reproduce this issue consistently, but I am opening the bug to see if you have a theory based on the core file.

Expected behavior No coredump during beam shutdown.

Affected versions We are seeing this issue with OTP 25 RC3.

Additional context Here is the backtrace from the coredump.

(gdb) bt
#0  __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:45
#1  __GI___pthread_kill (threadid=<optimized out>, signo=6) at pthread_kill.c:62
#2  0x00007fea8e55750d in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#3  0x00007fea8e53f433 in __GI_abort () at abort.c:79
#4  0x00007fea8e54ec48 in __assert_fail_base (fmt=0x7fea8e6f51d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7fea8e6f15ea "mutex->__data.__owner == 0", file=0x7fea8e6f15bf "pthread_mutex_lock.c", line=90, function=<optimized out>)
    at assert.c:92
#5  0x00007fea8e54ecb3 in __GI___assert_fail (assertion=0x7fea8e6f15ea "mutex->__data.__owner == 0", file=0x7fea8e6f15bf "pthread_mutex_lock.c", line=90, function=0x7fea8e6f9880 <__PRETTY_FUNCTION__.0> "___pthread_mutex_lock") at assert.c:101
#6  0x00007fea8e5b111c in ___pthread_mutex_lock (mutex=<optimized out>, mutex@entry=0x7fe027d18190) at pthread_mutex_lock.c:90
#7  0x000000000082ca61 in event_wait (mtxb=mtxb@entry=0x7fe027d18188, tse=tse@entry=0x7fe068310540, spincount=0, type=type@entry=1073741824, is_freq_read=is_freq_read@entry=0, is_rwmtx=1) at common/ethr_mutex.c:429
#8  0x000000000082e397 in write_lock_wait (is_rwmtx=1, is_freq_read=0, initial=<optimized out>, mtxb=0x7fe027d18188) at common/ethr_mutex.c:710
#9  rwmutex_normal_rwlock_wait (initial=<optimized out>, rwmtx=0x7fe027d18188) at common/ethr_mutex.c:2313
#10 ethr_rwmutex_rwlock (rwmtx=rwmtx@entry=0x7fe027d18188) at common/ethr_mutex.c:3033
#11 0x000000000061787e in erts_rwmtx_rwlock (rwmtx=0x7fe027d18188) at beam/erl_threads.h:2049
#12 erts_de_rwlock (dep=0x7fe027d18168) at beam/erl_node_tables.h:355
#13 abort_pending_connection (was_connected_p=0x0, conn_id=695150728, dep=0x7fe027d18168) at beam/dist.c:5307
#14 erts_do_net_exits (dep=<optimized out>, reason=<optimized out>) at beam/dist.c:940
#15 0x000000000046c4f7 in erts_continue_exit_process (p=0x17c36e8) at beam/erl_process.c:13975
#16 0x00007fe016cd9fe0 in ?? ()
#17 0x0000000000000000 in ?? ()

And, as the assert indicates, the owner is not 0.

(gdb) p *rwmtx
$6 = {rwmtx = {mtxb = {flgs = {counter = 236638704}, aux_scnt = 32736, main_scnt = 0, qlck = {__data = {__lock = 1, __count = 0, __owner = 5, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x7fe027f9ff18, __next = 0x1fe55}},
        __size = "\001\000\000\000\000\000\000\000\005", '\000' <repeats 15 times>, "\030\377\371'\340\177\000\000U\376\001\000\000\000\000", __align = 1}, q = 0x7fe027d18000}, type = ETHR_RWMUTEX_TYPE_NORMAL, rq_end = 0x0, tdata = {ra = 0x3, rs = 3}}}

Is this the right owner thread?

Thread 5 (LWP 1033):
#0  futex_wait (private=0, expected=2, futex_word=0x976760 <drv_ev_state+2176>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=0x976760 <drv_ev_state+2176>, private=0) at lowlevellock.c:50
#2  0x00007fea8e5b1051 in lll_mutex_lock_optimized (mutex=0x976760 <drv_ev_state+2176>) at pthread_mutex_lock.c:49
#3  ___pthread_mutex_lock (mutex=mutex@entry=0x976760 <drv_ev_state+2176>) at pthread_mutex_lock.c:89
#4  0x00000000007a803b in ethr_mutex_lock (mtx=0x976760 <drv_ev_state+2176>) at ../include/internal/ethr_mutex.h:655
#5  erts_mtx_lock (mtx=0x976760 <drv_ev_state+2176>) at beam/erl_threads.h:1733
#6  driver_select (ix=0x7fe1395a50f0, e=82594, mode=mode@entry=4, on=on@entry=0) at sys/common/erl_check_io.c:840
#7  0x00000000007c0d53 in desc_close (desc=0x7fe13952dd20) at drivers/common/inet_drv.c:4764
#8  erl_inet_close (desc=desc@entry=0x7fe13952dd20) at drivers/common/inet_drv.c:4802
#9  0x00000000007c1884 in inet_stop (desc=0x7fe13952dd20) at drivers/common/inet_drv.c:10185
#10 tcp_inet_stop (e=0x7fe13952dd20) at drivers/common/inet_drv.c:10185
#11 0x00000000005b3be7 in terminate_port (prt=prt@entry=0x7fe1395a50f0) at beam/io.c:3736
#12 0x00000000005b90ac in flush_port (p=<optimized out>) at beam/io.c:3682
#13 erts_deliver_port_exit (prt=prt@entry=0x7fe1395a50f0, from=from@entry=659638744035303, reason=<optimized out>, send_closed=send_closed@entry=0, drop_normal=drop_normal@entry=0) at beam/io.c:3931
#14 0x00000000005b983e in call_deliver_port_exit (broken_link=<optimized out>, reason=<optimized out>, state=<optimized out>, prt=0x7fe1395a50f0, from=659638744035303, bang_op=0) at beam/io.c:2178
#15 port_sig_exit (prt=0x7fe1395a50f0, state=<optimized out>, op=<optimized out>, sigdp=0x7fe06f8b7dc0) at beam/io.c:2202
#16 0x00000000005e9364 in erts_port_task_execute (runq=runq@entry=0x7fe06838a180, curr_port_pp=curr_port_pp@entry=0x7fe069e6efe0) at beam/erl_port_task.c:1796
#17 0x0000000000467bc4 in erts_schedule (esdp=<optimized out>, p=<optimized out>, calls=<optimized out>) at beam/erl_process.c:9788
#18 0x00007fe016cd9848 in ?? ()
#19 0x0000000000000000 in ?? ()

A lot of threads in the core are cleaning up dist connections.

Thread 11 (LWP 1020):
#0  0x00007fea8e64049a in epoll_ctl () at ../sysdeps/unix/syscall-template.S:120
#1  0x00000000007a345b in concurrent_update_pollset (events=0, op=ERTS_POLL_OP_DEL, fd=25880, ps=0x7fe0275fddc0) at sys/common/erl_poll.c:803
#2  poll_control (do_wake=<optimized out>, events=<optimized out>, op=<optimized out>, fd=25880, ps=0x7fe0275fddc0) at sys/common/erl_poll.c:1395
#3  erts_poll_control (ps=0x7fe0275fddc0, fd=fd@entry=25880, op=op@entry=ERTS_POLL_OP_DEL, events=events@entry=0, do_wake=do_wake@entry=0x7fe0149891cc) at sys/common/erl_poll.c:1441
#4  0x00000000007a853b in erts_io_control_wakeup (wake_poller=0x7fe0149891cc, pe=<optimized out>, op=ERTS_POLL_OP_DEL, state=0x7fe229871ef8) at sys/common/erl_check_io.c:444
#5  driver_select (ix=<optimized out>, e=25880, mode=7, mode@entry=4, on=on@entry=0) at sys/common/erl_check_io.c:954
#6  0x00000000007c0d53 in desc_close (desc=0x7fe083113488) at drivers/common/inet_drv.c:4764
#7  erl_inet_close (desc=desc@entry=0x7fe083113488) at drivers/common/inet_drv.c:4802
#8  0x00000000007c1884 in inet_stop (desc=0x7fe083113488) at drivers/common/inet_drv.c:10185
#9  tcp_inet_stop (e=0x7fe083113488) at drivers/common/inet_drv.c:10185
#10 0x00000000005b3be7 in terminate_port (prt=prt@entry=0x7fe083064788) at beam/io.c:3736
#11 0x00000000005b90ac in flush_port (p=<optimized out>) at beam/io.c:3682
#12 erts_deliver_port_exit (prt=prt@entry=0x7fe083064788, from=from@entry=686100088938503, reason=<optimized out>, send_closed=send_closed@entry=0, drop_normal=drop_normal@entry=0) at beam/io.c:3931
#13 0x00000000005b983e in call_deliver_port_exit (broken_link=<optimized out>, reason=<optimized out>, state=<optimized out>, prt=0x7fe083064788, from=686100088938503, bang_op=0) at beam/io.c:2178
#14 port_sig_exit (prt=0x7fe083064788, state=<optimized out>, op=<optimized out>, sigdp=0x7fe06f8fcf88) at beam/io.c:2202
#15 0x00000000005e9364 in erts_port_task_execute (runq=runq@entry=0x7fe068388ac0, curr_port_pp=curr_port_pp@entry=0x7fe069dea8e0) at beam/erl_port_task.c:1796
#16 0x0000000000467bc4 in erts_schedule (esdp=<optimized out>, p=<optimized out>, calls=<optimized out>) at beam/erl_process.c:9788
#17 0x00007fe016cd9848 in ?? ()
#18 0x0000000000000000 in ?? ()

Unfortunately I cannot share the core file, but can run any commands that are needed.

itssundeep avatar Jun 07 '22 16:06 itssundeep

In "Thread 5 (LWP 1033)", the 5 is just a number assigned by gdb (I think). The 1033 is the OS thread identifier that should be stored as __data.__owner in pthread_mutex_t. Is there a thread with LWP 5 in the core dump? If not, then that 5 looks like some sort of memory corruption.

sverker avatar Jun 08 '22 19:06 sverker

It would be interesting to see the content of the DistEntry with the failing mutex:

(gdb) frame 12
(gdb) print *dep

and also the binary containing it:

(gdb) print  *(ErtsMagicBinary*)((char*)dep - (size_t)((ErtsMagicBinary*)0)->u.aligned.data)

sverker avatar Jun 08 '22 20:06 sverker

(gdb) frame 12
(gdb) p *dep
$1 = {hash_bucket = {next = 0x1fe9d, hvalue = 140600717443072}, next = 0x7fe07732a100, prev = 0x0, rwmtx = {rwmtx = {mtxb = {flgs = {counter = 236638704}, aux_scnt = 32736, main_scnt = 0, qlck = {__data = {__lock = 1, __count = 0, __owner = 5,
            __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x7fe027f9ff18, __next = 0x1fe55}},
          __size = "\001\000\000\000\000\000\000\000\005", '\000' <repeats 15 times>, "\030\377\371'\340\177\000\000U\376\001\000\000\000\000", __align = 1}, q = 0x7fe027d18000}, type = ETHR_RWMUTEX_TYPE_NORMAL, rq_end = 0x0, tdata = {ra = 0x3,
        rs = 3}}}, sysname = 140600744546368, creation = 3, input_handler = {counter = 140600721887760}, cid = 3, connection_id = 695150728, state = 32736, pending_nodedown = 3, suspended_nodeup = 0x7fe0296f21c8, dflags = 3, opts = 695144368,
  mld = 0x3, qlock = {mtx = {pt_mtx = {__data = {__lock = 695138128, __count = 32736, __owner = 3, __nusers = 0, __kind = 670531088, __spins = 32736, __elision = 0, __list = {__prev = 0x3, __next = 0x7fe0296f2f48}},
        __size = "P\367n)\340\177\000\000\003\000\000\000\000\000\000\000\020~\367'\340\177\000\000\003\000\000\000\000\000\000\000H/o)\340\177\000", __align = 140600744539984}}}, qflgs = {counter = 6}, qsize = {counter = 140600720033648}, in = {
    counter = 3}, out = {counter = 140600744550208}, out_queue = {first = 0x3, last = 0x7fe027f6a510}, suspended = 0x3, tmp_out_queue = {first = 0x7fe02819e3b8, last = 0x3}, finalized_out_queue = {first = 0x7fe02819e2d8, last = 0x3},
  dist_cmd_scheduled = {counter = 140600720151888}, dist_cmd = {counter = 3}, send = 0x7fe0296f1e68, cache = 0x3, later_op = {later = 140600744536808, func = 0x3, data = 0x7fe0296eefa0, next = 0x21}, sequences = 0x7fe027f98a20}
(gdb)

and also the binary containing it:

(gdb) print  *(ErtsMagicBinary*)((char*)dep - (size_t)((ErtsMagicBinary*)0)->u.aligned.data)
$2 = {intern = {flags = 0, refc = {counter = 0}}, orig_size = 0, destructor = 0x0, refn = {0, 0, 0}, alloc_type = 0, u = {aligned = {data = "\235"}, unaligned = {data = "\235"}}}
(gdb)

itssundeep avatar Jun 09 '22 07:06 itssundeep

What kind of Erlang node distribution are you running? Default TCP, TLS or something else?

sverker avatar Jun 09 '22 17:06 sverker

What kind of Erlang node distribution are you running? Default TCP, TLS or something else?

We use kernel TLS offload through a customized proto_dist (we did not change anything in OTP). The protocol is basically a copy of inet_tcp_dist with a few lines of changes; it uses the inet_drv port as the dist control, just with some extra inet:setopts calls on the port.

The extra inet:setopts calls are like:

inet:setopts(Port, [{raw, 6, 31, <<"tls">>}]),
inet:setopts(Port, [{raw, 282, 1, <<4,3,52,0:424>>}]),
inet:setopts(Port, [{raw, 282, 2, <<4,3,52,0:424>>}]),

zzydxm avatar Jun 09 '22 17:06 zzydxm

Another coredump that might be related to this, happening during GC:

(gdb) bt
#0  0x0000000000682f4a in ethr_native_atomic64_add_return_mb (incr=-1, var=0x1d) at ../include/internal/x86_64/../i386/atomic.h:240
#1  ethr_atomic_add_read (val=-1, var=0x1d) at ../include/internal/ethr_atomics.h:4219
#2  ethr_atomic_dec_read (var=0x1d) at ../include/internal/ethr_atomics.h:4806
#3  erts_refc_dectest (min_val=0, refcp=0x1d) at beam/sys.h:1029
#4  erts_deref_node_entry (term=140250483628434, np=0xd) at beam/erl_node_tables.h:334
#5  sweep_off_heap (p=p@entry=0x7f8da1341060, fullsweep=fullsweep@entry=0) at beam/erl_gc.c:3048
#6  0x0000000000688ac8 in do_minor (p=p@entry=0x7f8da1341060, live_hf_end=live_hf_end@entry=0x7f8d439e4148, mature=mature@entry=0x7f8e9c418960 "\300", mature_size=mature_size@entry=1848, new_sz=10958, objv=objv@entry=0x7f8d177e2c38, 
    nobj=1) at beam/erl_gc.c:1750
#7  0x000000000068d0ba in minor_collection (recl=<synthetic pointer>, ygen_usage=6949, nobj=1, objv=0x7f8d177e2c38, need=0, live_hf_end=0x7f8d439e4148, p=0x7f8da1341060) at beam/erl_gc.c:1434
#8  garbage_collect (p=p@entry=0x7f8da1341060, live_hf_end=live_hf_end@entry=0x7f8d439e4148, need=need@entry=0, objv=objv@entry=0x7f8d177e2c38, nobj=nobj@entry=1, fcalls=3621, max_young_gen_usage=0) at beam/erl_gc.c:746
#9  0x000000000068e751 in erts_gc_after_bif_call_lhf (arity=1, regs=0x7f8d177e2d00, result=140244701522290, live_hf_end=0x7f8d439e4148, p=0x7f8da1341060) at beam/erl_gc.c:448
#10 erts_gc_after_bif_call_lhf (p=0x7f8da1341060, live_hf_end=<optimized out>, result=140244701522290, regs=0x7f8d177e2d00, arity=1) at beam/erl_gc.c:414
#11 0x00007f8d3526c60b in ?? ()
#12 0x0000000000000000 in ?? ()
(gdb) frame 5
#5  sweep_off_heap (p=p@entry=0x7f8da1341060,
    fullsweep=fullsweep@entry=0) at beam/erl_gc.c:3048
3048	beam/erl_gc.c: No such file or directory.
(gdb) print ptr
$1 = (struct erl_off_heap_header *) 0x7f8e9c419990
(gdb) print ptr->thing_word
$2 = 352
(gdb) p ((ExternalThing*)ptr)->node
$3 = (struct erl_node_ *) 0x1de5

0x1de5 does not look like a valid erl_node_* pointer.

zzydxm avatar Jun 09 '22 17:06 zzydxm

What OTP version is this?

sverker avatar Jun 09 '22 17:06 sverker

What OTP version is this?

also OTP 25 RC3, should be ee8872b0f2528985306e8b2299920eb445a0c30f

zzydxm avatar Jun 09 '22 17:06 zzydxm

I recommend

  • Run a released version. OTP 25.0.1 was released just today; maybe these are bugs that have already been fixed.
  • Try running a debug-compiled emulator to hopefully get earlier crashes with nicer core dumps.

sverker avatar Jun 09 '22 18:06 sverker

I also experienced a segfault that may be related to this issue. The coredump shows the following backtrace:

#0  0x000000000061786f in erts_do_net_exits (dep=<optimized out>, reason=<optimized out>) at beam/dist.c:940
#1  0x000000000046c4f7 in erts_continue_exit_process (p=0x16f4c68) at beam/erl_process.c:13975
#2  0x00007fc6f0eccfe0 in erts_beamasm:continue_exit/0 ()
#3  0x00007fc6e837ada8 in proc_lib:exit_p/3 () at proc_lib.erl:261
#4  0x00007fc6f0eccf38 in erts_beamasm:normal_exit/0-CodeInfoPrologue ()

I confirmed that pending[i] was NULL in the call abort_pending_connection(pending[i], pending[i]->connection_id, NULL), which caused a segfault at https://github.com/erlang/otp/blob/09c601fa2183d4c545791ebcd68f869a5ab912a4/erts/emulator/beam/dist.c#L956

d1jang avatar Jul 22 '22 23:07 d1jang

I've looked at the segfault above with pending[i] == NULL. I don't see how that can happen without some wild memory corruption bug.

Another way to detect memory corruption bugs is to instrument the emulator with AddressSanitizer. Here are some newly written instructions if you want to give it a try: https://erlang.github.io/prs/6121/doc/tutorial/debugging.html#address-sanitizer

sverker avatar Aug 16 '22 18:08 sverker

This should be related to https://github.com/erlang/otp/issues/6247

zzydxm avatar Aug 26 '22 18:08 zzydxm

I'm closing this as it looks similar to #6247 fixed by #6258 in OTP 25.1.

sverker avatar Sep 26 '22 10:09 sverker