backward-cpp icon indicating copy to clipboard operation
backward-cpp copied to clipboard

Don't try to print stack trace if we're inside malloc

Open davispuh opened this issue 6 years ago • 3 comments

If signal handler is called inside malloc we might deadlock, issue #47

Got this deadlock with jemalloc

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f3a01900c55 in __GI___pthread_mutex_lock (mutex=0x7f39fd400310) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007f39ff225327 in je_tcache_alloc_small_hard (tcache=tcache@entry=0x7f39fd018000, tbin=0x130, tbin@entry=0x7f39fd018068, binind=binind@entry=2) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/tcache.c:72
#3  0x00007f39ff207f8e in je_tcache_alloc_small (zero=false, size=2, tcache=0x7f39fd018000) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/tcache.h:303
#4  je_arena_malloc (try_tcache=true, zero=false, size=2, arena=0x0) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/arena.h:957
#5  je_imalloct (arena=0x0, try_tcache=true, size=2) at include/jemalloc/internal/jemalloc_internal.h:771
#6  je_imalloc (size=2) at include/jemalloc/internal/jemalloc_internal.h:780
#7  malloc (size=size@entry=28) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/jemalloc.c:929
#8  0x00007f3a00714a9e in operator new (sz=28) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/gcc-5.4.0/libstdc++-v3/libsupc++/new_op.cc:50
#9  0x00007f3a007677ef in __gnu_cxx::new_allocator<char>::allocate (this=<optimized out>, __n=<optimized out>) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:104
#10 std::string::_Rep::_S_create (__capacity=__capacity@entry=3, __old_capacity=__old_capacity@entry=0, __alloc=...) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:1051
#11 0x00007f3a00767a3e in std::string::_S_construct<char const*> (__beg=0x7f39fcfff010 "#19", __end=<optimized out>, __a=...)
    at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:582
#12 0x00007f3a0076aafd in std::string::_S_construct_aux<char const*> (__a=..., __end=<optimized out>, __beg=0x7f39fcfff010 "#19")
    at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4141
#13 std::string::_S_construct<char const*> (__a=..., __end=<optimized out>, __beg=0x7f39fcfff010 "#19") at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:4162
#14 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (this=0x7f39fcfff000, __s=0x7f39fcfff010 "#19", __a=...)
    at /usr/src/debug/sys-devel/gcc-5.4.0-r3/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:659
#15 0x00000000009055e8 in backward::Printer::print_trace (this=0x7f39fcfff9e0, os="Stack trace (most recent call last):\n", trace=..., colorize=...) at backward/backward.hpp:1779
#16 0x00000000009071eb in backward::Printer::print<backward::StackTrace> (this=this@entry=0x7f39fcfff9e0, st=..., os="Stack trace (most recent call last):\n") at backward/backward.hpp:1745
#17 0x0000000000b4512b in backward::SignalHandling::sig_handler (info=0x7f39fcfffbf0, _ctx=<optimized out>) at backward/backward.hpp:1965
#18 <signal handler called>
#19 je_bitmap_set (bit=18446744073709551553, binfo=0x7f39ff4309f8 <je_arena_bin_info+248>, bitmap=0x7f39fd394010) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/bitmap.h:105
#20 je_bitmap_sfu (binfo=0x7f39ff4309f8 <je_arena_bin_info+248>, bitmap=0x7f39fd394010) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/bitmap.h:140
#21 arena_run_reg_alloc (bin_info=0x7f39ff4309d0 <je_arena_bin_info+208>, run=0x7f39fd394000) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/arena.c:291
#22 je_arena_tcache_fill_small (arena=0x7f39fd4000c0, tbin=tbin@entry=0x7f39fd018068, binind=binind@entry=2, prof_accumbytes=prof_accumbytes@entry=0) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/arena.c:1479
#23 0x00007f39ff225327 in je_tcache_alloc_small_hard (tcache=tcache@entry=0x7f39fd018000, tbin=tbin@entry=0x7f39fd018068, binind=binind@entry=2) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/tcache.c:72
#24 0x00007f39ff207f8e in je_tcache_alloc_small (zero=false, size=32, tcache=0x7f39fd018000) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/tcache.h:303
#25 je_arena_malloc (try_tcache=true, zero=false, size=2, arena=0x0) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/include/jemalloc/internal/arena.h:957
#26 je_imalloct (arena=0x0, try_tcache=true, size=2) at include/jemalloc/internal/jemalloc_internal.h:771
#27 je_imalloc (size=2) at include/jemalloc/internal/jemalloc_internal.h:780
#28 malloc (size=size@entry=24) at /usr/src/debug/dev-libs/jemalloc-3.6.0/jemalloc-3.6.0/src/jemalloc.c:929
#29 0x00007f3a00714a9e in operator new (sz=sz@entry=24) at /usr/src/debug/sys-devel/gcc-5.4.0-r3/gcc-5.4.0/libstdc++-v3/libsupc++/new_op.cc:50

This PR tries to avoid that by skipping backtrace output in such case.

davispuh avatar Sep 18 '17 10:09 davispuh

I just ran into this exact issue today. One option to still have a nice backtrace (which, frankly is awesone: it changes how I develop), would be to have a static array of char in backwards.cpp or in backwards itself and use that to at least give a basic stack in this case. Admittedly that is hard since new and malloc are deeply ingrained in C++ and so avoiding new and malloc entirely can be hard (replacing the new operation for some classes might help).

Just my 2c since I ran into this (I sat around forevre waiting for my process to finish only to fire up gdb and find this stack above).

travisdowns avatar Dec 02 '17 22:12 travisdowns

Sadly, the libraries used behind are using malloc like its free ;) So we cannot easily workaround that unless we write a zero allocation dwarf decoder... or patch the libraries.

On Sat, Dec 2, 2017 at 2:28 PM travisdowns [email protected] wrote:

I just ran into this exact issue today. One option to still have a nice backtrace (which, frankly is awesone: it changes how I develop), would be to have a static array of char in backwards.cpp or in backwards itself and use that to at least give a basic stack in this case. Admittedly that is hard since new and malloc are deeply ingrained in C++ and so avoiding new and malloc entirely can be hard (replacing the new operation for some classes might help).

Just my 2c since I ran into this (I sat around forevre waiting for my process to finish only to fire up gdb and find this stack above).

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/bombela/backward-cpp/pull/74#issuecomment-348724816, or mute the thread https://github.com/notifications/unsubscribe-auth/AANMjMWmRzemqyD7gfMVuEgs487E8gLpks5s8c8egaJpZM4PauEG .

bombela avatar Dec 04 '17 21:12 bombela

@bombela - yup, that's true, although note from the stack above that the stack was successfully taken, so at least those calls went through (but that could just be random, perhaps it simply happens they do few enough allocations that malloc doesn't need to take any locks and you don't deadlock).

Probably just spitting out a message like "we handled this but are skipping the stack trace to avoid deadlock" is best. The call write is safe to call from a signal handler, so at least on Linux it can be printed to stderr. At least then the user will get a core and can use the debugger to look at the backtrace.

travisdowns avatar Dec 05 '17 04:12 travisdowns