Non-deterministic thread-exit destruction order leads to a core dump in the LocalLifetime destructor.

Open ehds opened this issue 2 years ago • 1 comments

Compiler: clang version 14.0.6
Target: Ubuntu x86_64-pc-linux-gnu
Link libraries: libcxxabi, libcxx

The coredump stacktrace:

* thread #23, name = 'main', stop reason = signal SIGSEGV: invalid address (fault address: 0x38)
  * frame #0: 0x0000000000f1b2b3 metaserver`folly::ThreadLocalPtr<folly::SingletonThreadLocal<folly::hazptr_tc<std::__1::atomic>, folly::hazptr_tc_tls_tag, folly::detail::DefaultMake<folly::hazptr_tc<std::__1::atomic> >, folly::hazptr_tc_tls_tag>::Wrapper, folly::hazptr_tc_tls_tag, void>::get(this=0x00007fff74000bd0) const at ThreadLocal.h:153:30
    frame #1: 0x0000000000f17fe1 metaserver`folly::ThreadLocal<folly::SingletonThreadLocal<folly::hazptr_tc<std::__1::atomic>, folly::hazptr_tc_tls_tag, folly::detail::DefaultMake<folly::hazptr_tc<std::__1::atomic> >, folly::hazptr_tc_tls_tag>::Wrapper, folly::hazptr_tc_tls_tag, void>::operator*() const [inlined] folly::ThreadLocal<folly::SingletonThreadLocal<folly::hazptr_tc<std::__1::atomic>, folly::hazptr_tc_tls_tag, folly::detail::DefaultMake<folly::hazptr_tc<std::__1::atomic> >, folly::hazptr_tc_tls_tag>::Wrapper, folly::hazptr_tc_tls_tag, void>::get(this=0x00007fff74000bd0) const at ThreadLocal.h:69:27
    frame #2: 0x0000000000f17fdc metaserver`folly::ThreadLocal<folly::SingletonThreadLocal<folly::hazptr_tc<std::__1::atomic>, folly::hazptr_tc_tls_tag, folly::detail::DefaultMake<folly::hazptr_tc<std::__1::atomic> >, folly::hazptr_tc_tls_tag>::Wrapper, folly::hazptr_tc_tls_tag, void>::operator*(this=0x00007fff74000bd0) const at ThreadLocal.h:78:34
    frame #3: 0x0000000000f17d65 metaserver`folly::SingletonThreadLocal<folly::hazptr_tc<std::__1::atomic>, folly::hazptr_tc_tls_tag, folly::detail::DefaultMake<folly::hazptr_tc<std::__1::atomic> >, folly::hazptr_tc_tls_tag>::getWrapper() at SingletonThreadLocal.h:142:56
    frame #4: 0x0000000000f17d9c metaserver`folly::SingletonThreadLocal<folly::hazptr_tc<std::__1::atomic>, folly::hazptr_tc_tls_tag, folly::detail::DefaultMake<folly::hazptr_tc<std::__1::atomic> >, folly::hazptr_tc_tls_tag>::LocalLifetime::~LocalLifetime(this=0x00007fffa9ff31e0) at SingletonThreadLocal.h:115:23
    frame #5: 0x00000000029a3ca6 metaserver`__cxxabiv1::(anonymous namespace)::run_dtors((null)=0x00000000035ba7e0) at cxa_thread_atexit.cpp:78:7
    frame #6: 0x00007ffff7f755a1 libpthread.so.0`__nptl_deallocate_tsd.part.0 + 145
    frame #7: 0x00007ffff7f7662a libpthread.so.0`start_thread + 250
    frame #8: 0x00007ffff7b15133 libc.so.6`__clone + 67

Analysis:

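When a thread_local variable with a non-trivial destructor is defined, Clang with libc++abi registers that destructor through __cxa_thread_atexit. In this build, libc++abi's fallback path is used (see run_dtors in frame #5, called from __nptl_deallocate_tsd), and it runs those destructors from a pthread key created with pthread_key_create: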

https://github.com/llvm/llvm-project/blob/72777dc000ac432a99cf5f591553127432bd0365/libcxxabi/src/cxa_thread_atexit.cpp#L91

Folly also registers its StaticMetaBase cleanup function using pthread_key_create: https://github.com/facebook/folly/blob/8c52d79a616f7a89095d3121a7f02394c16c0848/folly/detail/ThreadLocalDetail.cpp#L73

However, according to the pthread_key_create description, the order of destructor calls is unspecified when more than one destructor exists for a thread at exit. The destructor of LocalLifetime can therefore be called after StaticMetaBase::onThreadExit: https://github.com/facebook/folly/blob/8c52d79a616f7a89095d3121a7f02394c16c0848/folly/SingletonThreadLocal.h#L145
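To make the ordering problem concrete, here is a minimal sketch with two pthread keys that each have a destructor, standing in for the key used by the C++ runtime and the key used by the library. All names (`demo`, `runtimeDtor`, `libraryDtor`, `observeOrder`) are hypothetical and do not come from folly or libc++abi. The sketch only records the order that was observed on one run; it makes no claim about which order a given platform picks.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <vector>

// Hypothetical names for illustration; none of them come from folly or libc++abi.
namespace demo {

// Records which destructor ran, in the order the pthread implementation chose.
// Only the worker thread writes to it; the caller reads it after pthread_join.
inline std::vector<std::string>& log() {
  static std::vector<std::string> entries;
  return entries;
}

inline void runtimeDtor(void*) { log().push_back("runtime"); }
inline void libraryDtor(void*) { log().push_back("library"); }

struct Keys {
  pthread_key_t runtime;
  pthread_key_t library;
};

inline void* worker(void* arg) {
  auto* keys = static_cast<Keys*>(arg);
  // A key's destructor only runs if the exiting thread stored a non-null value.
  pthread_setspecific(keys->runtime, keys);
  pthread_setspecific(keys->library, keys);
  return nullptr;
}

// Runs one thread that uses both keys and returns the observed destructor order.
inline std::vector<std::string> observeOrder() {
  log().clear();
  Keys keys{};
  int rc = pthread_key_create(&keys.runtime, runtimeDtor);
  assert(rc == 0);
  rc = pthread_key_create(&keys.library, libraryDtor);
  assert(rc == 0);
  (void)rc;

  pthread_t thread;
  pthread_create(&thread, nullptr, worker, &keys);
  pthread_join(thread, nullptr);  // both destructors have run once this returns

  pthread_key_delete(keys.runtime);
  pthread_key_delete(keys.library);
  return log();
}

}  // namespace demo
```

Calling `demo::observeOrder()` returns both names, but code that relies on "runtime" coming before "library" (or the reverse) depends on behaviour that pthread_key_create leaves unspecified.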

Unfortunately, by that point threadEntryList has already been cleared, yet the LocalLifetime destructor still tries to access it: https://github.com/facebook/folly/blob/8c52d79a616f7a89095d3121a7f02394c16c0848/folly/detail/ThreadLocalDetail.h#L426
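The failure pattern can be modelled without threads. The following is a simplified, single-threaded sketch with hypothetical names (`Registry`, `Lifetime`, `destroy`); it is not folly's implementation and not a proposed fix. `Registry::onThreadExit` stands in for the library cleanup and `Lifetime` stands in for a thread_local object whose destructor touches the registry. The null check exists only so the model can report the situation instead of crashing.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Simplified single-threaded model; all names are hypothetical and this is
// not folly's implementation.
namespace model {

struct Registry {
  std::unique_ptr<std::vector<int>> entries =
      std::make_unique<std::vector<int>>();

  // Stands in for the library's thread-exit cleanup.
  void onThreadExit() { entries.reset(); }
};

struct Lifetime {
  Registry* registry;
  bool* sawLiveRegistry;

  // Stands in for a thread_local object's destructor that uses the registry.
  ~Lifetime() {
    // Without this check, the push_back below would dereference a null
    // pointer whenever the cleanup has already run.
    *sawLiveRegistry = registry->entries != nullptr;
    if (*sawLiveRegistry) {
      registry->entries->push_back(1);
    }
  }
};

// Returns whether the Lifetime destructor found the registry still alive.
inline bool destroy(bool cleanupFirst) {
  Registry registry;
  bool sawLive = false;
  {
    Lifetime lifetime{&registry, &sawLive};
    if (cleanupFirst) {
      registry.onThreadExit();  // cleanup wins the race
    }
  }  // ~Lifetime runs here
  if (!cleanupFirst) {
    registry.onThreadExit();
  }
  assert(registry.entries == nullptr);
  return sawLive;
}

}  // namespace model
```

`model::destroy(false)` corresponds to the order the code expects, while `model::destroy(true)` corresponds to the order seen in the stack trace above, where the destructor runs against state that is already gone.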

ehds, Apr 10 '23 12:04

@yfeldblum PTAL

ehds, Apr 13 '23 05:04