AddressSanitizer: heap-use-after-free during shutdown of Logger
Describe the bug
During shutdown of the CRT, if a thread takes longer than expected to respond, we can get a use-after-free. This can be seen when running a build with the address sanitizer enabled (-DENABLE_ADDRESS_SANITIZER=ON; see the logs below).
The issue seems to be predicted in the following comment: https://github.com/aws/aws-sdk-cpp/pull/2885/files#r1567933028
Regression Issue
- [ ] Select this option if this issue appears to be a regression.
Expected Behavior
No intermittent use-after-free errors during shutdown.
Current Behavior
Here is the output from a run in which SUMMARY: AddressSanitizer: heap-use-after-free is hit during shutdown.
Aws::Utils::Logging::s_aws_logger_redirect_get_log_level(aws_logger*, unsigned int) /path/aws-sdk-cpp/src/aws-cpp-sdk-core/source/utils/logging/CRTLogging.cpp:59:42
#1 0x7f2a17209f67 in s_destroy /path/aws-c-io/source/linux/epoll_event_loop.c:233:5
#2 0x7f2a171e87de in s_aws_event_loop_group_shutdown_sync /path/aws-c-io/source/event_loop.c:43:13
#3 0x7f2a171ebde5 in s_event_loop_destroy_async_thread_fn /path/aws-c-io/source/event_loop.c:55:5
#4 0x7f2a18a53bfb in thread_fn /path/aws-c-common/source/posix/thread.c:177:5
#5 0x7f2a1bcfdaf6 (/usr/local/lib/clang/21/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.so+0x147af6)
#6 0x7f2a19b8af6b in start_thread /usr/src/debug/glibc-2.38-150600.14.37.1.x86_64/nptl/pthread_create.c:444:8
#7 0x7f2a19c12387 in __GI___clone3 /usr/src/debug/glibc-2.38-150600.14.37.1.x86_64/misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
0x7b5a153e2120 is located 16 bytes inside of 24-byte region [0x7b5a153e2110,0x7b5a153e2128)
freed by thread T0 here:
#0 0x7f2a1bd1107d in operator delete(void*) (/usr/local/lib/clang/21/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.so+0x15b07d)
#1 0x7f2a19b3dba0 in __cxa_finalize /usr/src/debug/glibc-2.38-150600.14.37.1.x86_64/stdlib/cxa_finalize.c:82:6
previously allocated by thread T1 here:
#0 0x7f2a1bd1081d in operator new(unsigned long) (/usr/local/lib/clang/21/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.so+0x15a81d)
#1 0x7f2a18dfe25e in std::__new_allocator<std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>>::allocate(unsigned long, void const*) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/new_allocator.h:151:27
#2 0x7f2a18dfe25e in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>>>::allocate(std::allocator<std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>>&, unsigned long) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/alloc_traits.h:614:20
#3 0x7f2a18dfe25e in std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>>> std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>>>(std::allocator<std::_Sp_counted_ptr_inplace<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, (__gnu_cxx::_Lock_policy)2>>&) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/allocated_ptr.h:102:21
#4 0x7f2a18dfe25e in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, Aws::Utils::Logging::LogLevel const&>(Aws::Utils::Logging::DefaultCRTLogSystem*&, std::_Sp_alloc_shared_tag<std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>>, Aws::Utils::Logging::LogLevel const&) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/shared_ptr_base.h:967:19
#5 0x7f2a18dfe25e in std::__shared_ptr<Aws::Utils::Logging::DefaultCRTLogSystem, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, Aws::Utils::Logging::LogLevel const&>(std::_Sp_alloc_shared_tag<std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>>, Aws::Utils::Logging::LogLevel const&) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/shared_ptr_base.h:1719:14
#6 0x7f2a18dfe25e in std::shared_ptr<Aws::Utils::Logging::DefaultCRTLogSystem>::shared_ptr<std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, Aws::Utils::Logging::LogLevel const&>(std::_Sp_alloc_shared_tag<std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>>, Aws::Utils::Logging::LogLevel const&) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/shared_ptr.h:463:4
#7 0x7f2a18dfe25e in std::shared_ptr<std::enable_if<!is_array<Aws::Utils::Logging::DefaultCRTLogSystem>::value, Aws::Utils::Logging::DefaultCRTLogSystem>::type> std::allocate_shared<Aws::Utils::Logging::DefaultCRTLogSystem, std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem>, Aws::Utils::Logging::LogLevel const&>(std::allocator<Aws::Utils::Logging::DefaultCRTLogSystem> const&, Aws::Utils::Logging::LogLevel const&) /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/shared_ptr.h:990:14
#8 0x7f2a18dfe25e in std::shared_ptr<Aws::Utils::Logging::DefaultCRTLogSystem> Aws::MakeShared<Aws::Utils::Logging::DefaultCRTLogSystem, Aws::Utils::Logging::LogLevel const&>(char const*, Aws::Utils::Logging::LogLevel const&) /path/aws-sdk-cpp/src/aws-cpp-sdk-core/include/aws/core/utils/memory/stl/AWSAllocator.h:117:16
#9 0x7f2a18dfe25e in Aws::InitAPI(Aws::SDKOptions const&) /path/aws-sdk-cpp/src/aws-cpp-sdk-core/source/Aws.cpp:72:25
...
Thread T19 created by T0 here:
#0 0x7f2a1bce42f1 in pthread_create (/usr/local/lib/clang/21/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.so+0x12e2f1)
#1 0x7f2a18a52593 in aws_thread_launch /path/aws-c-common/source/posix/thread.c:352:19
#2 0x7f2a171e823e in s_aws_event_loop_group_shutdown_async /path/aws-c-io/source/event_loop.c:75:5
#3 0x7f2a18a5a96c in aws_ref_count_release /path/aws-c-common/source/ref_count.c:29:9
#4 0x7f2a171f89a4 in s_cleanup_default_resolver /path/aws-c-io/./3pp/sources/aws-c-io/source/host_resolver.c:330:5
#5 0x7f2a18a5a96c in aws_ref_count_release /path/aws-c-common/source/ref_count.c:29:9
#6 0x7f2a171d3b23 in s_client_bootstrap_destroy_impl /path/aws-c-io/source/channel_bootstrap.c:31:5
#7 0x7f2a18a5a96c in aws_ref_count_release /path/aws-c-common/source/ref_count.c:29:9
#8 0x7f2a17de36c8 in Aws::Crt::Io::ClientBootstrap::~ClientBootstrap() /path/aws-crt-cpp/source/io/Bootstrap.cpp:91:21
#9 0x7f2a1b92db2e in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/shared_ptr_base.h:345:8
#10 0x7f2a1b92db2e in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/local/bin/../lib/gcc/x86_64-linux-gnu/15.2.0/../../../../include/c++/15.2.0/bits/shared_ptr_base.h:1069:11
#11 0x7f2a19b3dba0 in __cxa_finalize /usr/src/debug/glibc-2.38-150600.14.37.1.x86_64/stdlib/cxa_finalize.c:82:6
Thread T1 created by T0 here:
#0 0x7f2a1bce42f1 in pthread_create (/usr/local/lib/clang/21/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.so+0x12e2f1)
#1 0x7f2a1b40b772 in mt_thread_create /path/mt_thread.c:38:38
...
#4 0x55e5796c45a9 in main /path/main.c:512:5
#5 0x7f2a19b24e6b in __libc_start_call_main /usr/src/debug/glibc-2.38-150600.14.37.1.x86_64/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
SUMMARY: AddressSanitizer: heap-use-after-free /path/aws-sdk-cpp/src/aws-cpp-sdk-core/source/utils/logging/CRTLogging.cpp:59:42 in Aws::Utils::Logging::s_aws_logger_redirect_get_log_level(aws_logger*, unsigned int)
Reproduction Steps
Will investigate if an existing aws-sdk-cpp test can be modified to reproduce.
Possible Solution
No response
Additional Information/Context
No response
AWS CPP SDK version used
1.11.479 and later
Compiler and Version used
clang 21
Operating System and version
Linux, Ubuntu 24.04
Thanks for bringing this up. This is on our radar and something we will address in the future, as the fix/refactor needed to resolve this race condition is not trivial. In the meantime, is it possible for you to wait for your threads to finish prior to shutting down the SDK?
Maybe I'm missing your point, but the logic is in the SDK itself. Do you mean we should make a change in the SDK to wait longer before finally shutting down CRTLogging?
If this issue is impacting you I'm assuming your application is multithreaded and sharing a single SDK instance. Unless you're initializing and shutting down the SDK inside each thread (which would severely impact performance), you likely have a main thread handling that lifecycle. My question is whether it's possible in your setup to wait for all worker threads to finish before shutting down the SDK, as a temporary workaround for this race condition. If this doesn't apply to your scenario, feel free to disregard; we will look at this issue in the future.
Hi, I have tried your suggested workaround and it works. Thanks very much for the support.
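For anyone landing here later, the workaround discussed above (join every worker thread, then shut the SDK down explicitly from the main thread) can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeSdk`, `Use`, and `RunWithJoinBeforeShutdown` are hypothetical names invented for this sketch so that it compiles without the SDK. In a real application the `Init`/`Shutdown` calls would presumably correspond to `Aws::InitAPI(options)` and `Aws::ShutdownAPI(options)`. It models application-owned threads only, not the event-loop threads the CRT creates internally.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the SDK lifecycle (not part of aws-sdk-cpp).
// In real code, Init/Shutdown would be Aws::InitAPI / Aws::ShutdownAPI.
struct FakeSdk {
    std::atomic<bool> alive{false};
    std::atomic<int> usesAfterShutdown{0};

    void Init() { alive = true; }
    void Shutdown() { alive = false; }

    // Simulates a worker touching SDK-owned state, e.g. logging through it.
    // Counts any call that arrives after Shutdown().
    void Use() {
        if (!alive) {
            ++usesAfterShutdown;
        }
    }
};

// Starts workerCount workers, waits for all of them, and only then shuts
// the fake SDK down. Returns how many uses happened after shutdown.
int RunWithJoinBeforeShutdown(int workerCount) {
    assert(workerCount >= 0);

    FakeSdk sdk;
    sdk.Init();

    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(workerCount));
    for (int i = 0; i < workerCount; ++i) {
        workers.emplace_back([&sdk] {
            for (int n = 0; n < 1000; ++n) {
                sdk.Use();
            }
        });
    }

    // The workaround: join every worker before the shutdown call, so no
    // worker can still be running when SDK-owned state is torn down.
    for (auto& worker : workers) {
        worker.join();
    }

    sdk.Shutdown();
    return sdk.usesAfterShutdown.load();
}
```

Calling `RunWithJoinBeforeShutdown(4)` from `main` returns 0 because every worker has been joined before `Shutdown()` runs. The design point is ordering: shutdown happens on the thread that owns the lifecycle, after all joins, rather than being left to static destructors at process exit.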