aws-sdk-cpp
aws-sdk-cpp copied to clipboard
Memory leak if s2n_cleanup is not called
Confirm by changing [ ] to [x] below to ensure that it's a bug:
- [x] I've gone though Developer Guide and API reference
- [x] I've searched for previous similar issues and didn't find any solution
Describe the bug When upgrading from 1.8.x version to 1.9.49 we see memory leaks reported by the ASAN tool. See ASAN log below.
SDK version number 1.9.49
Platform/OS/Hardware/Device Linux Debian
To Reproduce (observed behavior) We create Aws::S3::S3Client objects on several threads. When program is terminated, several leaks are reported.
When we add a call to s2n_cleanup, https://github.com/aws/s2n-tls/blob/main/docs/USAGE-GUIDE.md#s2n_cleanup, during shutdown for each thread created there are no leaks reported.
Expected behavior No memory leaks. Do we really need to call s2n_cleanup explicitly just because the implementation creates a UUID?
Logs/output
1
malloc
/usr/local/bin/engine
2
CRYPTO_zalloc
/usr/local/bin/engine
3
s2n_defend_if_forked
/home/vagrant/jws/qix-pc3-build-ws/build/lib/crt/aws-crt-cpp/crt/s2n/utils/s2n_random.c:144:9
4
s2n_get_private_random_data
/home/vagrant/jws/qix-pc3-build-ws/build/lib/crt/aws-crt-cpp/crt/s2n/utils/s2n_random.c:175:5
5
s2n_openssl_compat_rand
/home/vagrant/jws/qix-pc3-build-ws/build/lib/crt/aws-crt-cpp/crt/s2n/utils/s2n_random.c:295:29
6
…::SecureRandomBytes_OpenSSLImpl::GetBytes
/home/vagrant/jws/qix-pc3-build-ws/build/lib/aws-cpp-sdk-core/source/utils/crypto/openssl/CryptoImpl.cpp:142:31
7
…::UUID::RandomUUID
/home/vagrant/jws/qix-pc3-build-ws/build/lib/aws-cpp-sdk-core/source/utils/UUID.cpp:79:27
8
…::STSAssumeRoleWebIdentityCredentialsProvider::STSAssumeRoleWebIdentityCredentialsProvider
/home/vagrant/jws/qix-pc3-build-ws/build/lib/aws-cpp-sdk-core/source/auth/STSCredentialsProvider.cpp:91:25
9
…::__1::__compressed_pair_elem<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:2196:46
10
…::__1::__compressed_pair<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:2289:42
11
…::__1::__shared_ptr_emplace<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:3562:12
12
…::__1::shared_ptr<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:4195:9
13
…::__1::enable_if<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:4415:12
14
…::__1::shared_ptr<…>
/home/vagrant/jws/qix-pc3-build-ws/build/lib/aws-cpp-sdk-core/include/aws/core/utils/memory/stl/AWSAllocator.h:51:16
15
…::DefaultAWSCredentialsProviderChain::DefaultAWSCredentialsProviderChain
/home/vagrant/jws/qix-pc3-build-ws/build/lib/aws-cpp-sdk-core/source/auth/AWSCredentialsProviderChain.cpp:41:17
16
…::__1::__compressed_pair_elem<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:2196:46
17
…::__1::__compressed_pair<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:2289:42
18
…::__1::__shared_ptr_emplace<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:3562:12
19
…::__1::shared_ptr<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:4195:9
20
…::__1::enable_if<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:4415:12
21
…::__1::shared_ptr<…>
/home/vagrant/jws/engine-common-ws/build/HEAD-release/../../Packages/aws-sdk-cpp/include/aws/core/utils/memory/stl/AWSAllocator.h:51:16
22
GetCredentialProvider
/home/vagrant/jws/engine-common-ws/build/HEAD-release/../../src/qaws/src/S3Client.cpp:115:12
23
…::S3Client::S3Client
/home/vagrant/jws/engine-common-ws/build/HEAD-release/../../src/qaws/src/S3Client.cpp:131:9
24
…::__1::__compressed_pair_elem<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:2196:46
25
…::__1::__compressed_pair<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:2289:42
26
…::__1::__shared_ptr_emplace<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:3562:12
27
…::__1::enable_if<…>
/usr/lib/llvm-10/bin/../include/c++/v1/memory:4400:26
28
…::S3ObjectStream::S3ObjectStream
The first frame is our own:
Client = Aws::MakeShared<Aws::S3::S3Client>(
"S3Client",
Aws::MakeShared<Aws::Auth::DefaultAWSCredentialsProviderChain>("DefaultAWSCredentialsProviderChain"),
Config,
Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::RequestDependent,
false
);
Additional context Memory leaks were not reported on several 1.8.x versions we have used.
Could you try this workaround (not the fix) to suppress a specific leak in S2N by:
export LSAN_OPTIONS=suppressions=scripts/suppressions.txt
exporting the above (or should I have the file present with some content?) doesn't change a bit.
==261254== 6,336 bytes in 24 blocks are indirectly lost in loss record 6 of 6
==261254== at 0x4842839: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==261254== by 0x4A726DD: CRYPTO_zalloc (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==261254== by 0x4A63754: EVP_CipherInit_ex (in /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
==261254== by 0x50B9251: s2n_drbg_instantiate (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/crypto/s2n_drbg.c:162)
==261254== by 0x50AE15D: s2n_defend_if_forked (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_random.c:143)
==261254== by 0x50AE210: s2n_get_private_random_data (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_random.c:175)
==261254== by 0x50AE544: s2n_openssl_compat_rand (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_random.c:295)
==261254== by 0x4F9DF69: Aws::Utils::Crypto::SecureRandomBytes_OpenSSLImpl::GetBytes(unsigned char*, unsigned long) (aws-sdk-cpp/aws-cpp-sdk-core/source/utils/crypto/openssl/CryptoImpl.cpp:142)
==261254== by 0x4F88337: Aws::Utils::UUID::RandomUUID() (aws-sdk-cpp/aws-cpp-sdk-core/source/utils/UUID.cpp:79)
and
==261254== 1,560 bytes in 20 blocks are still reachable in loss record 2 of 6
==261254== at 0x4842839: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==261254== by 0x50AD85C: s2n_mem_malloc_no_mlock_impl (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_mem.c:126)
==261254== by 0x50ACC25: s2n_realloc (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_mem.c:195)
==261254== by 0x50AC9B2: s2n_alloc (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_mem.c:157)
==261254== by 0x50AD3D5: s2n_dup (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_mem.c:240)
==261254== by 0x5104172: s2n_cipher_suites_init (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/tls/s2n_cipher_suites.c:1029)
==261254== by 0x50AB632: s2n_init (aws-sdk-cpp/crt/aws-crt-cpp/crt/s2n/utils/s2n_init.c:44)
==261254== by 0x505C657: aws_tls_init_static_state (aws-sdk-cpp/crt/aws-crt-cpp/crt/aws-c-io/source/s2n/s2n_tls_channel_handler.c:132)
==261254== by 0x5050CF2: aws_io_library_init (aws-sdk-cpp/crt/aws-crt-cpp/crt/aws-c-io/source/io.c:213)
==261254== by 0x4FDAF58: aws_mqtt_library_init (aws-sdk-cpp/crt/aws-crt-cpp/crt/aws-c-mqtt/source/mqtt.c:186)
==261254== by 0x4FBB422: Aws::Crt::s_initApi(aws_allocator*) (aws-sdk-cpp/crt/aws-crt-cpp/source/Api.cpp:39)
==261254== by 0x4FBB3E1: Aws::Crt::ApiHandle::ApiHandle(aws_allocator*) (aws-sdk-cpp/crt/aws-crt-cpp/source/Api.cpp:52)
I think this is the same issue:
https://github.com/aws/aws-sdk-cpp/issues/1757
Could you try this workaround (not the fix) to suppress a specific leak in S2N by:
export LSAN_OPTIONS=suppressions=scripts/suppressions.txt
This worked. But could you propose a fix? This memory leak is quite a show stopper for us.
Kind Reminder of this issue.
Aren't there any other people out there experience this issue?
Kind monthly reminder of this issue :)
We are facing the same issue. We make a lot of s3-crt GetObjectAsync calls, each one of which spins up a new thread (as we are using the DefaultExecutor) and adds to the memory usage.
@theShmoo how exactly did you get the export LSAN_OPTIONS
hack working? It isn't helping me. [Update: Never mind. Just realized that this is just an option on the leak sanitizer to suppress warnings about this leak]
Hoi! Nice to hear that we are not alone!
We use LSAN as part of ASAN and if you use a suppression list you need to configure ASAN to recover from failures:
-fsanitize-recover=address -fsanitize-address-use-after-scope
these are the compile flags we use in addition to -fsanitize=address
Certainly you are not alone. Suppressing the warnings does not feel so right :| We end up using thread_local
to call s2n_cleanup();
whenever a thread is terminated. Now valgrind
does not complain any more. We are happy again.
File: S2nCleanup.h
#include <s2n.h>
class S2nCleanup {
public:
~S2nCleanup() {
s2n_cleanup();
}
};
Any where we use s3Client
.
thread_local S2nCleanup s2nCleanup{};
...
Aws::S3::Model::GetObjectOutcome outcome = s3Client->GetObject(request);
...
Nice! Thanks for your tipp with the thread_local variable. This also works for my use case.
Maybe it will help others as well:
I use s3 in combination with https://github.com/yhirose/cpp-httplib
to get your fix working I need to create a custom thread pool and add the thread_local s2n clean up:
proof of concept:
struct thread_cleanup
{
~thread_cleanup()
{
s2n_cleanup();
}
};
class task_queue : public httplib::TaskQueue
{
public:
explicit task_queue(size_t n) : m_pool(n) {}
void enqueue(std::function<void()> fn) override
{
m_pool.enqueue(
[f = std::move(fn)]
{
f();
thread_local thread_cleanup cleanup{};
});
}
void shutdown() override
{
m_pool.shutdown();
}
private:
httplib::ThreadPool m_pool;
};
// when configuring the server:
m_server->new_task_queue = []
{
// NOLINTNEXTLINE
return new task_queue(12);
};
This PR fixed a memory leak. Please let me know if you are still seeing any more leaks with the current version of this sdk.
Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.