aws-sdk-cpp
aws-sdk-cpp copied to clipboard
Shutdown API Segfault and Valgrind Issues: Aws::Utils::Crypto::CleanupCrypto() + CleanupHttp()
- [x] I've gone though Developer Guide and API reference
- [x] I've searched for previous similar issues and didn't find any solution
Describe the bug SDK Shutdown segfaults and has valgrind memory issues, apparently around CleanupHttp / CleanupCrypto. This issue only seems to happen if a static pointer to a class which inits and cleans up the SDK is used. Curious why this would be the case.
SDK version number 1.8.113
Platform/OS/Hardware/Device Fedora 33, (GCC) 10.2.1 20201125
To Reproduce (observed behavior)
#include <memory>
#include <aws/core/Aws.h>
class AwsLifetime {
public:
AwsLifetime() {
Aws::InitAPI(options);
}
~AwsLifetime() {
Aws::ShutdownAPI(options);
}
private:
Aws::SDKOptions options;
};
static std::unique_ptr<AwsLifetime> aws;
int main() {
aws = std::make_unique<AwsLifetime>();
return 0;
}
$ g++ -std=c++17 -Wall -Werror -O2 -laws-cpp-sdk-core -o test test.cpp
$ valgrind --tool=memcheck --leak-check=full ./test
Expected behavior Runs, clean valgrind output. With SDK version 1.7.323, (GCC) 10.1.1 20200507 Fedora 32
==3600== Memcheck, a memory error detector
==3600== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3600== Using Valgrind-3.16.0 and LibVEX; rerun with -h for copyright info
==3600== Command: ./test
==3600==
==3600==
==3600== HEAP SUMMARY:
==3600== in use at exit: 0 bytes in 0 blocks
==3600== total heap usage: 4,050 allocs, 4,050 frees, 237,867 bytes allocated
==3600==
==3600== All heap blocks were freed -- no leaks are possible
==3600==
==3600== For lists of detected and suppressed errors, rerun with: -s
==3600== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Logs/output On Fedora 33 with versions described above First invalid read:
==591656== Memcheck, a memory error detector
==591656== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==591656== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==591656== Command: ./test
==591656==
==591656== Invalid read of size 8
==591656== at 0x48C8809: Aws::Http::CleanupHttp() (in /usr/lib64/libaws-cpp-sdk-core.so)
==591656== by 0x490A27C: Aws::ShutdownAPI(Aws::SDKOptions const&) (in /usr/lib64/libaws-cpp-sdk-core.so)
==591656== by 0x4014A7: std::unique_ptr<AwsLifetime, std::default_delete<AwsLifetime> >::~unique_ptr() (in /home/fedora/test)
==591656== by 0x4CF6236: __run_exit_handlers (in /usr/lib64/libc-2.32.so)
==591656== by 0x4CF63DF: exit (in /usr/lib64/libc-2.32.so)
==591656== by 0x4CDE1E8: (below main) (in /usr/lib64/libc-2.32.so)
==591656== Address 0x5a87020 is 16 bytes inside a block of size 24 free'd
==591656== at 0x483A9F5: free (vg_replace_malloc.c:538)
==591656== by 0x4CF6236: __run_exit_handlers (in /usr/lib64/libc-2.32.so)
==591656== by 0x4CF63DF: exit (in /usr/lib64/libc-2.32.so)
==591656== by 0x4CDE1E8: (below main) (in /usr/lib64/libc-2.32.so)
==591656== Block was alloc'd at
==591656== at 0x4839809: malloc (vg_replace_malloc.c:307)
==591656== by 0x48C8730: Aws::Http::InitHttp() (in /usr/lib64/libaws-cpp-sdk-core.so)
==591656== by 0x49304F3: Aws::InitAPI(Aws::SDKOptions const&) (in /usr/lib64/libaws-cpp-sdk-core.so)
==591656== by 0x4011A7: main (in /home/fedora/test)
==591656==
Additional context With Logging on simply running the program
[~]$ ./test
[DEBUG] 2020-12-30 16:01:19.044 FileSystemUtils [139993724766464] Environment value for variable HOME is /home/fedora
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Home directory is missing the final / appending one to normalize
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Final Home Directory is /home/fedora/
[INFO] 2020-12-30 16:01:19.045 Aws::Config::AWSConfigFileProfileConfigLoader [139993724766464] Initializing config loader against fileName /home/fedora/.aws/credentials and using profilePrefix = 0
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Environment value for variable HOME is /home/fedora
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Home directory is missing the final / appending one to normalize
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Final Home Directory is /home/fedora/
[INFO] 2020-12-30 16:01:19.045 Aws::Config::AWSConfigFileProfileConfigLoader [139993724766464] Initializing config loader against fileName /home/fedora/.aws/config and using profilePrefix = 1
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Environment value for variable HOME is /home/fedora
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Home directory is missing the final / appending one to normalize
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Final Home Directory is /home/fedora/
[INFO] 2020-12-30 16:01:19.045 Aws::Config::AWSConfigFileProfileConfigLoader [139993724766464] Unable to open config file /home/fedora/.aws/credentials for reading.
[INFO] 2020-12-30 16:01:19.045 Aws::Config::AWSProfileConfigLoader [139993724766464] Failed to reload configuration.
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Environment value for variable HOME is /home/fedora
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Home directory is missing the final / appending one to normalize
[DEBUG] 2020-12-30 16:01:19.045 FileSystemUtils [139993724766464] Final Home Directory is /home/fedora/
[INFO] 2020-12-30 16:01:19.045 Aws::Config::AWSConfigFileProfileConfigLoader [139993724766464] Unable to open config file /home/fedora/.aws/config for reading.
[INFO] 2020-12-30 16:01:19.045 Aws::Config::AWSProfileConfigLoader [139993724766464] Failed to reload configuration.
[INFO] 2020-12-30 16:01:19.046 CurlHttpClient [139993724766464] Initializing Curl library with version: 7.71.1, ssl version: OpenSSL/1.1.1i-fips
[DEBUG] 2020-12-30 16:01:19.046 ClientConfiguration [139993724766464] ClientConfiguration will use SDK Auto Resolved profile: [default] if not specified by users.
[WARN] 2020-12-30 16:01:19.046 ClientConfiguration [139993724766464] Retry Strategy will use the default max attempts.
[INFO] 2020-12-30 16:01:19.046 EC2MetadataClient [139993724766464] Creating AWSHttpResourceClient with max connections 2 and scheme http
[INFO] 2020-12-30 16:01:19.046 CurlHandleContainer [139993724766464] Initializing CurlHandleContainer with size 2
[INFO] 2020-12-30 16:01:19.046 CurlHandleContainer [139993724766464] Cleaning up CurlHandleContainer.
Segmentation fault (core dumped)
Any insights would be appreciated
hi @afowles , Thanks for pointing this out to us! As a quick update, I'm able to reproduce this and started investigating on the root cause for these valgrind errors.
Any updates? We are hitting the same issue after aws-sdk update.
The issue seems to have appeared between 1.7.311 and 1.8.159 releases
Hello, I am using 1.7.108 and am getting the same issue. Here is what I am seeing in valgrind.
==426== Invalid read of size 8
==426== at 0x694C10E: Aws::Http::CleanupHttp() (HttpClientFactory.cpp:180)
==426== by 0x6911A98: Aws::ShutdownAPI(Aws::SDKOptions const&) (Aws.cpp:119)
==426== by 0x67122E4: AwsSource::~AwsSource() (IArchiveHandler.hpp:97)
==426== by 0x67171AC: void __gnu_cxx::new_allocator<AwsSource>::destroy<AwsSource>(AwsSource*) (new_allocator.h:124)
==426== by 0x6717024: void std::allocator_traits<std::allocator<AwsSource> >::destroy<AwsSource>(std::allocator<AwsSource>&, AwsSo
urce*) (alloc_traits.h:542)
And later on:
==426== Invalid read of size 4
==426== at 0x66F3702: __gnu_cxx::__exchange_and_add(int volatile*, int) (atomicity.h:49)
==426== by 0x66F37AB: __gnu_cxx::__exchange_and_add_dispatch(int*, int) (atomicity.h:82)
==426== by 0x66F4ABD: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:147)
==426== by 0x66F43F5: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() (shared_ptr_base.h:659)
==426== by 0x6911C97: std::__shared_ptr<Aws::Http::HttpClientFactory, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() (shared_ptr_bas
e.h:925)
==426== by 0x694CE6F: std::__shared_ptr<Aws::Http::HttpClientFactory, (__gnu_cxx::_Lock_policy)2>::operator=(std::__shared_ptr<Aws
::Http::HttpClientFactory, (__gnu_cxx::_Lock_policy)2>&&) (shared_ptr_base.h:1000)
==426== by 0x694CADD: std::shared_ptr<Aws::Http::HttpClientFactory>::operator=(std::shared_ptr<Aws::Http::HttpClientFactory>&&) (s
Still later:
==426== Invalid read of size 8
==426== at 0x6975F65: Aws::Utils::Crypto::CleanupCrypto() (Factories.cpp:655)
==426== by 0x6911A9D: Aws::ShutdownAPI(Aws::SDKOptions const&) (Aws.cpp:120)
==426== by 0x67122E4: AwsSource::~AwsSource() (IArchiveHandler.hpp:97)
==426== by 0x67171AC: void __gnu_cxx::new_allocator<AwsSource>::destroy<AwsSource>(AwsSource*) (new_allocator.h:124)
==426== by 0x6717024: void std::allocator_traits<std::allocator<AwsSource> >::destroy<AwsSource>(std::allocator<AwsSource>&, AwsSo
urce*) (alloc_traits.h:542)
==426== by 0x6716E21: std::_Sp_counted_ptr_inplace<AwsSource, std::allocator<AwsSource>, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
(shared_ptr_base.h:531)
Hello! We're facing the same problem on CentOS 7. Any update?
Same issue in 1.9.220 with CentOS7-- gcc 7.3.1 -- believe issue is do to static shared_ptr in HttpClientFactory potentially being cleaned up before other statiic cleanup that can call ShutdownAPI. Probably should avoid statically initialized shared_ptr to avoid these kinds of errors.
Hello all in this issue,
Please note that this use is discouraged and not recommended. Please refer to the documentation how to initialize/shutdown the SDK: https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/basic-use.html
The reason you observe memory issues by following this design idea
static std::unique_ptr<AwsLifetime> aws;
is that the order of static object destruction is not determined by the C++ standard, and the SDK and it's dependencies are also using static global objects. After your main
returns, there will be static variables destruction time and some of SDK's global static variables may be destructed before ShutdownAPI is called, resulting in an undefined behavior.
We know that the usage of static global variables including std:: smart pointers and data classes is considered a bad design and we are considering different ways of improving on this aspect. We've already made some improvements to get rid of std:: static objects being destructed prematurely/automatically, such as https://github.com/aws/aws-sdk-cpp/pull/2268. I suggest to try to update to the latest version of the SDK available (such as 1.11.6). However, we are not there yet to declare SDK usage such as mentioned above as fully supported.
Best regards, Sergey
@SergeyRyabinin - Can you add to the documentation for ShutdownAPI that calls to it from within the destructor of a static is specifically unsupported (maybe that's already there?)
We know that the usage of static global variables including std:: smart pointers and data classes is considered a bad design and we are considering different ways of improving on this aspect. We've already made some improvements to get rid of std:: static objects being destructed prematurely/automatically, such as https://github.com/aws/aws-sdk-cpp/pull/2268. I suggest to try to update to the latest version of the SDK available (such as 1.11.6). However, we are not there yet to declare SDK usage such as mentioned above as fully supported.
It is encouraging that you are moving in that direction and I hope you can get there. I work on Apache Arrow which uses S3. We have our own static state challenges :cold_sweat: and have been moving singletons from static global variables to static local variables that are initialized the first time they are accessed as this gives us a predictable creation (and thus destruction) order.
If this is fixed then we don't have to require users to call Finalize
so it would be a nice feature.
Same problem here trying to call the InitApi
and ShutdownAPI
methods within a Singleton class.
Any updates on this? Does someone know any workaround?
The basic usage of the sdk should look like this, with all of the sdk calls between InitApi
and ShutdownAPI
.
#include <aws/core/Aws.h>
int main(int argc, char** argv)
{
Aws::SDKOptions options;
Aws::InitAPI(options);
{
// make your SDK calls here.
}
Aws::ShutdownAPI(options);
return 0;
}
Please let us know if you run into anymore problems with this sdk.
⚠️COMMENT VISIBILITY WARNING⚠️
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.