aws-sdk-cpp icon indicating copy to clipboard operation
aws-sdk-cpp copied to clipboard

Undefined behavior in ClientWithAsyncTemplateMethods constructor

Open pitrou opened this issue 6 months ago • 3 comments

Describe the bug

ClientWithAsyncTemplateMethods has the following constructor: https://github.com/aws/aws-sdk-cpp/blob/08f5335f43eabe4a3d7fb5454c655e7b86630b4d/src/aws-cpp-sdk-core/include/aws/core/client/AWSClientAsyncCRTP.h#L47-L56

The static_cast<AwsServiceClientT*>(this) is meant to cast the base this pointer to the corresponding derived this pointer (which can have a different address when multiple inheritance is involved).

The problem, though, is that it happens in the base class constructor, so the derived object is not yet constructed yet. And, therefore, the static_cast to the derived this is undefined behavior.

This is flagged by some of the UBSan builds in Apache Arrow: https://github.com/apache/arrow/issues/46863

Start test: S3FileSystem
/var/folders/x7/ch5v91h56_zbvbd1y2f600dm0000gn/T/RtmpjRMkwO/file8951246f6d2e/_deps/aws-sdk-cpp-src/src/aws-cpp-sdk-core/include/aws/core/client/AWSClientAsyncCRTP.h:51:40: runtime error: downcast of address 0x61b00011cd98 which does not point to an object of type 'Aws::S3::S3Client'
0x61b00011cd98: note: object is of type 'Aws::Client::AWSXMLClient'
 00 00 00 00  b0 eb b9 28 01 00 00 00  75 73 2d 65 61 73 74 2d  31 00 e6 8b 01 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'Aws::Client::AWSXMLClient'
    #0 0x0001245e7770 in Aws::Client::ClientWithAsyncTemplateMethods<Aws::S3::S3Client>::ClientWithAsyncTemplateMethods()+0x1f0 (arrow.so:arm64+0x701f770)
    #1 0x0001245ebe10 in Aws::S3::S3Client::S3Client(std::__1::shared_ptr<Aws::Auth::AWSCredentialsProvider> const&, std::__1::shared_ptr<Aws::Endpoint::EndpointProviderBase<Aws::S3::S3ClientConfiguration, Aws::S3::Endpoint::S3BuiltInParameters, Aws::S3::Endpoint::S3ClientContextParameters>>, Aws::S3::S3ClientConfiguration const&)+0x310 (arrow.so:arm64+0x7023e10)

Regression Issue

  • [ ] Select this option if this issue appears to be a regression.

Expected Behavior

Undefined behavior should not occur when constructing S3Client.

Current Behavior

Undefined behavior is detected when constructing S3Client. See detailed messages above.

Reproduction Steps

The Apache Arrow CI job is too complicated to describe here, sorry.

Possible Solution

A possible solution would be to pass the derived pointer explicitly to the ClientWithAsyncTemplateMethods constructor, so that no static_cast is required.

        ClientWithAsyncTemplateMethods(AwsServiceClientT* derived)
         : m_isInitialized(true),
           m_operationsProcessed(0)
        {
            Aws::Utils::ComponentRegistry::RegisterComponent(AwsServiceClientT::GetServiceName(),
                                                             derived,
                                                             &AwsServiceClientT::ShutdownSdkClient);
        }

Additional Information/Context

No response

AWS CPP SDK version used

1.11.594

Compiler and Version used

Operating System and version

pitrou avatar Jun 24 '25 11:06 pitrou

This sounds like the issue raised to sanitizer https://github.com/google/sanitizers/issues/1773 This may be a false positive warning.

CRTP is a valid design pattern used for compile time polymorphism. From what I understand, static_cast<Derived*>(this)->SomeMethod() works because Compiler generates the function code without needing Derived object. It knows the "blueprint" of Derived class. The actual object creation and function calls happen at runtime.

sbera87 avatar Jun 24 '25 16:06 sbera87

This sounds like the issue raised to sanitizer google/sanitizers#1773 This may be a false positive warning.

Thanks for the link. Yes, it might be a false positive, but C++ is complicated. I don't know about the actual standard, but the relevant item in the C++ reference seems to be item 7e under https://en.cppreference.com/w/cpp/language/static_cast.html . In particular:

If expression is not a null pointer value and does not actually point to a base class subobject of an object of type Derived, the behavior is undefined.

Does the expression point to "a base class subobject of an object of type Derived" here? Well, the object of type Derived is not constructed yet (we're in the base class constructor), so "no" sounds like a valid answer...

pitrou avatar Jun 24 '25 16:06 pitrou

This sounds like the issue raised to sanitizer google/sanitizers#1773 This may be a false positive warning.

Ok, this is an archived repository. Upstream issue should now be https://github.com/llvm/llvm-project/issues/59060

pitrou avatar Jun 25 '25 07:06 pitrou