aws-sdk-java-v2

Initializing GeneratedServiceMetadataProvider takes a while

Open awmcc90 opened this issue 2 years ago • 6 comments

Describe the feature

When creating a service client (e.g. DynamoDbAsyncClient), there is an expensive one-time call to create GeneratedServiceMetadataProvider, which pays the cost of a heavy static initializer. That initializer builds a map with 299 entries (currently), each of which instantiates a class extending ServiceMetadata, and each of those classes has large static initialization overhead of its own.

From the looks of it, every single ServiceMetadata instance is initialized, but in my use case only DynamodbServiceMetadata was needed, and it was needed only to acquire the client endpoint.
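
To illustrate the pattern described above, here is a minimal sketch. It is not the SDK's generated code: FakeMetadata and EagerProvider are hypothetical stand-ins, and the three prefixes are placeholders for the 299 real entries. It only shows that a single lookup forces every entry in an eagerly populated static map to be constructed.

```java
import java.util.HashMap;
import java.util.Map;

public class EagerInitSketch {
    // Counts how many metadata objects have been constructed so far.
    static int constructed = 0;

    // Hypothetical stand-in for a generated ServiceMetadata subclass.
    static final class FakeMetadata {
        final String endpointPrefix;

        FakeMetadata(String endpointPrefix) {
            this.endpointPrefix = endpointPrefix;
            constructed++;
        }
    }

    // Hypothetical stand-in for a provider with an eagerly populated static map.
    static final class EagerProvider {
        private static final Map<String, FakeMetadata> SERVICE_METADATA = new HashMap<>();

        static {
            // Every entry is constructed as soon as this class is initialized.
            for (String prefix : new String[] {"a4b", "dynamodb", "xray"}) {
                SERVICE_METADATA.put(prefix, new FakeMetadata(prefix));
            }
        }

        static FakeMetadata serviceMetadata(String endpointPrefix) {
            return SERVICE_METADATA.get(endpointPrefix);
        }
    }

    public static void main(String[] args) {
        // EagerProvider has not been touched yet, so nothing is constructed.
        System.out.println(constructed); // prints 0
        FakeMetadata metadata = EagerProvider.serviceMetadata("dynamodb");
        // One lookup initialized the class, which constructed all three entries.
        System.out.println(metadata.endpointPrefix + " " + constructed); // prints "dynamodb 3"
    }
}
```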

I profiled just the cold start initialization of GeneratedServiceMetadataProvider to confirm my suspicions and get a general idea of the overhead incurred. On my machine the operation took between 350 and 500 ms. I think this could be improved significantly by implementing lazy loading of ServiceMetadata objects.

Benchmark

@State(Scope.Benchmark)
@Warmup(iterations = 0)
@Measurement(iterations = 1)
@Fork(20)
public class GeneratedProviderBenchmark {

    @Benchmark
    @BenchmarkMode(Mode.SingleShotTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void createObject() {
        GeneratedServiceMetadataProvider generated = new GeneratedServiceMetadataProvider();
        ServiceMetadata metadata = generated.serviceMetadata("xray");
    }

    public static void main(String... args) throws RunnerException, CommandLineOptionException {
        Options opt = new OptionsBuilder()
            .parent(new CommandLineOptions())
            .include(GeneratedProviderBenchmark.class.getSimpleName())
            .addProfiler(StackProfiler.class)
            .build();
        Collection<RunResult> run = new Runner(opt).run();
    }
}

Results:

Benchmark                                Mode  Cnt    Score    Error  Units
GeneratedProviderBenchmark.createObject    ss   20  364.140 ± 16.565  ms/op

Use Case

The goal is to reduce overhead as much as possible and make cold starts as fast as they can be. This is an ongoing effort that a number of other tickets relate to. In fact, the ticket that led to the creation of the GeneratedServiceMetadataProvider class was itself an effort to reduce cold start times of the DynamoDbClient.

Proposed Solution

Lazily initialize the ServiceMetadata instances held by GeneratedServiceMetadataProvider using a static factory method. I created a proof of concept to test a possible improvement to the object initialization.

public final class GeneratedServiceMetadataProvider implements ServiceMetadataProvider {
    private static final Map<String, ServiceMetadata> SERVICE_METADATA = new HashMap<>();

    private static ServiceMetadata createServiceMetadata(String endpointPrefix) {
        switch (endpointPrefix) {
            case "a4b":
                return new A4bServiceMetadata();
            ... // other cases      
            default:
                throw new IllegalStateException("Unexpected value: " + endpointPrefix);
        }
    }

    private static ServiceMetadata getServiceMetadata(String endpointPrefix) {
        if (!SERVICE_METADATA.containsKey(endpointPrefix)) {
            SERVICE_METADATA.put(endpointPrefix, createServiceMetadata(endpointPrefix));
        }
        return SERVICE_METADATA.get(endpointPrefix);
    }

    @Override
    public ServiceMetadata serviceMetadata(String endpointPrefix) {
        return getServiceMetadata(endpointPrefix);
    }
}

This produced the following results using the same benchmark as above:

Benchmark                                Mode  Cnt    Score    Error  Units
GeneratedProviderBenchmark.createObject    ss   20   14.140 ±  0.465  ms/op

That is about a 25x improvement in object creation speed, not to mention the memory saved by deferring initialization of all the other ServiceMetadata instances that aren't needed.
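
One caveat about the proof of concept: it uses a plain HashMap with a check-then-put sequence, which is not safe if clients are built from several threads at once. Below is a minimal sketch of a thread-safe variant built on ConcurrentHashMap.computeIfAbsent. The Metadata interface, the create method, and the CREATED counter are hypothetical stand-ins for illustration, not SDK types, and this variant was not benchmarked.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyMetadataSketch {
    // Hypothetical stand-in for the SDK's ServiceMetadata type.
    public interface Metadata {
        String prefix();
    }

    // Counts factory invocations so the caching behavior can be observed.
    public static final AtomicInteger CREATED = new AtomicInteger();

    private static final Map<String, Metadata> CACHE = new ConcurrentHashMap<>();

    // Factory mirroring the switch in the proof of concept (placeholder cases only).
    private static Metadata create(String endpointPrefix) {
        switch (endpointPrefix) {
            case "dynamodb":
            case "xray":
                CREATED.incrementAndGet();
                return () -> endpointPrefix;
            default:
                throw new IllegalStateException("Unexpected value: " + endpointPrefix);
        }
    }

    // computeIfAbsent runs the factory at most once per key, atomically.
    public static Metadata get(String endpointPrefix) {
        return CACHE.computeIfAbsent(endpointPrefix, LazyMetadataSketch::create);
    }

    public static void main(String[] args) {
        Metadata first = get("dynamodb");
        Metadata second = get("dynamodb");
        System.out.println(first == second); // prints true
        System.out.println(CREATED.get());   // prints 1
    }
}
```

Only the requested entry is ever constructed, and repeated lookups return the cached instance. If the factory throws for an unknown prefix, computeIfAbsent propagates the exception and records no mapping.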

Other Information

As an aside, I was surprised that an EndpointProvider supplied to the client builder (one which doesn't go through GeneratedServiceMetadataProvider) isn't used at all when setting the client's endpoint on creation. Why wouldn't the client builder use the endpoint provider, if present, given that it's going to use the provider for every request anyway?

Related issues: https://github.com/aws/aws-sdk-java-v2/issues/748 https://github.com/aws/aws-sdk-java-v2/issues/6

Acknowledgements

  • [X] I may be able to implement this feature request
  • [ ] This feature might incur a breaking change

AWS Java SDK version used

2

JDK version used

1.8

Operating System and version

macOS Monterey 12.6.1

awmcc90 avatar Jan 03 '23 02:01 awmcc90

Hi @awmcc90, thank you for the thorough report! We recently changed to use EndpointProvider for endpoint resolution in place of generated service metadata classes, and I think it makes sense to lazily initialize it. We will take a look at your PR shortly. Thanks again for the detailed information.

As to the endpoint provider issue, it seems like a bug. Could you create a new issue for it?

zoewangg avatar Jan 13 '23 01:01 zoewangg

Thanks for your reply. I'm not sure I'm the right person to make the ticket about the EndpointProvider because I'm not sure how it's supposed to work. The only uses of EndpointProvider that I can find are in various ExecutionInterceptor classes. Are you saying that GeneratedServiceMetadataProvider should never be called given there is always a default EndpointProvider?

Also, while looking into this I noticed that some of the codegen results in dead code. One example of that is this line in AsyncClientClass. That line needs to be inside the if statement on line 293 or else a variable is declared that is never used. In DefaultDynamoDbAsyncClient there are 38 instances of this unused variable declared. Should I create a ticket for that too?

awmcc90 avatar Jan 13 '23 14:01 awmcc90

> Are you saying that GeneratedServiceMetadataProvider should never be called given there is always a default EndpointProvider?

Yeah, we are planning to remove the usage of GeneratedServiceMetadataProvider from the request path.

> That line needs to be inside the if statement on line 293 or else a variable is declared that is never used. In DefaultDynamoDbAsyncClient there are 38 instances of this unused variable declared. Should I create a ticket for that too?

Good catch, could you create a ticket for it?

zoewangg avatar Jan 17 '23 23:01 zoewangg

Beyond being slow, this also allocates ~10MB based on a test with async-profiler. If none of this data is ever retrieved, then this is very wasteful, creating a spike of memory allocation.

MikeDombo avatar Apr 24 '23 23:04 MikeDombo

Any update on this? I made a PR but that's way out of sync at this point.

awmcc90 avatar Aug 10 '23 13:08 awmcc90

Second this. Please fix the slow as hell startup of AWS Java SDKs

RaffaeleSgarro avatar Aug 16 '25 21:08 RaffaeleSgarro