
Direct memory leak when getting object metadata inside parallelStream with Java 17

Open winzsanchez opened this issue 2 years ago • 2 comments

Describe the bug

When client.getObjectMetadata(bucket, key) is called inside a parallelStream() on Java 17, direct memory usage goes up, provided the bucket contains multiple objects.

This does not happen with Java 11 or when using stream().
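For context on why the two variants can behave differently: a sequential stream runs its forEach body on the calling thread, while a parallel stream may spread it across the caller and the ForkJoinPool common-pool workers. The sketch below only illustrates that difference; it does not establish the cause of the memory growth. The class and method names (StreamThreads, threadsUsed, sampleItems) are made up for this illustration and are not part of the reproducer.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class StreamThreads {

    /** Builds a list of n integers to iterate over (hypothetical helper). */
    static List<Integer> sampleItems(int n) {
        return IntStream.range(0, n).boxed().collect(Collectors.toList());
    }

    /** Records the name of every thread that executes the forEach body. */
    static Set<String> threadsUsed(List<Integer> items, boolean parallel) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        Stream<Integer> stream = parallel ? items.parallelStream() : items.stream();
        stream.forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        List<Integer> items = sampleItems(10_000);
        // Sequential: only the calling thread appears.
        System.out.println("stream():         " + threadsUsed(items, false));
        // Parallel: usually the calling thread plus common-pool workers,
        // depending on the number of available processors.
        System.out.println("parallelStream(): " + threadsUsed(items, true));
    }
}
```

In the reproducer this means the getObjectMetadata calls in the parallel step are likely issued from several threads rather than one, which is the only variable that changes between the serial and parallel runs.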

Expected Behavior

Direct memory shouldn't be going up.

Current Behavior

When using Java 17 we see the following (values are in bytes, as returned by BufferPoolMXBean.getMemoryUsed()):

[MEM] Initial memory used: 16384
[MEM] Memory used after serial stream: 16384
[MEM] Memory used after parallel stream: 57344
[MEM] Memory used after serial stream: 57344

With Java 11:

[MEM] Initial memory used: 8192
[MEM] Memory used after serial stream: 8192
[MEM] Memory used after parallel stream: 8192
[MEM] Memory used after serial stream: 8192

Reproduction Steps

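    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.util.function.Consumer;

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    // MXBean for the JVM's "direct" buffer pool; assigned in the static block below
    protected static BufferPoolMXBean directMemoryMXBean;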

    static {
        for (final BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (pool.getName().equals("direct")) {
                directMemoryMXBean = pool;
            }
        }
    }

    public static void listObjects(AmazonS3 client, String bucketName, String prefix) throws InterruptedException {
        ObjectListing listing = client.listObjects(bucketName, prefix);
        var objectSummaries = listing.getObjectSummaries();

        // Follow pagination until every object summary has been collected
        while (listing.isTruncated()) {
            listing = client.listNextBatchOfObjects(listing);
            objectSummaries.addAll(listing.getObjectSummaries());
        }
        System.out.println("objectSummaries: " + objectSummaries);

        System.out.println("[MEM] Initial memory used: " + directMemoryMXBean.getMemoryUsed());
        objectSummaries.forEach(printMetadata(client, bucketName));
        System.out.println("[MEM] Memory used after serial stream: " + directMemoryMXBean.getMemoryUsed());

        objectSummaries.parallelStream().forEach(printMetadata(client, bucketName));
        System.out.println("[MEM] Memory used after parallel stream: " + directMemoryMXBean.getMemoryUsed());

        objectSummaries.forEach(printMetadata(client, bucketName));
        System.out.println("[MEM] Memory used after serial stream: " + directMemoryMXBean.getMemoryUsed());
    }

    private static Consumer<S3ObjectSummary> printMetadata(AmazonS3 client, String bucketName) {
        return s -> System.out.println(s + " metadata: " + getMetadata(client, bucketName, s.getKey()));
    }

    public static ObjectMetadata getMetadata(AmazonS3 client, String bucketName, String name) {
        return client.getObjectMetadata(bucketName, name);
    }
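To show what the [MEM] figures measure, here is a minimal, SDK-free sketch of the same technique: it looks up the "direct" BufferPoolMXBean and checks how getMemoryUsed() changes after one direct buffer is allocated. DirectPoolProbe and its methods are hypothetical names used only for this illustration, and the exact growth may differ if other threads allocate or release direct buffers between the two readings.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectPoolProbe {

    /** Returns the JVM's "direct" buffer pool bean, or throws if it is absent. */
    static BufferPoolMXBean directPool() {
        return ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class).stream()
                .filter(pool -> "direct".equals(pool.getName()))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("no direct buffer pool"));
    }

    /** Allocates one direct buffer and returns how much getMemoryUsed() grew. */
    static long growthAfterAllocating(int bytes) {
        BufferPoolMXBean pool = directPool();
        long before = pool.getMemoryUsed();
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes);
        long after = pool.getMemoryUsed();
        // Touch the buffer after the second reading so it stays reachable
        // while both measurements are taken.
        buffer.put(0, (byte) 1);
        return after - before;
    }

    public static void main(String[] args) {
        System.out.println("[MEM] growth after allocateDirect(8192): "
                + growthAfterAllocating(8192));
    }
}
```

The pool counts every direct ByteBuffer that is still allocated in the JVM, whoever allocated it, so a higher reading after the parallel step means more direct buffers were still held at that moment.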

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

1.12.606

JDK version used

java version "17.0.9" 2023-10-17 LTS Java(TM) SE Runtime Environment (build 17.0.9+11-LTS-201) Java HotSpot(TM) 64-Bit Server VM (build 17.0.9+11-LTS-201, mixed mode, sharing)

Operating System and version

Windows 10

winzsanchez, Dec 07 '23 18:12

@winzsanchez can you generate the SDK client-side metrics? We do generate metrics about used memory, it would be interesting to see a comparison of the cases side by side.

For instructions on how to generate the client-side metrics please check our Developer Guide. Also check our blog post that shows how to interpret the metrics: Tuning the AWS SDK for Java to Improve Resiliency - this is mostly about timeouts and retries, not so much on memory usage, but it's informative.

debora-ito, Dec 13 '23 23:12

JvmMetric-2023_12_18_10_45_00-2023_12_18_11_20_00-UTC-5.csv


Hi @debora-ito, attached are the memory metrics. The samples from 10:45 to 10:55 are with Java 11; those from 11:05 to 11:20 are with Java 17.

winzsanchez, Dec 18 '23 16:12

@winzsanchez

We didn't have the chance to troubleshoot this further. We are closing old v1 issues before going into Maintenance Mode, so I recommend you check if this issue still persists in v2 and open a new issue in the v2 repo.

Reference:

  • Announcing end-of-support for AWS SDK for Java v1.x effective December 31, 2025 - blog post

debora-ito, Jul 29 '24 23:07

This issue is now closed.

Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.

github-actions[bot], Jul 29 '24 23:07