aws-sdk-java-v2

In S3 library `ResponseInputStream<?>` doesn't seem to support the `InputStream` `int read(byte[] buffer)` method correctly

Open esteeele opened this issue 1 year ago • 1 comments

Describe the bug

When requesting a ResponseInputStream<GetObjectResponse> and then calling read(bytes) with a byte array of a defined size, too few bytes are read into the array. Comparing the SDKs side by side shows that the V1 SDK fills the entire byte array, whereas the V2 SDK fills only part of it.

Expected Behavior

I expect the two SDKs to behave the same, i.e. the stream returned by S3Object.getObjectContent() in V1 should behave the same as the ResponseInputStream returned in V2.

Current Behavior

Using the code below, I get fewer bytes returned than requested (see the small program under Reproduction Steps):

1388 for AWS version 2
10000 for AWS version 1

If I instead call

inputStream.read(bytes, 0, bytes.length);

it works as expected in both the old and the new SDK.

Reproduction Steps

  public void compareOldAndNewDownloads() {
    AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard().withRegion("eu-west-1").build();

    String dummyFile = "/anyFile";
    Path logFile = Path.of(dummyFile);

    String key = "delete-me-" + UUID.randomUUID();
    try (S3Client s3Client = lowHttpPoolClient()) {
      try {
        s3Client.putObject(PutObjectRequest.builder().bucket(destBucket).key(key).build(),
            RequestBody.fromFile(logFile));
      } catch (S3Exception s3Exception) {
        System.out.println("*** V2 SDK ***");

        System.out.println(s3Exception.getMessage());
        System.out.println(s3Exception.awsErrorDetails().errorMessage());
      }

      ResponseInputStream<GetObjectResponse> is = s3Client.getObject(GetObjectRequest.builder().bucket(destBucket).key(key).build());
      loadBytes(is, true);
    }

    InputStream is = amazonS3.getObject(destBucket, key).getObjectContent();
    loadBytes(is, false);
  }

  private void loadBytes(InputStream is, boolean newAws) {
    byte[] bytes = new byte[10_000];
    int result;
    try {
      result = is.read(bytes);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    System.out.println(result + " for AWS version " + (newAws ? 2 : 1));
  }

  private static S3Client lowHttpPoolClient() {
    SdkHttpClient apacheHttpClient = ApacheHttpClient.builder()
        .maxConnections(5)
        .build();
    return S3Client.builder()
        .httpClient(apacheHttpClient)
        .defaultsMode(DefaultsMode.IN_REGION)
        .region(Region.EU_WEST_1)
        .overrideConfiguration(ClientOverrideConfiguration.builder()
            .build())
        .build();
  }

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

s3:2.25.11

JDK version used

openjdk version "21.0.2" 2024-01-16 LTS

Operating System and version

macOS 14.5 (23F79)

esteeele avatar Jul 09 '24 14:07 esteeele

Bad news: the Java InputStream documentation says that any number of bytes smaller than the requested number may be returned:

  An attempt is made to read as many as
  len bytes, but a smaller number may be read.
  The number of bytes actually read is returned as an integer.

Reading from any input stream correctly requires you to loop until the requested number of bytes has been read or a -1 comes back.
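
A minimal sketch of that loop, for reference. The class name ReadFullyExample and the helpers readFully and shortReads are hypothetical names made up for this illustration; they are not part of either SDK. shortReads only simulates a stream that hands back at most a fixed number of bytes per call, which is the kind of behaviour reported above for the V2 stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyExample {

    // Hypothetical helper: keeps reading until the buffer is full or the
    // stream reports end-of-stream (-1). Returns the number of bytes read.
    static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n == -1) {
                break; // end of stream before the buffer was filled
            }
            total += n;
        }
        return total;
    }

    // Test double (an assumption for this sketch): wraps an in-memory stream
    // and returns at most `chunk` bytes per read call.
    static InputStream shortReads(byte[] data, int chunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return in.read(b, off, Math.min(len, chunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[25_000];
        byte[] buffer = new byte[10_000];

        // A single read call may legally stop short of the buffer size.
        System.out.println("single read: " + shortReads(data, 1388).read(buffer));

        // Looping fills the buffer, provided the stream holds enough bytes.
        System.out.println("looped read: " + readFully(shortReads(data, 1388), buffer));
    }
}
```

Because the helper takes a plain InputStream, the same loop should apply to the V1 object content stream and to the V2 ResponseInputStream alike.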

steveloughran avatar Aug 12 '24 15:08 steveloughran

Closing this, as the InputStream interface makes no guarantees about the number of bytes returned by implementations, but it is worth keeping as a reference in case anyone else encounters this when moving from V1 to V2.
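
For anyone landing here during a migration: on Java 9 and later, InputStream.readNBytes(byte[], int, int) is an alternative to a hand-written loop, since it is specified to block until the requested number of bytes has been read or end of stream is reached. A small sketch follows; the class name ReadNBytesExample and the helpers fill and shortReads are hypothetical and exist only for this illustration, with shortReads simulating a stream that returns partial reads.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadNBytesExample {

    // Hypothetical helper: fills the buffer via the JDK's readNBytes and
    // returns how many bytes were actually read.
    static int fill(InputStream in, byte[] buffer) throws IOException {
        return in.readNBytes(buffer, 0, buffer.length);
    }

    // Test double (an assumption for this sketch): at most `chunk` bytes
    // are returned per read call.
    static InputStream shortReads(byte[] data, int chunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return in.read(b, off, Math.min(len, chunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[10_000];
        int read = fill(shortReads(new byte[25_000], 1388), buffer);
        System.out.println("readNBytes filled: " + read);
    }
}
```

A return value smaller than the buffer length then means the stream ended early, not that a single read happened to come back short.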

esteeele avatar Dec 09 '24 09:12 esteeele

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.

github-actions[bot] avatar Dec 09 '24 09:12 github-actions[bot]
