aws-sdk-java
S3 upload stream with http throws stream mark and reset error
I am trying to stream a file straight into S3 rather than upload/buffer it on our own server and re-upload it to S3.
When I use http, the AWS client tries to calculate the message digest and fails to reset the stream. Further, I haven't set an explicit read limit, so it defaults to 128 KB, and I am uploading a stream larger than that.
Per the AWS client code, it sets mark() to the request read limit, then reads the whole stream, well past the mark, and tries to reset(). That is obviously going to fail and throw the reset error.
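The failure is reproducible with plain JDK streams, independent of the SDK. A minimal sketch (toy sizes standing in for the SDK's 128 KB default read limit): mark a BufferedInputStream with a small read limit, drain the stream as the signer's hash() does, and reset() throws the same "Resetting to invalid mark" IOException seen in the stack trace below.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class MarkResetDemo {
    /**
     * Marks the stream with a small read limit, reads it fully (as the
     * signer's hash() does), then tries to reset(). Returns the reset
     * failure message, or null if reset succeeded.
     */
    static String resetAfterOverread() {
        byte[] payload = new byte[64];                  // "large" stream
        BufferedInputStream in = new BufferedInputStream(
                new ByteArrayInputStream(payload), 32); // small internal buffer
        in.mark(16);                                    // tiny read limit
        try {
            // Drain the whole stream, far past the 16-byte mark limit.
            while (in.read(new byte[8]) > -1) { /* hashing would happen here */ }
            in.reset();                                 // mark is invalid now
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(resetAfterOverread());       // Resetting to invalid mark
    }
}
```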
Note: when I am using https this won't happen, as payload signing is disabled by default, but you will face the same problem with https if you enable signing.
AWS4Signer.java
protected String calculateContentHash(SignableRequest<?> request) {
    InputStream payloadStream = getBinaryRequestPayloadStream(request);
    ReadLimitInfo info = request.getReadLimitInfo();
    payloadStream.mark(info == null ? -1 : info.getReadLimit());
    String contentSha256 = BinaryUtils.toHex(hash(payloadStream));
    try {
        payloadStream.reset();
    } catch (IOException e) {
        throw new SdkClientException(
                "Unable to reset stream after calculating AWS4 signature",
                e);
    }
    return contentSha256;
}
AbstractAWSSigner.java
protected byte[] hash(InputStream input) throws SdkClientException {
    try {
        MessageDigest md = getMessageDigestInstance();
        @SuppressWarnings("resource")
        DigestInputStream digestInputStream = new SdkDigestInputStream(input, md);
        byte[] buffer = new byte[1024];
        while (digestInputStream.read(buffer) > -1)
            ;
        return digestInputStream.getMessageDigest().digest();
    } catch (Exception e) {
        throw new SdkClientException(
                "Unable to compute hash while signing request: "
                + e.getMessage(), e);
    }
}
Exception thrown:
Caused by: com.amazonaws.SdkClientException: Unable to reset stream after calculating AWS4 signature
at com.amazonaws.auth.AWS4Signer.calculateContentHash(AWS4Signer.java:562)
at com.amazonaws.services.s3.internal.AWSS3V4Signer.calculateContentHash(AWSS3V4Signer.java:118)
at com.amazonaws.auth.AWS4Signer.sign(AWS4Signer.java:233)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1210)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1749)
at com.platform.common.services.S3BinaryUploadService.uploadBinaryToUploadBucket(S3BinaryUploadService.java:61)
... 84 common frames omitted
Caused by: java.io.IOException: Resetting to invalid mark
at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
at com.amazonaws.util.LengthCheckInputStream.reset(LengthCheckInputStream.java:126)
at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
at com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream.reset(MD5DigestCalculatingInputStream.java:105)
at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
at com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168)
at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
at com.amazonaws.auth.AWS4Signer.calculateContentHash(AWS4Signer.java:560)
... 98 common frames omitted
This is a known issue and a current limitation of the SDK. There are similar posts with workarounds. Please refer to them and see if they work for you. https://github.com/aws/aws-sdk-java/issues/427#issuecomment-273550783 https://github.com/aws/aws-sdk-java/issues/474
Thanks a lot for the response. I saw your answer before, but what I am trying to do here is stream a file straight from the user into S3 rather than download/buffer it on our server. Thus, I don't have the file, so option 1 is out for me.
Yes, I can set the read limit beyond the maximum expected file size, but in that case the aws-sdk will read the whole file into memory to do the signing (and fail with the exception), which is what I want to avoid, because this API expects large binaries that can get close to a GB.
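For reference, the read-limit workaround under discussion looks roughly like this. This is a sketch only: the bucket/key names and the MAX_EXPECTED_BYTES bound are made-up assumptions, and over plain http the SDK can still buffer up to the read limit in memory while signing.

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

import java.io.InputStream;

public class ReadLimitWorkaround {
    // Assumption: an upper bound on the payload size known ahead of time,
    // which is exactly what a pure streaming upload does not have.
    private static final int MAX_EXPECTED_BYTES = 1_073_741_824; // ~1 GB

    public static void upload(AmazonS3 s3, InputStream stream, long contentLength) {
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentLength(contentLength); // known length avoids extra buffering

        PutObjectRequest request =
                new PutObjectRequest("my-bucket", "my-key", stream, metadata);
        // Raise the mark/reset read limit past the largest possible payload,
        // so the signer's reset() stays within the marked region.
        request.getRequestClientOptions().setReadLimit(MAX_EXPECTED_BYTES + 1);
        s3.putObject(request);
    }
}
```

As the comment above notes, this only moves the problem: with payload signing enabled, the entire marked region can still end up in memory.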
By the way, I know this can be avoided by using https, but wanted to raise this so it gets fixed in future (at least stop failing by fixing the mark-and-reset issue).
@thisarattr Unfortunately there's no way around this as the SDK needs to consume the full contents of the stream (which in this case requires buffering the stream to memory) to be able to set the checksum as part of the request signature. The easiest way around this would be to switch to using an HTTPS endpoint if possible.
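The suggested switch can be sketched as below. This is an illustrative configuration fragment, not the SDK's recommended recipe: the region is an example value, and `withPayloadSigningEnabled` is the v1 builder option that, to my understanding, controls the payload-signing behavior being discussed.

```java
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class HttpsClientSketch {
    public static AmazonS3 build() {
        // With an https endpoint (the builder default) the S3 v4 signer can use
        // UNSIGNED-PAYLOAD, so it never needs to mark/reset the upload stream.
        return AmazonS3ClientBuilder.standard()
                .withRegion(Regions.US_EAST_1)      // example region
                // Leave payload signing off (the default over https); turning it
                // on re-introduces the buffering/reset behavior described here.
                .withPayloadSigningEnabled(false)
                .build();
    }
}
```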
> By the way, I know this can be avoided by using https, but wanted to raise this so it gets fixed in future (at least stop failing by fixing the mark-and-reset issue).
It sounds like this is a feature request so I'll mark it as such for now, but I'm not sure how we'll be able to avoid this.
@dagnir I agree that, when it uses http, there is no way to calculate the hash/checksum without buffering in memory. But still, it should not fail by throwing a mark-and-reset exception, right?
Hashing is the client library's responsibility; the API consumer does not need to know about it. It should throw a meaningful error message instead of a mark-and-reset exception, which does not mean much to the consumer without looking at the client library code.
Okay I see; we can certainly throw/log a more descriptive error message.
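One illustrative shape such a message could take: translate the low-level reset failure into an error that names the read limit and points at the remedies. Everything below is hypothetical (not the SDK's actual code), sketched with a stand-in exception class so it stays self-contained.

```java
import java.io.IOException;

public class DescriptiveSigningError {
    /** Stand-in for com.amazonaws.SdkClientException, to keep this self-contained. */
    static class SigningException extends RuntimeException {
        SigningException(String message, Throwable cause) { super(message, cause); }
    }

    /**
     * Wraps the opaque mark/reset failure in a message that tells the caller
     * what actually went wrong and how to address it.
     */
    static SigningException describeResetFailure(IOException cause, int readLimit) {
        return new SigningException(
                "Unable to reset the request stream after computing the AWS4 "
                + "signature: the payload is larger than the configured read "
                + "limit (" + readLimit + " bytes). Increase the read limit, "
                + "provide a mark/reset-capable stream, or use an https "
                + "endpoint so the payload is not signed.", cause);
    }

    public static void main(String[] args) {
        IOException lowLevel = new IOException("Resetting to invalid mark");
        System.out.println(describeResetFailure(lowLevel, 131072).getMessage());
    }
}
```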
Could we actually have a specific subclass of SdkClientException for these retryable signing/hashing problems? The Hadoop S3A client already splits failures into those which may be recoverable (no response, throttle errors, socket timeouts, etc.) and then decides which to retry.
We are closing stale v1 issues before going into Maintenance Mode.
If this issue is still relevant in v2 please open a new issue in the v2 repo.
Reference:
- Announcing end-of-support for AWS SDK for Java v1.x effective December 31, 2025 - blog post
This issue is now closed.
Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.
FYI as HADOOP-19221 shows, v2 SDK actually makes things worse in terms of s3 upload recoverability.