aws-sdk-java
Not all S3 requests in the SDK provide an option for setting requester pays
Some S3 requests in the SDK already support setting the requester-pays option, for example:
- DeleteObjectsRequest
- InitiateMultipartUploadRequest
- GetObjectMetadataRequest
- CopyObjectRequest
However, there are some requests from which this option is missing:
- GetBucketLocationRequest
- ListObjectsRequest
We are seeing 403 errors for these requests. As a workaround, we are setting the requester-pays option via a custom header, something like:
request.putCustomRequestHeader("x-amz-request-payer", "requester");
We are using aws-sdk version 1.11.158.
Thanks for the report. According to the model, these are the requests missing this option: ListObjectsRequest and ListObjectsV2Request.
It doesn't seem like GetBucketLocation supports requester-pays according to the model file that S3 publishes to us. Have you verified that it is working for these requests? http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGETlocation.html
If so, I can contact the service to update their documentation.
Edit: I've attempted to use requester-pays for GetBucketLocation, and it didn't seem to work. I'll move forward and make the change for ListObjectsRequest, though! Let me know if I missed something.
A change has been made to add a withRequesterPays(true) option to ListObjectsRequest and ListObjectsV2Request. It will go out with the next release. Let me know if there's anything else I can help with!
@millems: Apologies for the delay and thanks for your response. ListObjectsRequest is working after upgrading the SDK to version 1.11.160.
However, I am seeing issues with DeleteObjectRequest. This request provides the option of setting requester pays, but the header is not being sent to S3. This is how I am using it:
DeleteObjectRequest request = new DeleteObjectRequest(bucket, key);
request.setRequesterPays(true);
s3.deleteObject(request);
I see the following in the debug logs:
17/07/07 07:17:27 DEBUG amazonaws.request: Sending Request: DELETE http://sourabhg-test.s3.amazonaws.com /myFile.txt Headers: (User-Agent: aws-sdk-java/1.11.160 Linux/4.9.27-14.31.amzn1.x86_64 Java_HotSpot(TM)_64-Bit_Server_VM/24.65-b04/1.7.0_67, amz-sdk-invocation-id: a9534745-9dc2-7b12-4115-d63990f7b691, Content-Type: application/octet-stream, )
Note that the requester-pays header is missing from the request headers. Probably the populateRequesterPaysHeader() call is missing from deleteObject in AmazonS3Client.java.
Let me know if I am missing something.
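To make this check repeatable instead of eyeballing the log, here is a rough sketch that parses the "Headers: (...)" part of a debug line shaped like the one above and reports whether x-amz-request-payer was sent. The class and method names are hypothetical and not part of the SDK, and the parsing is a heuristic that assumes header values contain no ", " sequence.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of the SDK) for inspecting "Sending Request" debug lines. */
final class RequestLogHeaders {
    static final String REQUESTER_PAYS_HEADER = "x-amz-request-payer";

    /** Returns lower-cased header names mapped to values, or an empty map if no headers section is found. */
    static Map<String, String> parse(String logLine) {
        Map<String, String> headers = new LinkedHashMap<>();
        String marker = "Headers: (";
        int start = logLine.indexOf(marker);
        int end = logLine.lastIndexOf(')');
        if (start < 0 || end < start + marker.length()) {
            return headers;
        }
        String body = logLine.substring(start + marker.length(), end);
        // Heuristic: entries are separated by ", " and each entry is "Name: value".
        for (String entry : body.split(", ")) {
            int colon = entry.indexOf(':');
            if (colon <= 0) {
                continue; // skip blank or malformed pieces
            }
            String name = entry.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            headers.put(name, entry.substring(colon + 1).trim());
        }
        return headers;
    }

    /** True if the logged request carried the requester-pays header. */
    static boolean hasRequesterPays(String logLine) {
        return parse(logLine).containsKey(REQUESTER_PAYS_HEADER);
    }
}
```

Run against the DELETE line above, hasRequesterPays returns false, which matches the observation that the header was never added.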
@millems: Did you get a chance to look at my previous comment?
@sourabh912: Not until now :). I'm sorry we're causing you so much trouble with our requester-pays implementation. We're auto-generating the S3 SDK in the next revision of the SDK. Once it's production-ready we won't be able to miss these things.
I'll look into the DeleteObject issue and the other APIs that support requester-pays to make sure everything is working as it should be.
While ListObjectsRequest now includes requester-pays support, ListNextBatchOfObjectsRequest does not. It looks simple enough to recreate with support, but it would be easier if ObjectListing carried this for us from the original request.
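One way to "recreate" the next-batch call is to page manually: build a fresh list request for every page from the previous page's marker and set the requester-pays flag on each one. The sketch below is an SDK-free model of that loop; Page and PageFetcher are hypothetical stand-ins for ObjectListing and the S3 list call. With the real SDK, the fetcher body would presumably build a ListObjectsRequest with withMarker(...) and withRequesterPays(true), but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** SDK-free model of marker-based paging that re-applies requester pays on every page. */
final class RequesterPaysPaging {

    /** Hypothetical stand-in for one page of a listing result. */
    static final class Page {
        final List<String> keys;
        final String nextMarker; // null when there are no more pages

        Page(List<String> keys, String nextMarker) {
            this.keys = keys;
            this.nextMarker = nextMarker;
        }
    }

    /** Hypothetical stand-in for the list call: the marker is null for the first page. */
    interface PageFetcher {
        Page fetch(String marker, boolean requesterPays);
    }

    /** Collects keys from all pages, passing the same requester-pays flag to every fetch. */
    static List<String> listAll(PageFetcher fetcher, boolean requesterPays) {
        List<String> all = new ArrayList<>();
        String marker = null;
        do {
            Page page = fetcher.fetch(marker, requesterPays);
            all.addAll(page.keys);
            marker = page.nextMarker;
        } while (marker != null);
        return all;
    }
}
```

The point of the design is that the flag lives in the loop rather than in the listing result, so it cannot be lost between pages.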
Sorry that this was ignored for so long. We're super busy with 2.x. Could you submit a pull request for this? It sounds like a small change that we should be able to pull in.
This is what solved it in my case, where I hit this error with S3A and Spark (I believe it can be reproduced in other environments). The key is the value of the fs.s3a.s3.client.factory.impl property. By default, this value is set to DefaultS3ClientFactory.
So what's wrong with this factory? Well, it doesn't include any requester-pays related header, as seen in its source code. The solution is to implement a custom client factory that extends the default one and adds the request-payer header to awsConf.
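import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import org.apache.hadoop.fs.s3a.DefaultS3ClientFactory;

public class RequestPayerS3ClientFactory extends DefaultS3ClientFactory {
    @Override
    protected AmazonS3 newAmazonS3Client(AWSCredentialsProvider credentials, ClientConfiguration awsConf) {
        // Send the requester-pays header on every request made by this client.
        awsConf.addHeader("x-amz-request-payer", "requester");
        return new AmazonS3Client(credentials, awsConf);
    }
}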
This is just a simplification, as you could also check whether a setting is true before adding the header. This factory assumes all requests will be paid for by the requester if needed.
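A minimal sketch of that conditional check, kept free of Hadoop and SDK types so it can be run on its own. The property key example.requester.pays.enabled is made up for illustration and is not a real Hadoop or SDK setting, and the plain Map stands in for whatever configuration object is available.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that decides which extra headers a client should send. */
final class RequesterPaysHeaders {
    static final String HEADER = "x-amz-request-payer";
    static final String VALUE = "requester";
    /** Made-up property key used only for this sketch. */
    static final String ENABLED_KEY = "example.requester.pays.enabled";

    /** Returns the requester-pays header only when the setting is present and "true" (case-insensitive). */
    static Map<String, String> clientHeaders(Map<String, String> settings) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Boolean.parseBoolean(null) is false, so a missing key leaves the header out.
        if (Boolean.parseBoolean(settings.get(ENABLED_KEY))) {
            headers.put(HEADER, VALUE);
        }
        return headers;
    }
}
```

In the factory above, the same condition would guard the awsConf.addHeader call, so buckets that are not requester-pays keep the default behavior.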
Once you compile the class and add it to your classpath, set the new hadoopConfiguration:
spark.sparkContext.hadoopConfiguration.set("fs.s3a.s3.client.factory.impl", "your.package.RequestPayerS3ClientFactory")
This way, S3 requests go through the overridden newAmazonS3Client method, which now adds the x-amz-request-payer header.
This is a very old issue that is probably not getting as much attention as it deserves. We encourage you to check if this is still an issue in the latest release and if you find that this is still a problem, please feel free to provide a comment or open a new issue.