aws-sdk-ruby
Please make the Endpoint accessible on Aws::S3::Errors::PermanentRedirect
Is your feature request related to a problem? Please describe.
When accessing buckets in bulk, they may be in different regions. A client with region X can access buckets in that region but when accessing a bucket in region Y, an Aws::S3::Errors::PermanentRedirect is raised. The error message returned by the API includes the Endpoint:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.</Message><Endpoint>this.is.not.available.in.Ruby</Endpoint><Bucket>[redacted]</Bucket><RequestId>[redacted]</RequestId><HostId>[redacted]</HostId></Error>
but this is not accessible in Ruby.
Describe the solution you'd like
When rescuing Aws::S3::Errors::PermanentRedirect, I'd like to access the endpoint so we can reconfigure the client for this bucket in subsequent requests. This is what awscli does internally.
Describe alternatives you've considered
We are currently rescuing Aws::S3::Errors::PermanentRedirect, followed by a call to get_bucket_location so we can determine which region is the right one. This requires an additional IAM permission which would not be necessary if the SDK returned all information from the API.
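A minimal sketch of the lookup half of that workaround. The helper names are hypothetical, and `client` is assumed to be an `Aws::S3::Client` whose credentials carry `s3:GetBucketLocation`. The one subtlety is that `get_bucket_location` does not always return a region name: us-east-1 comes back as an empty constraint, and some older eu-west-1 buckets report `EU`.

```ruby
# Map a GetBucketLocation LocationConstraint to a usable region name.
# S3 reports us-east-1 as a nil/empty constraint and some legacy
# eu-west-1 buckets as "EU".
def region_from_location_constraint(constraint)
  case constraint
  when nil, '' then 'us-east-1'
  when 'EU'    then 'eu-west-1'
  else constraint
  end
end

# Hypothetical helper: `client` is expected to be an Aws::S3::Client
# (aws-sdk-s3 gem). Requires the s3:GetBucketLocation permission.
def lookup_bucket_region(client, bucket)
  resp = client.get_bucket_location(bucket: bucket)
  region_from_location_constraint(resp.location_constraint)
end
```

This would be called from the `rescue Aws::S3::Errors::PermanentRedirect` branch, after which a new client can be built for the returned region.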
[ ] :wave: I may be able to implement this feature request
We're looking into how we can more gracefully handle or expose this information.
However, in the meantime the correct bucket region is exposed through the response headers which are available in the context:
begin
  resp = s3.get_object(bucket: bucket, key: key)
rescue Aws::S3::Errors::PermanentRedirect => e
  correct_bucket_region = e.context.http_response.headers['x-amz-bucket-region']
  # create a new client with that region and retry request...
end
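One way the "create a new client and retry" step might look, as a sketch rather than anything the SDK provides. `RegionalClients`, `region_from_redirect`, and `with_bucket_region` are hypothetical names. The error class is passed in as an argument so the helper has no hard dependency on the gem; in real use it would be `Aws::S3::Errors::PermanentRedirect`.

```ruby
# Keeps one client per region, so a redirect costs at most one new client.
# The factory block is expected to build a client for the given region,
# e.g. { |region| Aws::S3::Client.new(region: region) }.
class RegionalClients
  def initialize(&factory)
    @factory = factory
    @clients = {}
  end

  def client_for(region)
    @clients[region] ||= @factory.call(region)
  end
end

# Reads the region header from the failed response, as in the snippet above.
def region_from_redirect(error)
  error.context.http_response.headers['x-amz-bucket-region']
end

# Yields a client for default_region; on a redirect error, retries the block
# once with a client for the region named in the response headers.
def with_bucket_region(clients, default_region, redirect_error:)
  yield clients.client_for(default_region)
rescue redirect_error => e
  region = region_from_redirect(e)
  # Re-raise if there is no header or it names the region we already tried.
  raise if region.nil? || region == default_region
  yield clients.client_for(region)
end

# Usage (assumes the aws-sdk-s3 gem is loaded):
#   clients = RegionalClients.new { |r| Aws::S3::Client.new(region: r) }
#   with_bucket_region(clients, 'us-east-1',
#                      redirect_error: Aws::S3::Errors::PermanentRedirect) do |s3|
#     s3.get_object(bucket: bucket, key: key)
#   end
```

The retry is deliberately limited to one attempt, so a second redirect surfaces as a normal error instead of looping.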
Greetings! We’re closing this issue because it has been open a long time and hasn’t been updated in a while and may not be getting the attention it deserves. We encourage you to check if this is still an issue in the latest release and if you find that this is still a problem, please feel free to comment or open a new issue.
Just a guess, but could these 301 redirects be handled here?
https://github.com/aws/aws-sdk-ruby/blob/cfd0f0d05ad5e56cb00eee1dc81a244ca4c695f4/gems/aws-sdk-s3/lib/aws-sdk-s3/plugins/s3_signer.rb#L118-L122
I've noticed that the SDK already figures out the correct region automatically, with some exceptions. Maybe choking on bucket names with dots?
If it can't be handled automatically, it might help to at least explain what the limitations are. According to a comment, the retry logic "is intended for s3's global endpoint which will return 400 if the bucket is not in region." And there seems to be some disagreement within AWS about whether or not we're all horrible people for trying to access S3 that way. 😄
The SDKs and S3 decided NOT to handle 301 redirects, and instead to have customers update their code to use the correct region for the bucket. S3 has been moving away from global endpoints for various reasons (performance, ops) and prefers regional and account-based buckets instead. We have code to transition away from those global endpoints. I do think we can expose the endpoint/error more gracefully, but I think automatic retry for a permanent redirect is not a great idea.
That could be a logical decision, given that there is a workaround. The confusing bits are:
- Some redirects are handled automatically, while others aren't. That s3_signer.rb code automatically redirects 90% of buckets. The only buckets that ever see a 301 redirect are the ones with dots in the name. (Yeah technically it's not an "HTTP Redirect" since it's a 400, but the result is the same.)
- That means I can test my code with a bunch of buckets, show that everything works great, and then on production someone will immediately try it with a dot and tell me it doesn't work. 😄
That first bullet (along with other inconsistent messaging, abandoned plans, etc.) gives the impression that this issue is still undecided within AWS. There's one group that wants everyone to stop various legacy stuff, certainly with valid technical points and supporting evidence. Another group wants to just do what customers want, which is often a good idea too. Asking for a region is an inferior UI in many situations.
Removing that auto-redirect logic for 400 status codes would be slightly more consistent, but that still wouldn't fix the second bullet. Everyone would still have to realize that they need to test with buckets with dots and without dots, triggering two different response codes, and then handle them differently.
If there's documentation showing how to access S3 buckets without knowing the region, I haven't found it. I'd be happy to use the "one true way" for doing that! (Alex's workaround seems to work just fine, too.)
Two solutions that might make sense:
- Add a region: auto option, with a way to discover the final region after the request completes.
- Make get_bucket_location (or something similar) work without needing extra IAM permissions, and then tell us to use that. By forcing us to handle the region name, we all might be more likely to cache it somewhere permanently.
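The caching idea mentioned above could be as small as the following sketch. `BucketRegionCache` is a hypothetical name, and the resolver block stands in for whichever lookup is available (`get_bucket_location`, the `x-amz-bucket-region` header, or something else); neither is an SDK feature.

```ruby
# Remembers the region for each bucket so the lookup runs once per bucket.
class BucketRegionCache
  # The resolver block takes a bucket name and returns its region.
  def initialize(&resolver)
    @resolver = resolver
    @regions = {}
  end

  # Returns the cached region, calling the resolver only on a miss.
  def region_for(bucket)
    @regions.fetch(bucket) { @regions[bucket] = @resolver.call(bucket) }
  end

  # Drops a cached entry, e.g. after a redirect shows it was wrong.
  def forget(bucket)
    @regions.delete(bucket)
  end
end
```

An in-memory hash only lasts for the life of the process; persisting the mapping somewhere permanent is a separate decision.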
Buckets with dots are not valid DNS host labels, so they cannot be prepended to the URL. For example, with "my.bucket" we cannot route to my.bucket.s3.us-west-2.amazonaws.com. Such buckets are instead addressed by path, as /my.bucket/key. The legacy global endpoint is path based only, while regional endpoints accept both host label and path style addressing.
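To make the two addressing styles concrete, here is a rough sketch of how a URL might be chosen for a regional endpoint. The helper names and the host-label check are simplifications for illustration, not the SDK's actual rules, and the key is not URL-escaped.

```ruby
# Simplified check: lowercase letters, digits and hyphens, 3 to 63 characters,
# and no dots, so the bucket name can serve as a single DNS host label.
def host_label?(bucket)
  bucket.match?(/\A[a-z0-9][a-z0-9-]{1,61}[a-z0-9]\z/)
end

# Builds a regional S3 URL, using host-label addressing when the bucket name
# allows it and path-style addressing otherwise.
def s3_object_url(bucket, key, region)
  if host_label?(bucket)
    "https://#{bucket}.s3.#{region}.amazonaws.com/#{key}"
  else
    "https://s3.#{region}.amazonaws.com/#{bucket}/#{key}"
  end
end

puts s3_object_url('my-bucket', 'key', 'us-west-2')
puts s3_object_url('my.bucket', 'key', 'us-west-2')
```

The first call produces a host-label URL and the second falls back to the /my.bucket/key path form described above.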
⚠️COMMENT VISIBILITY WARNING⚠️
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
I've merged in a fix (#2838) that should address this. You should be able to view the bucket, region, and endpoint from the error data.
If anyone can find documentation fully describing <Endpoint>…</Endpoint>, <Bucket>…</Bucket>, and x-amz-bucket-region or these new endpoint/bucket/region data members, please let me know. I've managed to get something working, but it still feels like guesswork.
Ideally such documentation would also describe which redirects are handled automatically, and which redirects require extra code for me to handle. (So that I can test all the various scenarios to make sure my code works.)