amazonka
amazonka copied to clipboard
S3: Setting endpoint interferes with vhost style URLs
All s3 requests call s3vhost
after building the request. This moves the bucket name from the first segment in the path to the host name of the request URL. This only works as long as the S3 Service doesn't have the endpoint overridden. I tested this by using setEndpoint
, I couldn't find any other ways of overriding this endpoint. When the endpoint is overridden, neither the path nor the hostname have the bucket name.
My usecase is to support S3-compatible storage services like minio.
Additionally, it would be great if there was a way to specify whether I want to use path style URLs or vhost style URLs.
Yeah, it's a bit hacky. It would be good if we did something better and supported other S3-compatible object stores.
What does correct behaviour look like here? I'm not familiar with non-S3 stores. Would it be acceptable if we only did the rewriting when the endpoint matched s3.amazonaws.com
or s3.#{region}.amazonaws.com
?
Also, how does request signing work when you're talking to a non-AWS (non-IAM?) service?
What does correct behaviour look like here? I'm not familiar with non-S3 stores. Would it be acceptable if we only did the rewriting when the endpoint matched
s3.amazonaws.com
ors3.#{region}.amazonaws.com
?
Probably not, because I think there are other storages which already work with vhost style, like Google Cloud Storage, Scaleway Object Storage and even Minio (if configured). I think the current behavior is not too bad except that the bucket name shouldn't go missing in the final request made.
Additionally, though, I think it would be nice if this whole feature could be toggled on or off.
Also, how does request signing work when you're talking to a non-AWS (non-IAM?) service?
I don't understand how exactly the whole request signing and verification works. But I know it does work because my first attempt to make this work with minio was to add some hooks into the HTTP Manager to change the URL from vhost style to path style just before executing the requests. But the requests failed with 403 responses complaining about invalid signature.
I would love to see better support for non-S3 support stores. Unfortunately, I don't have capacity to research and build something like that right now, and my priority is on fixing the generator.
I am happy to help out where I can, by reviewing and/or merging PRs or providing a sounding board for design.
There's probably enough variation between object store services that the best thing to do is provide some request-rewriting hooks in the Env
or something (probably one at the AWSRequest
level, and one at the lower http-client
level, that users can override if the one in amazonka
doesn't do what's required. See #737.
Reclassifying as a bug and adding it to 2.0 since it's a regression that affects @basvandijk.
Oh, I can chime in on this a bit.
I was just trying to use Amazonka to replace the minio-hs
library. I was looking for the robustness of Amazonka's credential discovery and management options.
This took me an afternoon to figure out. I was so confused as to why my bucket wasn't in the request. I eventually tracked it down and a quick search in the issues brought me here.
It seems like some hosts may support vhosts for buckets and others might not. We'll never build a completely exhaustive list so we need a way to configure it. I think that makes the most sense at the service level and not the request level because it's more about the service endpoint than anything else. However, only the request actually knows the information that may need to go there.
What comes to mind is:
- A typeclass for requests to surface what part of their request should be used as a vhost, if any, e.g.
ToVHost a => a -> Maybe ByteString
- Change
Service
to include_serviceEndpoint :: Maybe ByteString -> Region -> ByteString
. Most existing services would just need aconst
prepended.
I'm going to evaluate some alternatives right now before I commit to writing a PR for something like that. We have some non-AWS deployments that use a Minio server to provide an object store while our cloud uses S3. I'll need to make sure my solution supports both options.
Other than this little issue and just having to spend a lot of time learning all the types, it's a great library! Keep it up!
I just had some shower thoughts. Clearly it would have be more complicated that this because the endpoint and the path are sort of a pair. They can pass through unchanged or the VHost rewriting could end up rewriting both. So maybe it's that the request's instance needs a way to take in the service and return the path and the host. That's getting very close to a class that returns the URI instead of just path, query separately.
This is the config that tells the generator to wire in vhost rewriting:
https://github.com/brendanhay/amazonka/blob/26466244b5a23b29ba8894ec89d922e6a6835a05/configs/services/s3.json#L7-L36
This is where the rewrite happens:
https://github.com/brendanhay/amazonka/blob/26466244b5a23b29ba8894ec89d922e6a6835a05/lib/amazonka-core/src/Amazonka/Request.hs#L169-L194
I'm not sure ToVHost
is right, because the service value needs to know whether or not it can do vhost rewriting. And then the s3vhost
function can check that before rewriting the request?
@brendanhay if you've got time, it would be great to have you weigh in.
#797 also has other discussion, including a possible S3VHost setting to go in the Service record.