botocore
Signed header uses host:port instead of host only
Describe the bug
At our company, we use a zero-trust solution that puts most services (including HTTPS) behind a non-standard local port (e.g. 7443 instead of 443).
Consequently, this bit breaks the signing:
https://github.com/boto/botocore/blame/5ba1dc100324723b56ff6953350d8218a85b63bf/botocore/auth.py#L83-L85
If I am not mistaken, the SDK should use only the host part (e.g. 127.0.0.1) when signing. But as you can see on line 85, the host is modified to include the port if my port does not match your default_ports scheme.
To wit: unless I am sorely mistaken, the canonical header should contain just host, never host:port. And the SDK should not confuse the endpoint (used for making requests) with what goes into the AWSv4 headers.
To make matters worse, it looks like the Golang SDK copy-pasted the same bits of code.
Expected Behavior
canonical header should only contain host
endpoint = '127.0.0.1:7443'  # endpoint is not the same thing as host for signing purposes
host = '127.0.0.1'
canonical_headers = "\n".join([f'{k}:{v}' for k, v in {
    'host': host,
    'x-amz-content-sha256': empty_string_hash,
    'x-amz-date': amzdate,
    'x-amz-security-token': credentials.token,
}.items()])
Current Behavior
Getting
"message":"The request signature we calculated does not match the signature you provided. ...
Reproduction Steps
Repeat the AWS example at https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html
once with host and once with host:port to see how the signature gets messed up.
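For illustration, here is a minimal sketch of that comparison. It assumes a header-signed GET with an empty body rather than the presigned-URL flow from the linked page, and it is not botocore's code. The function name sigv4_signature and the date, region, and service values are hypothetical; the secret key is the placeholder used in the AWS documentation.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    """HMAC-SHA256 of msg under key, returned as raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_signature(secret_key, host, amz_date, region, service,
                    path="/", query=""):
    """Compute a SigV4 signature for a GET with an empty body.

    Hypothetical helper for illustration only.
    """
    payload_hash = hashlib.sha256(b"").hexdigest()
    headers = {
        "host": host,
        "x-amz-content-sha256": payload_hash,
        "x-amz-date": amz_date,
    }
    # Canonical headers: lowercase names, sorted, each line ending in "\n".
    names = sorted(headers)
    canonical_headers = "".join(f"{n}:{headers[n]}\n" for n in names)
    signed_headers = ";".join(names)
    canonical_request = "\n".join(
        ["GET", path, query, canonical_headers, signed_headers, payload_hash]
    )

    date = amz_date[:8]
    scope = f"{date}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    key = _hmac(key, region)
    key = _hmac(key, service)
    key = _hmac(key, "aws4_request")
    return hmac.new(key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder inputs (assumptions, not values from this issue).
SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
args = dict(amz_date="20230101T000000Z", region="us-west-2", service="es")

sig_host_only = sigv4_signature(SECRET, "localhost", **args)
sig_host_port = sigv4_signature(SECRET, "localhost:7443", **args)
print(sig_host_only == sig_host_port)
```

The only input that changes between the two calls is the host value, so any difference in the signatures comes from that header alone. If the side verifying the request builds its canonical request with a different host value than the one the client signed, the two signatures will not match.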
Possible Solution
No response
Additional Information/Context
No response
SDK version used
botocore==1.29.110, boto3==1.26.97
Environment details (OS name and version, etc.)
py3.9, mac
Hi @aimran-adroll, thanks for reaching out about this. We do intentionally include the port if it's provided and not implicit with the scheme (http -> 80 and https -> 443). The HTTP RFC defines the Host header as follows (emphasis mine):
The "Host" header field in a request provides the host and port information from the target URI, enabling the origin server to distinguish among resources while servicing requests for multiple host names.
You won't typically encounter this scenario with a production AWS service though. Could you provide some more details about your use case and where you found information on the exclusion of the port in signing? Thanks!
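The rule described above (include the port unless it is the default for the scheme) can be sketched roughly as follows. This is an illustrative approximation, not botocore's actual implementation; host_header_value and DEFAULT_PORTS are names made up for this example.

```python
from urllib.parse import urlsplit

# Ports implied by the scheme; these are omitted from the Host value.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header_value(url: str) -> str:
    """Return the Host value a client would derive from url (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # urlsplit strips the brackets from IPv6 literals; put them back.
    if ":" in host:
        host = f"[{host}]"
    port = parts.port
    # Append the port only when it is explicit and not the scheme default.
    if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{port}"
    return host


print(host_header_value("https://127.0.0.1:7443/"))
print(host_header_value("https://127.0.0.1:443/"))
```

Under this rule an endpoint on port 7443 produces a Host value that includes the port, while the same endpoint on port 443 produces the bare host.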
We use Banyan for managing our corporate network. TL;DR: something sits in the middle to mediate access between the user (i.e. my laptop) and internal resources (AWS services). Consequently, instead of directly hitting https://awsendpoint.vpc.aws.com, we typically have an app that maps awsendpoint to localhost:PORT.
In any case, here is an example where the requests_aws4auth library does the right thing but the AWS SDKs (using the underlying botocore) fail. I am using OpenSearch as an example, but I am pretty sure it is going to be the same issue for any other service using botocore with a similar setup.
import boto3
from opensearchpy import OpenSearch, AWSV4SignerAuth, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

credentials = boto3.Session().get_credentials()

## THIS WORKS (does _not_ use botocore)
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "us-west-2", "es", session_token=credentials.token)

## !!!!! THIS DOES NOT
## (don't be thrown off by AWSV4SignerAuth; it merely uses botocore)
auth = AWSV4SignerAuth(credentials, "us-west-2")

### everything else remains the same
c = OpenSearch(
    hosts=[{"host": "localhost", "port": 7443}],
    http_auth=auth,  # <---------------------- switch
    connection_class=RequestsHttpConnection,
    use_ssl=True,
    verify_certs=False,
    ssl_show_warn=False,
)
After spending an absurd amount of time, I am fairly convinced it's the use of host:port in the canonical signature that's messing things up. I verified it by crafting the signature by hand and issuing raw requests.
Thanks for the info, @aimran-adroll. To quickly clarify, the Host header you're sending in your raw request to OpenSearch is localhost, or is this being transformed somewhere else? Can you provide the exact values being passed into Boto3 and the Host header you're expecting?
Taking a look at requests_aws4auth it looks like they just added tests specifically to ensure the port is preserved on the Host header. It seems there's some confusion around the expected behavior and this may be an issue specific to OpenSearch. I'm not sure we can modify this behavior without risking breakages to other use cases.
There is also a similar discussion around OpenSearch in the originating issue for that PR. Their specific case is due to setting the endpoint_url to a different domain than what the request is actually hitting. I'm not sure we can easily handle these implicit network mappings in the SDKs.
If you can provide the info requested above, it'll be helpful while we investigate internally. Thanks!
Thanks @nateprewitt for sticking with this. It's a touch difficult to explain, but here goes. I repeated the above (the two ways of generating auth), then used the debugger to drop down to the following places.
In both cases, I am passing host="localhost:7443".
case 1: requests_aws4auth
https://github.com/tedder/requests-aws4auth/blob/9e9e7bf25ad1962cd8dd77064216ee4cab8ca520/requests_aws4auth/aws4auth.py#L415
This is right after it has generated the canonical headers. I inspected the values and host is indeed localhost (sans port)
case 2: using opensearchpy.AWSV4SignerAuth
I dropped down to the following place in botocore
https://github.com/boto/botocore/blob/6caef35f5500ed60eaa9228039fc759233917dba/botocore/auth.py#L245
You can see that host is localhost:7443
Since case 2 fails to sign the request, it's reasonable to assume that the host in botocore is not set correctly at the signing phase in this particular scenario.
And just to be annoying 😅, I commented out the following lines in botocore, and 💥: case 2 works too.
https://github.com/boto/botocore/blob/6caef35f5500ed60eaa9228039fc759233917dba/botocore/auth.py#L83-L85
Ok interesting, so this may work, but is 100% not intended functionality. The Host header is the destination server, not where the request is originating. This is to distinguish which service you're addressing for co-habitated applications on the same server.
It looks like OpenSearch, and potentially other services, are interpreting "localhost" as the service host. I'm actually surprised they are accepting this to begin with, but it may be an unintentional omission.
"This is to distinguish which service you're addressing for co-habitated applications on the same server."
That certainly makes sense. I am not sure there is anything to be done here. I guess I will keep using requests_aws4auth, since it's unintentionally using the URI host for the Host header (at least until they "fix" it).
Thanks again
This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.