
sts assume-role-with-web-identity needs a backoff-retry mechanism

Open yannickvr opened this issue 3 years ago • 3 comments

Describe the bug

When using aws sts assume-role-with-web-identity in GitHub Actions, the command sometimes fails with a 400 response, possibly because STS cannot verify the token against the GitHub OIDC provider in a timely fashion.

Expected Behavior

aws sts assume-role-with-web-identity always succeeds, assuming configuration is correct

Current Behavior

aws sts assume-role-with-web-identity --role-arn arn:aws:iam::01234567890:role/GitHubActionsRole --role-session-name github --web-identity-token $(cat /tmp/web_identity_token_file) --region eu-west-1

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements

Reproduction Steps

Run the following command in GitHub Actions:

aws sts assume-role-with-web-identity --role-arn arn:aws:iam::01234567890:role/GitHubActionsRole --role-session-name github --web-identity-token $(cat /tmp/web_identity_token_file) --region eu-west-1

The role should be one that's set up for your GitHub repo. The identity token file can be created during the GitHub workflow.

Possible Solution

Retry the call a few times using an acceptable strategy (backoff-retry) before failing.
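A minimal sketch of what such a backoff-retry could look like on the caller's side, assuming the failure surfaces as an exception whose text contains `InvalidIdentityToken`. The names `retry_with_backoff` and `is_invalid_identity_token` are hypothetical helpers for illustration, not part of the AWS CLI or botocore, and the delay values are arbitrary.

```python
import random
import time


def is_invalid_identity_token(exc):
    """Treat only the error reported in this issue as retryable (assumption)."""
    return "InvalidIdentityToken" in str(exc)


def retry_with_backoff(call, is_retryable, max_attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call `call()` until it succeeds.

    Retryable errors are retried with exponential backoff and full jitter.
    The last error is re-raised once `max_attempts` is exhausted, and
    non-retryable errors are re-raised immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Upper bound grows as base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random amount between 0 and the bound.
            sleep(random.uniform(0, cap))
```

Here `call` would be whatever function performs the assume-role-with-web-identity request and raises on failure. Restricting retries to `InvalidIdentityToken` keeps genuine configuration errors from being retried needlessly.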

Additional Information/Context

2022-08-05 13:52:34,591 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=AssumeRoleWithWebIdentity) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8', 'User-Agent': 'aws-cli/2.7.20 Python/3.9.11 Linux/5.15.0-1014-azure exe/x86_64.ubuntu.20 prompt/off command/sts.assume-role-with-web-identity'}, 'body': {'Action': 'AssumeRoleWithWebIdentity', 'Version': '2011-06-15', 'RoleArn': 'arn:aws:iam::0123455678980:role/GitHubActionsRole', 'RoleSessionName': 'github', 'WebIdentityToken': '***'}, 'url': 'https://sts.eu-west-1.amazonaws.com/', 'context': {'client_region': 'eu-west-1', 'client_config': <botocore.config.Config object at 0x7f328f24c3a0>, 'has_streaming_input': False, 'auth_type': None}}
2022-08-05 13:52:34,591 - MainThread - botocore.hooks - DEBUG - Event request-created.sts.AssumeRoleWithWebIdentity: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7f328f24c370>>
2022-08-05 13:52:34,592 - MainThread - botocore.hooks - DEBUG - Event choose-signer.sts.AssumeRoleWithWebIdentity: calling handler <function disable_signing at 0x7f32929be940>
2022-08-05 13:52:34,592 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://sts.eu-west-1.amazonaws.com/, headers={'Content-Type': b'application/x-www-form-urlencoded; charset=utf-8', 'User-Agent': b'aws-cli/2.7.20 Python/3.9.11 Linux/5.15.0-1014-azure exe/x86_64.ubuntu.20 prompt/off command/sts.assume-role-with-web-identity', 'Content-Length': '1773'}>
2022-08-05 13:52:34,592 - MainThread - botocore.httpsession - DEBUG - Certificate path: /usr/local/aws-cli/v2/2.7.20/dist/awscli/botocore/cacert.pem
2022-08-05 13:52:34,592 - MainThread - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): sts.eu-west-1.amazonaws.com:443
2022-08-05 13:52:36,245 - MainThread - urllib3.connectionpool - DEBUG - https://sts.eu-west-1.amazonaws.com:443 "POST / HTTP/1.1" 400 390
2022-08-05 13:52:36,246 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amzn-RequestId': 'f571a4f3-502a-445c-982d-715272b15b01', 'Content-Type': 'text/xml', 'Content-Length': '390', 'Date': 'Fri, 05 Aug 2022 13:52:35 GMT', 'Connection': 'close'}
2022-08-05 13:52:36,246 - MainThread - botocore.parsers - DEBUG - Response body: b'<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">\n <Error>\n <Type>Sender</Type>\n <Code>InvalidIdentityToken</Code>\n <Message>Couldn\'t retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\n </Error>\n <RequestId>f57AAAAf3-502a-445c-982d-715272AAAA5b01</RequestId>\n</ErrorResponse>\n'
2022-08-05 13:52:36,246 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amzn-RequestId': 'f5AAAAA502a-445c-982d-71527AAAA5b01', 'Content-Type': 'text/xml', 'Content-Length': '390', 'Date': 'Fri, 05 Aug 2022 13:52:35 GMT', 'Connection': 'close'}
2022-08-05 13:52:36,246 - MainThread - botocore.parsers - DEBUG - Response body: b'<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">\n <Error>\n <Type>Sender</Type>\n <Code>InvalidIdentityToken</Code>\n <Message>Couldn\'t retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\n </Error>\n <RequestId>f571aAAAA02a-445c-982AAAA5272b15b01</RequestId>\n</ErrorResponse>\n'
2022-08-05 13:52:36,247 - MainThread - botocore.hooks - DEBUG - Event needs-retry.sts.AssumeRoleWithWebIdentity: calling handler <bound method RetryHandler.needs_retry of <botocore.retries.standard.RetryHandler object at 0x7f328f2656a0>>
2022-08-05 13:52:36,247 - MainThread - botocore.retries.standard - DEBUG - Not retrying request.
2022-08-05 13:52:36,247 - MainThread - botocore.hooks - DEBUG - Event after-call.sts.AssumeRoleWithWebIdentity: calling handler <bound method RetryQuotaChecker.release_retry_quota of <botocore.retries.standard.RetryQuotaChecker object at 0x7f328f265190>>
2022-08-05 13:52:36,247 - MainThread - awscli.clidriver - DEBUG - Exception caught in main()

CLI version used

2.7.20

Environment details (OS name and version, etc.)

GitHub Actions ubuntu-latest

yannickvr, Aug 16 '22 08:08

Hi @yannickvr, thanks for reaching out. Here's a knowledge center article that addresses the error you reported. One of the documentation links on that page is for Error retries and exponential backoff in AWS. For the AWS CLI specifically, here is documentation on retries. Have you tried any of the approaches listed there?

tim-finnigan, Aug 16 '22 16:08

Hi @tim-finnigan, thanks for the articles. Good to know it's a common problem.

I did manage to make it stable enough for production (I wrote a custom credential provider for AWS that does the retry, and forced the credential provider to use us-east-1 for STS since GitHub Actions runs in the US), but my point is that I shouldn't have to.
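For readers hitting the same error, the workaround described above could be sketched roughly as follows. This is not the commenter's actual code: it assumes a script wired in through the AWS `credential_process` profile setting, which pins STS calls to us-east-1, retries `InvalidIdentityToken` failures with doubling delays, and emits credentials in the JSON shape `credential_process` expects. The function names, attempt count, and delays are illustrative assumptions.

```python
import json
import subprocess
import time


def build_cli_command(role_arn, token, session_name="github", region="us-east-1"):
    """Build the assume-role-with-web-identity command, pinned to one region."""
    return [
        "aws", "sts", "assume-role-with-web-identity",
        "--role-arn", role_arn,
        "--role-session-name", session_name,
        "--web-identity-token", token,
        "--region", region,
        "--output", "json",
    ]


def to_credential_process_json(sts_response):
    """Convert the STS response into the credential_process output format."""
    creds = sts_response["Credentials"]
    return json.dumps({
        "Version": 1,
        "AccessKeyId": creds["AccessKeyId"],
        "SecretAccessKey": creds["SecretAccessKey"],
        "SessionToken": creds["SessionToken"],
        "Expiration": creds["Expiration"],
    })


def fetch_credentials(role_arn, token, run=subprocess.run,
                      attempts=4, sleep=time.sleep):
    """Run the CLI, retrying InvalidIdentityToken failures with doubling delays."""
    command = build_cli_command(role_arn, token)
    for attempt in range(attempts):
        result = run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return to_credential_process_json(json.loads(result.stdout))
        # Give up on other errors, or once the last attempt has failed.
        if "InvalidIdentityToken" not in result.stderr or attempt == attempts - 1:
            raise RuntimeError(result.stderr.strip())
        sleep(2 ** attempt)  # 1s, 2s, 4s, ...
```

A wrapper script would print the return value of `fetch_credentials(...)` to stdout, and a profile in `~/.aws/config` would point its `credential_process` setting at that script. The `run` and `sleep` parameters exist only so the retry logic can be exercised without calling AWS.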

yannickvr, Aug 18 '22 09:08

Hi @yannickvr, thanks for following up. It looks like this issue overlaps with a few issues in this repository. Have you looked through any of those issues? https://github.com/aws-actions/configure-aws-credentials/issues/299 in particular looks related, although that was closed due to a PR merged a few months ago. That repository might be the better place to leave feedback, as this relates to GitHub Actions.

tim-finnigan, Aug 23 '22 23:08

This happens in Terraform as well; I now have to write a custom script provider to get it to work.

arielb135, Oct 05 '22 17:10

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.

github-actions[bot], Dec 28 '22 19:12

Could this be re-opened? It is still an issue.

benjaminpottier, May 05 '23 13:05