pulumi-aws

Occasionally AWS SSO credentials do not load, resulting in a deployment failure

Open AaronFriel opened this issue 2 years ago • 6 comments

In both v5 and v6 of the provider, I occasionally see an error like the ones below, saying that the AWS SSO session file could not be opened. Retrying the pulumi command typically succeeds, and I can verify access by running aws sts get-caller-identity.

If I run stat ... on the file it fails to read, I see that it exists and was created around the time I ran aws sso login, several hours ago.

On v5.4.2:

  aws:ec2:SecurityGroup (db-security-group):
    error: Preview failed: 1 error occurred:
        * configuring Terraform AWS Provider: no valid credential sources for Terraform AWS Provider found.
    
    Please see https://registry.terraform.io/providers/hashicorp/aws
    for more information about providing credentials.
    
    AWS Error: failed to refresh cached credentials, the SSO session has expired or is invalid: failed to read cached SSO token file, open /home/friel/.aws/sso/cache/[redacted].json: input/output error

On v6.9.0:

  aws:ec2:RouteTable (eks-ai-fleet-sponge-public-2):
    error: Preview failed: unable to validate AWS credentials.
    Details: no valid credential sources for Pulumi AWS Classic found.
    
    Please see https://www.pulumi.com/registry/packages/aws/installation-configuration/
    for more information about providing credentials.
    
    AWS Error: failed to refresh cached credentials, the SSO session has expired or is invalid: failed to read cached SSO token file, open /home/friel/.aws/sso/cache/[redacted].json: input/output error
    
    Make sure you have set your AWS region, e.g. `pulumi config set aws:region us-west-2`.

OS: Linux on Windows Subsystem for Linux

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

AaronFriel avatar Nov 17 '23 20:11 AaronFriel

From this slack thread:

I was getting this error. It happens from time to time. This is the first time I've seen it since I started using Pulumi again at a new job, a few weeks ago. Prior to that at a previous workplace, I'd see the issue periodically, perhaps every couple of months. It's a little hard to track down. It's hard to say for sure that it wasn't network related, but what I do remember for sure at the time is usually aws sts get-caller-identity works in a shell while Pulumi is not working. Sometimes, it was fixed by using a different shell (opening up a new terminal) but not always.

error: unable to validate AWS credentials.
    Details: error validating provider credentials: error calling sts:GetCallerIdentity: operation error STS: GetCallerIdentity, exceeded maximum number of attempts, 3, https response error StatusCode: 0, RequestID: , request send failed, Post "https://sts.ap-southeast-1.amazonaws.com/": dial tcp: lookup sts.ap-southeast-1.amazonaws.com on 1.1.1.1:53: read udp 192.168.65.21:49796->1.1.1.1:53: i/o timeout
    
    Make sure you have:
    
         • Set your AWS region, e.g. `pulumi config set aws:region us-west-2`
         • Configured your AWS credentials as per https://pulumi.io/install/aws.html
         You can also set these via cli using `aws configure`.

My AWS credentials are properly configured and working in the shell using aws sts get-caller-identity --profile MYPROFILE

jamest-pin avatar Nov 22 '23 03:11 jamest-pin

I frequently get this error on macOS:

configuring Terraform AWS Provider: validating provider credentials: retrieving caller identity from STS: operation error STS: GetCallerIdentity, https response error StatusCode: 0, RequestID: , request send failed, Post "https://sts.us-east-1.amazonaws.com/": dial tcp: lookup sts.us-east-1.amazonaws.com on 192.168.0.1:53: no such host

This is likely a related issue, if not the same one.

AWS CLI works fine when this issue occurs.

The only reliable workaround is to run Pulumi in a Docker container shell and pass in the Pulumi and AWS environment variables.

infinitro-dev1 avatar Nov 27 '23 20:11 infinitro-dev1

There was an old thread about Pulumi and Terraform being incompatible with aws-vault and with temporary SSO credentials. I don't feel that is related to this, but FWIW I do utilize aws-vault. However the problem persists with classic credentials.

It would be worth a confirmation that the Pulumi/Terraform aws-vault incompatibility still exists or has been resolved. The prior thread seemed to indicate it was a "won't fix" issue, but I've never seen any disclaimers against using aws-vault with Pulumi.

I am using Pulumi AWS Classic and Pulumi AWS Crosswalk. The latter seemed to have an issue with non-classic credentials, but again, the problem above persists with classic AWS credentials.

infinitro-dev1 avatar Nov 27 '23 20:11 infinitro-dev1

I've come across something like this as well when trying to log in to a Pulumi state S3 bucket. Retrying didn't fix it either. My workaround was to find the cache file for the SSO profile and symlink it to the file that is missing. E.g.:

❯ export AWS_PROFILE="some-profile"
❯ pulumi login s3://<redacted>
error: problem logging in: read ".pulumi/meta.yaml": blob (key ".pulumi/meta.yaml") (code=Unknown): SSOProviderInvalidToken: the SSO session has expired or is invalid
caused by: open /home/tor/.aws/sso/cache/64c617cc9ffe5acce72ea3f39172622410ec899f.json: no such file or directory

# AWS CLI works fine though:
❯ aws s3 ls s3://<redacted>
                           PRE .pulumi/

❯ ln -s ~/.aws/sso/cache/e880436e045d29884fc18887312993ade8cbffe1.json ~/.aws/sso/cache/64c617cc9ffe5acce72ea3f39172622410ec899f.json
❯ pulumi login s3://<redacted>
Logged in to torrot-thinkpad as tor (s3://<redacted>)
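
For anyone trying to work out which cache file a tool is looking for, here is a minimal sketch. It assumes the cache file name is the SHA-1 hex digest of the SSO start URL (or of the sso-session name, for newer configs); that is my understanding of how the AWS tooling derives it, not something confirmed in this thread. The URLs are hypothetical placeholders and `sso_cache_path` is an illustrative helper, not part of any SDK.

```python
import hashlib
from pathlib import Path


def sso_cache_path(key: str, cache_dir: Path = Path.home() / ".aws" / "sso" / "cache") -> Path:
    """Return the cache file path for a key (start URL or sso-session name).

    Assumption: the file name is sha1(key).hexdigest() + ".json".
    """
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.json"


# Hypothetical start URLs: a small difference in the string gives a different file name.
for url in ("https://example.awsapps.com/start", "https://example.awsapps.com/start#"):
    print(url, "->", sso_cache_path(url).name)
```

If two tools hash slightly different strings, one of them will look for a file the other never wrote, which would match the "no such file or directory" error above.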

torrottum avatar Nov 30 '23 09:11 torrottum

I believe this is caused by https://github.com/aws/aws-sdk-go-v2/issues/2241

I think the steps to fix are:

  1. make sure the AWS CLI is up to date
  2. verify that you don't have a # in your SSO start url
  3. reauth, watch the problem disappear
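
Step 2 can be checked with a short script. This is only a sketch: it assumes the AWS config file parses as plain INI, and the sample profile and URLs are hypothetical. `start_urls_with_hash` is an illustrative helper name, not an existing tool.

```python
import configparser


def start_urls_with_hash(config_text: str) -> list[tuple[str, str]]:
    """Return (section, url) pairs whose sso_start_url contains a '#'."""
    # interpolation=None so '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    flagged = []
    for section in parser.sections():
        url = parser.get(section, "sso_start_url", fallback=None)
        if url and "#" in url:
            flagged.append((section, url))
    return flagged


# Hypothetical config content; in practice, read the text of ~/.aws/config instead.
sample = """
[profile some-profile]
sso_start_url = https://example.awsapps.com/start#/
sso_region = us-east-1

[sso-session my-session]
sso_start_url = https://example.awsapps.com/start
"""

print(start_urls_with_hash(sample))
```

Any section it reports is a candidate for removing the `#` and then re-running aws sso login.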

jaxxstorm avatar Dec 01 '23 21:12 jaxxstorm

@jaxxstorm Thanks, that fixed it!

torrottum avatar Dec 04 '23 11:12 torrottum

I was not able to repro, but I'm closing this out based on the excellent RCA above. https://github.com/aws/aws-sdk-go-v2/issues/2241 is fixed, and the fix is included in the latest pulumi-aws provider. Please let us know if you are still hitting this issue.

t0yv0 avatar Jun 13 '24 20:06 t0yv0