Pulumi refresh uses cached auth info
What happened?
When trying to run pulumi refresh on a stack using an ESC env, I was getting this error:
error: Preview failed: 1 error occurred:
* Retrieving AWS account details: validating provider credentials: retrieving caller identity from STS: operation error STS: GetCallerIdentity, https response error StatusCode: 403, RequestID: 60765ba4-5206-41f2-b209-6b3d381c0f5d, api error ExpiredToken: The security token included in the request is expired
*
This was confusing because esc run <env> -- aws sts get-caller-identity worked fine. Eventually I tried running pulumi up and the error went away, leading me to believe it busted some cache of an auth token.
A similar thing happened to me previously, not with expired creds but with changing my AWS configuration. I switched from having a hardcoded aws:profile in the config to using an ESC env, and the refresh wouldn't succeed until I ran pulumi up.
Example
I believe this should repro it, but I haven't tried it myself:
- create a Pulumi stack using an OIDC-based ESC environment with the minimum duration (15m)
- run pulumi up
- wait 16m
- try running pulumi refresh
Output of pulumi about
CLI
Version 3.95.0
Go Version go1.21.4
Go Compiler gc
Plugins
NAME VERSION
aws 6.13.2
nodejs unknown
Host
OS darwin
Version 14.1
Arch arm64
This project is written in nodejs: executable='/Users/redacted/.nvm/versions/node/v17.0.0/bin/node' version='v17.0.0'
Backend
...
Token type personal
Dependencies:
NAME VERSION
@pulumi/aws 6.13.2
@pulumi/pulumi 3.94.2
@types/node 16.18.62
typescript 4.9.5
Additional context
No response
A similar thing happened to me previously, not with expired creds but with changing my AWS configuration. I switched from having a hardcoded aws:profile in the config to using an ESC env, and the refresh wouldn't succeed until I ran pulumi up.
Yeah it sounds like something similar is happening here. Is your program explicitly passing credentials to the AWS provider?
In the expired STS token case, no. In the aws:profile case, also no, but IIRC in that case the error was with a manually constructed Kubernetes provider - I wasn't passing the AWS creds in manually, but I did see the AWS_PROFILE getting removed from some blob in the k8s provider when I ran pulumi up. I think they could be separate issues.
In the expired STS token case, no
Interesting. Typically what we've seen in scenarios like this is encrypted credentials ending up in the statefile that then get reused by pulumi refresh. The pulumi up unblocks this because it ends up fetching new creds, which then get stored in the statefile and picked up by the next pulumi refresh. Can you still repro this? If so, would you be able to share a statefile?
A little more context here: this issue is caused by the fact that pulumi refresh and pulumi destroy don't run the Pulumi program; they use credentials that are stored in the state file if they exist. The current workaround is to not store your credentials in state. Practically, this means using authentication via environment variables instead of configuration values where possible.
There's an open issue to document this for GCP - https://github.com/pulumi/pulumi-gcp/issues/1815
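To make the distinction concrete, here is a rough sketch (illustrative only; the config keys and provider names are hypothetical) of which provider inputs do and do not end up in state:
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Credentials supplied as explicit inputs (here sourced from stack config, as
// they are when an ESC env populates aws:accessKey/aws:secretKey/aws:token)
// are written to the statefile and reused by pulumi refresh / pulumi destroy.
const cfg = new pulumi.Config("aws");
const fromConfig = new aws.Provider("from-config", {
    region: cfg.require("region"),
    accessKey: cfg.get("accessKey"),
    secretKey: cfg.getSecret("secretKey"),
    token: cfg.getSecret("token"),
});

// A provider with no explicit credentials reads AWS_ACCESS_KEY_ID,
// AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN from the environment at
// runtime, so no credential material is stored in state.
const fromEnv = new aws.Provider("from-env", { region: "us-east-1" });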
@komalali in our case, it's an AWS ECR resource that's having this issue, and the env doesn't have exported env vars:
values:
  login:
    fn::open::aws-login:
      oidc:
        roleArn: <role>
        sessionName: pulumi-environments-session
        duration: "1h"
  region: us-east-1
  pulumiConfig:
    aws:region: ${region}
    aws:accessKey: ${login.accessKeyId}
    aws:secretKey: ${login.secretAccessKey}
    aws:token: ${login.sessionToken}
Does your comment suggest there's a bug in the pulumi-aws repo as well (in that it's caching the STS token in state)?
Entries in the Pulumi config get stored in state.
aws:accessKey: ${login.accessKeyId}
aws:secretKey: ${login.secretAccessKey}
aws:token: ${login.sessionToken}
These should be environment variables.
Environment variables are not a sufficient solution either, at least not for AWS. It will work for a single instance of a provider, but if you need to use more than one account, even if you provide credentials explicitly to the second provider, it will still get confused (trying to assume role, etc). Perhaps this is a bug in the AWS provider, however.
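For context, the multi-account shape being described looks roughly like the following (a sketch with hypothetical names, not taken from a real program):
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// First account: credentials are read from the AWS_* environment variables at
// runtime, so nothing credential-like is recorded in state for this provider.
const accountA = new aws.Provider("account-a", { region: "us-east-1" });

// Second account: the environment variables are already claimed by the first
// account, so its credentials have to be explicit provider inputs -- and
// explicit inputs are stored in the statefile and reused by refresh/destroy.
const cfgB = new pulumi.Config("accountB");
const accountB = new aws.Provider("account-b", {
    region: "us-west-2",
    accessKey: cfgB.get("accessKey"),
    secretKey: cfgB.getSecret("secretKey"),
    token: cfgB.getSecret("token"),
});

// Resources are pinned to the second account via the provider resource option.
const replica = new aws.s3.Bucket("replica", {}, { provider: accountB });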
This problem has deep roots in the programming model, and I want to offer some context for the behavior.
There are three kinds of Pulumi operations:
- pulumi up
- pulumi refresh
- pulumi destroy
The first operation is distinctly different from the latter two in that it involves running the Pulumi program associated with the stack's project. As it runs, the Pulumi program defines the desired state for resources--including provider resources--using values computed by the program in coordination with the Pulumi engine. When the program creates a provider resource, the inputs for the provider are either sourced from the program itself (i.e. from values provided by the program) or are read out-of-band by the provider plugin. The exact set of configuration that may be sourced from the environment is particular to each provider--for example, the Kubernetes provider uses the ambient kubeconfig by default, the AWS provider reads various AWS-specific environment variables, etc. Any explicitly-provided inputs are written into the stack's statefile.
For example, consider the following program:
import * as aws from "@pulumi/aws";
// region is an explicit input and will be recorded in the statefile.
const usEast1 = new aws.Provider("us-east-1", { region: "us-east-1" });
// No explicit inputs: the region is read from the environment at runtime.
const defaultRegion = new aws.Provider("default-region");
The usEast1 provider's region is explicitly specified by the program, but the defaultRegion provider's region will be read from the environment (e.g. from the AWS_REGION environment variable). In the resulting statefile, the usEast1 provider's state will include the region input, but the defaultRegion provider's state will not.
Because pulumi refresh and pulumi destroy do not run the Pulumi program associated with the stack's project, they are unable to recompute configuration values that were explicitly provided by the program, and must use the values stored in the statefile. Unfortunately, this may include credential information, which is what causes the errors described here. The current workaround--which is certainly not sufficient for explicitly-instantiated providers--is to use environment variables to provide credentials out-of-band.
The clearest/most complete solution here is to run the Pulumi program associated with a stack's project as part of pulumi refresh and pulumi destroy. Unfortunately, this is a major behavioral change, and the exact semantics of the run are not clear.
Closing this as a duplicate of https://github.com/pulumi/pulumi/issues/4981. We'll use that issue to track further progress on workarounds and solutions for the core problem.
For anyone looking, here is an example of an environment based on environment variables:
values:
  login:
    fn::open::aws-login:
      oidc:
        duration: 1h
        roleArn: <role>
        sessionName: pulumi-environments-session
  region: us-west-2
  environmentVariables:
    AWS_ACCESS_KEY_ID: ${login.accessKeyId}
    AWS_SECRET_ACCESS_KEY: ${login.secretAccessKey}
    AWS_SESSION_TOKEN: ${login.sessionToken}
  pulumiConfig:
    aws:region: ${region}
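With an environment like this attached to a stack, the program itself doesn't pass credentials anywhere; the default AWS provider reads the injected AWS_* variables at runtime, so nothing credential-like ends up in state. A minimal sketch of the program side (the resource name is just illustrative):
import * as aws from "@pulumi/aws";

// Credentials come from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY /
// AWS_SESSION_TOKEN injected by the environment; aws:region comes from pulumiConfig.
const repo = new aws.ecr.Repository("app-images");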