configure-aws-credentials
Error object is empty breaking action
Action stopped working today, after working for multiple weeks:
runs-on: ubuntu-20.04
steps:
  - name: Configure AWS Credentials
    uses: aws-actions/configure-aws-credentials@v1
    with:
      role-to-assume: arn:aws:iam::[REDACTED]:role/[REDACTED]
      role-duration-seconds: 3600
      aws-region: us-east-1
Error: Error message: Cannot read property 'message' of undefined
Seems to be somewhat temperamental; it's working again now.
I've also seen this happen a couple of times today but not consistently. A search (https://github.com/aws-actions/configure-aws-credentials/search?q=message) indicates that this could be an error masking another error?
Edit: Also seeing this when assuming a role via OIDC.
Same problem here... do you have any ideas where to dig into the source, or what to do to make it more stable or get closer to the cause of the problem?
We're observing the same flaky behaviour as of today (assuming a role via OIDC). We can't locate the problem yet and suspect it's related to the runners (we're using the default runners). One run fails, the next one works again. Very strange.
Wild guessing here, but I noticed while adding GitHub as an OIDC provider in AWS that I get different SSL certificate thumbprints, such as:
- 6938fd4d98bab03faadb97b34396831e3780aea1 (seems to be the most current one, according to GitHub's blog post and the example in the configure-aws-credentials action's README)
- 15e29108718111e59b3dad31954647e3c344a231 (AWS calculated that thumbprint when I created the GitHub OIDC provider on 2022-02-22)
- a031c46782e6e6c662c2c87c76da9aa62ccabd8e (seems to be an older one)
There is also an open issue here: https://github.com/aws-actions/configure-aws-credentials/issues/357 (though the error message here is more on point than in that issue).
TL;DR: You might want to take a look at your thumbprint list and add the 693... one if it's not already there.
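If you would rather script that check than click through the console, here is a rough sketch using the AWS SDK for JavaScript v3 (the provider ARN is a placeholder, and the whole snippet is my own illustration rather than anything the maintainers recommend):

// Sketch: inspect, and if necessary extend, the thumbprint list on the GitHub OIDC provider.
// Assumes @aws-sdk/client-iam is installed and AWS credentials are available.
const {
  IAMClient,
  GetOpenIDConnectProviderCommand,
  UpdateOpenIDConnectProviderThumbprintCommand,
} = require('@aws-sdk/client-iam');

const client = new IAMClient({ region: 'us-east-1' });
// Placeholder ARN: substitute your own account ID.
const OpenIDConnectProviderArn =
  'arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com';
const expected = '6938fd4d98bab03faadb97b34396831e3780aea1';

(async () => {
  const { ThumbprintList = [] } = await client.send(
    new GetOpenIDConnectProviderCommand({ OpenIDConnectProviderArn })
  );
  console.log('Current thumbprints:', ThumbprintList);

  if (!ThumbprintList.includes(expected)) {
    // Keep the existing entries and append the one from GitHub's announcement.
    await client.send(
      new UpdateOpenIDConnectProviderThumbprintCommand({
        OpenIDConnectProviderArn,
        ThumbprintList: [...ThumbprintList, expected],
      })
    );
    console.log('Added the 6938... thumbprint.');
  }
})();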
The issue described here must be intermittent. The message field is a property we reference on the error objects when trying various things while running this action, so these objects are sometimes empty for some reason. This doesn't seem to be affecting many people consistently; if you do run into it in the future, please comment on how you are trying to assume the role (e.g. OIDC or access keys).
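As a standalone illustration (not the action's actual code): if whatever gets thrown internally is undefined rather than an Error object, reading .message on it raises exactly the TypeError reported above.

try {
  throw undefined; // stand-in for whatever empty value is being thrown internally
} catch (error) {
  // error is undefined here, so this line itself throws the
  // "Cannot read ... 'message' of undefined" TypeError seen in the logs above
  console.log(`Error message: ${error.message}`);
}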
We are also having this flaky behaviour while assuming roles with OIDC
Experiencing flaky behaviour assuming role via OIDC.
Using thumbprint 6938fd4d98bab03faadb97b34396831e3780aea1
I'm going to also request that people share their full workflow files if possible. It seems this is limited to OIDC, which is helpful to know 🙂
We are having a similar experience using aws-actions/configure-aws-credentials@v1-node16:
Error: Error message: Cannot read properties of undefined (reading 'message')
Cleaned log:
Run aws-actions/configure-aws-credentials@v1-node16
  with:
    role-to-assume: arn:aws:iam::***:role/some/path/some-role-1MSY53MY098H2
    aws-region: us-east-1
    role-duration-seconds: 900
    audience: sts.amazonaws.com
  env:
    AWS_DEFAULT_REGION: us-east-1
    AWS_REGION: us-east-1
    AWS_ACCESS_KEY_ID: ***
    AWS_SECRET_ACCESS_KEY: ***
    AWS_SESSION_TOKEN: ***
    ... other env vars
We are also seeing it using OIDC.
We're also getting intermittent failures with OIDC. It's working the majority of the time though. I don't see any logs in AWS CloudTrail for an attempted OIDC login around the time of the error.
Workflow:
- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@<version>
  with:
    role-to-assume: // our IAM role ARN
    aws-region: // our AWS region
Output:
Run aws-actions/configure-aws-credentials@<version>
  with:
    role-to-assume: // our IAM role ARN
    aws-region: // AWS region
    audience: sts.amazonaws.com
  env:
    ...
Error: Error message: Cannot read property 'message' of undefined
I also encountered this today in a workflow that has many parallel jobs that independently authed with AWS successfully - just the one job failed. I suspect that something is being raised in run() that doesn't produce a typical Error object.
I haven't any information on where this is being thrown from, but I have enabled ACTIONS_STEP_DEBUG in case I can catch it again.
In the meantime, perhaps the team could change the catch-all core.setFailed(error.message) call to core.setFailed(error.toString()) so that more of the error is emitted. Alternatively, it could be called with an Error object, since actions/toolkit/core internally calls error.toString() anyway.
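For illustration only, that suggestion might look something like the following (a sketch against a hypothetical entry point, not a tested patch of this action):

const core = require('@actions/core');

async function run() {
  // ... the action's existing logic would go here ...
}

run().catch((error) => {
  // Passing the whole Error (or an explicit string) instead of error.message
  // avoids the secondary TypeError when a non-Error value is thrown, and
  // keeps whatever detail the original failure carried.
  if (error instanceof Error) {
    core.setFailed(error);
  } else {
    core.setFailed(`Unexpected non-Error value thrown: ${String(error)}`);
  }
});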
Here's my unhelpful log :)
2023-02-13T19:43:12.0069544Z ##[group]Run <ORG-NAME>/<PRIVATE-CUSTOM-ACTION-FOR-AWS-AUTH>@v1
2023-02-13T19:43:12.0069850Z with:
2023-02-13T19:43:12.0070072Z role-to-assume: <ROLE>
2023-02-13T19:43:12.0070295Z account-name: <ACCOUNT-NAME>
2023-02-13T19:43:12.0070501Z aws-region: <ACCOUNT-REGION>
2023-02-13T19:43:12.0070722Z mask-aws-account-id: false
2023-02-13T19:43:12.0070934Z ##[endgroup]
2023-02-13T19:43:12.0301300Z ##[group]Run /home/runner/work/_actions/<ORG-NAME>/<PRIVATE-CUSTOM-ACTION-FOR-AWS-AUTH>/v1/configure.sh
2023-02-13T19:43:12.0301829Z /home/runner/work/_actions/<ORG-NAME>/<PRIVATE-CUSTOM-ACTION-FOR-AWS-AUTH>/v1/configure.sh
2023-02-13T19:43:12.0350726Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2023-02-13T19:43:12.0350996Z env:
2023-02-13T19:43:12.0351201Z ROLE: <ROLE>
2023-02-13T19:43:12.0351424Z ACCOUNT_NAME: <ACCOUNT-NAME>
2023-02-13T19:43:12.0351619Z ##[endgroup]
2023-02-13T19:43:12.0588288Z ##[group]Run aws-actions/configure-aws-credentials@v1-node16
2023-02-13T19:43:12.0588664Z with:
2023-02-13T19:43:12.0588951Z role-to-assume: arn:aws:iam::<ACCOUNT-ID>:role/<ROLE>
2023-02-13T19:43:12.0589401Z aws-region: <ACCOUNT-REGION>
2023-02-13T19:43:12.0589623Z mask-aws-account-id: false
2023-02-13T19:43:12.0589849Z audience: sts.amazonaws.com
2023-02-13T19:43:12.0590057Z ##[endgroup]
2023-02-13T19:43:12.2587488Z ##[error]Error message: Cannot read properties of undefined (reading 'message')
Hi, we are also experiencing this issue and enabled SHOW_STACK_TRACE, but it only points out that the error is undefined somehow:
Error: Error message: Cannot read properties of undefined (reading 'message')
/home/runner/work/_actions/aws-actions/configure-aws-credentials/v2/dist/index.js:585
throw new Error(`Error message: ${error.message}`);
After examining the code, it seems to be thrown in getIDToken within the OIDC client:
static getIDToken(audience) {
    return __awaiter(this, void 0, void 0, function* () {
        try {
            // New ID Token is requested from action service
            let id_token_url = OidcClient.getIDTokenUrl();
            if (audience) {
                const encodedAudience = encodeURIComponent(audience);
                id_token_url = `${id_token_url}&audience=${encodedAudience}`;
            }
            core_1.debug(`ID token url is ${id_token_url}`);
            const id_token = yield OidcClient.getCall(id_token_url);
            core_1.setSecret(id_token);
            return id_token;
        }
        catch (error) {
            throw new Error(`Error message: ${error.message}`); // <------ ERROR IS THROWN HERE
        }
    });
}
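A more defensive version of that catch (purely illustrative; this is not what the toolkit currently ships) would surface the underlying failure instead of the secondary TypeError. A minimal sketch, wrapping @actions/core's public getIDToken:

const core = require('@actions/core');

// Hypothetical helper: normalize whatever was thrown before reading .message,
// so an undefined rejection can't produce a second TypeError that hides it.
async function getIDTokenSafely(audience) {
  try {
    return await core.getIDToken(audience);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`Failed to request OIDC ID token: ${message}`);
  }
}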
I've opened an issue on the GitHub toolkit repo, where the OIDC client is defined.
We have the same issue, with OIDC and self-hosted runners. Every time we restart the failed workflow, it works. Could it be that idle runners somehow get into a broken state (an expired GITHUB_TOKEN?) and the token exchange over OIDC fails because of that?
It's been failing intermittently pretty much every day, and it does seem to happen more frequently when there are more concurrent GitHub jobs.
Do you plan to re-prioritise this and assign someone to do a more thorough investigation?
The workflow file is super simple:
... // other stuff
- uses: aws-actions/configure-aws-credentials@v2
  with:
    aws-region: ${{ inputs.aws-region }}
    role-to-assume: ${{ inputs.aws-oidc-role-arn }}
...
It's been failing intermittently pretty much every day and it does seem to happen more frequently when there are more concurrent GitHub jobs.
I can second that. It has become more frequent over the last few days.
@lkeijmel can you share the issue here maybe?
GitHub has been having some intermittent issues the past couple of days; any increased errors could be due to that.
We are nearing a v3 release; some stuff has to get sorted out internally first, so I can't give any dates. I doubt this error would stop in v3 though, since we aren't changing the way we're using getIDToken().
I've never been able to reproduce it, so I haven't been able to really investigate it. Hopefully the issue @lkeijmel opened in the actions/core repo may shed some light. Based on this thread and its increased activity alongside some outages, I'd guess that it occurs as an intermittent GitHub issue, and getIDToken() isn't handling that case properly. Thanks to @lkeijmel for sharing the investigation showing that it occurs, at least some of the time, when getIDToken() is called.
@lmeynberg here's the issue https://github.com/actions/toolkit/issues/1441
For now I've forked this repo, added some additional logging in and around the getIDToken() function, and we use that fork in our workflows, so hopefully we can pinpoint the issue further. Yesterday the workflows didn't have any issues, so now it's a matter of waiting for the next incident.
We've been deploying without any issues so far this week, only an occasional expired-security-token error, but that was due to a longer run and a short TTL on the token itself. I just saw that GitHub has written a blog post about some OIDC issues that could be related to this: https://github.blog/changelog/2023-06-27-github-actions-update-on-oidc-integration-with-aws/
This went away with the thumbprint change, but started happening again today. @lkeijmel can you also see the issue on your side?
We don't currently see it on our deployments. After the GH team's blog post we double-checked the thumbprints, but still no failures.
I don't think this issue is related to the thumbprint issue; this error would occur before it gets a chance to send the request to the IdP if it's coming from the getIDToken call.
We've implemented retry and backoff on the getIDToken call in the next major version; we're working towards a release for that.
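For anyone who can't wait for that release, here is a rough sketch of the same idea (my own approximation built on the toolkit's public getIDToken, not the actual v3 code; the attempt count and delays are arbitrary):

const core = require('@actions/core');

// Retry the ID token request with exponential backoff before giving up.
async function getIDTokenWithRetry(audience, attempts = 5) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await core.getIDToken(audience);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      if (i === attempts - 1) {
        throw new Error(`getIDToken failed after ${attempts} attempts: ${message}`);
      }
      const delayMs = 1000 * 2 ** i; // 1s, 2s, 4s, ...
      core.info(`getIDToken attempt ${i + 1} failed (${message}); retrying in ${delayMs} ms`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}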
I just saw a blog post from GitHub stating that pinning is no longer required, so the thumbprint shouldn't be the issue.
We now implement retry behavior when the getIDToken call fails in v3, so please let me know if upgrading to v3 helps with this at all.
I've subscribed to the issue @lkeijmel created, so I'll be up to date if there's any update there. Otherwise, I think the retry behavior on the token in v3 should patch this up.
Note: Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue, feel free to do so.