azure-sdk-for-js icon indicating copy to clipboard operation
azure-sdk-for-js copied to clipboard

Managed Identity - DefaultAzureCredential - RestError: [TokenExpired] Error validating token: 'S2S12086'

Open Delapouite opened this issue 3 years ago • 8 comments

  • Package Name:
  • @azure/identity
  • Package Version:
  • ^2.0.0
  • Operating system:
  • Linux
  • [x] nodejs
    • version: 16.x
  • [ ] browser
    • name/version:
  • [ ] typescript
    • version:
  • Is the bug related to documentation in
    • [ ] README.md
    • [ ] source code documentation
    • [ ] SDK API docs on https://docs.microsoft.com

Describe the bug

Hello

I have a Node.js application running in App Service that needs to fetch secrets from a KeyVault. This App Service has a Managed Identity. Here's the relevant pseudo code :

const secretClient = new SecretClient(this.keyVaultUrl, new DefaultAzureCredential());
const secret = (await secretClient.getSecret(secretName))?.value;

This has been working great for months, but suddently about 10 days ago, our app wasn't able to fetch secrets anymore. A first basic look at the logs, showed this error message, coming from the underlying library:

RestError: [TokenExpired] Error validating token: 'S2S12086'

We're using DefaultAzureCredential, which in our case instantiate a ManagedIdentityCredential in the end. So it looks like that there's an issue with the token expiration date received by the App Service by the internal Azure issuer.

To better understand what's happening under the hood, I enabled the maximum verbosity with AZURE_LOG_LEVEL=verbose environment variable.

Then, one particular line looks quite suspicious at the end of the token acquisition dance :

azure:identity:info IdentityClient: [http://172.16.2.7:8081/msi/token?resource=https%3A%2F%2Fvault.azure.net&api-version=2019-08-01] token acquired, expires on NaN

It corresponds to the following line : https://github.com/Azure/azure-sdk-for-js/blob/main/sdk/identity/identity/src/client/identityClient.ts#L140

The NaN at the end of the log line (which is token.accessToken.expiresOnTimestamp) looks quite surprising and I think it may be related to the problem we experiencing.

Do you confirm this intuition? Thanks.

Delapouite avatar Jul 27 '22 12:07 Delapouite

This seems a little strange, do you notice the same issue using the latest version of @azure/identity 2.1.0?

You could also cross-verify the expiration timestamp by running this snippet to make sure it's not some error in string conversion -

const credential = new DefaultAzureCredential();
const accessTokenObject = credential.getToken("https://vault.azure.net/.default");
console.log(accessTokenObject.expiresOnTimestamp);

Also this structure - token.accessToken.expiresOnTimestamp looks incorrect. The accessToken object structure returned by the Identity Credentials is like this -

{
"token" : "<JWT-TOKEN>",
"expiresOnTimestamp" : <timestamp number>
}

KarishmaGhiya avatar Jul 29 '22 22:07 KarishmaGhiya

Hi.

After a bit more investigation, the problem appeared because our dependency on @azure/identity was a bit loose semver-wise. Meaning that everything was fine with version 2.0.5 and problem started to appear when 2.1.0 was released.

My understanding after looking at the log message is that the library is working well with a token returned from an endpoint using the 2017 API but something wrong is happening as soon as it starts interacting with an endpoint the 2019 API.

Delapouite avatar Aug 01 '22 06:08 Delapouite

I'm experiencing this problem and it appears that it started after upgrading from 2.0.5 -> 2.1.0. I'm yet to downgrade my package to verify the problem, but I will attempt to do that and report back.

dhensby avatar Aug 02 '22 09:08 dhensby

Hi.

After a bit more investigation, the problem appeared because our dependency on @azure/identity was a bit loose semver-wise. Meaning that everything was fine with version 2.0.5 and problem started to appear when 2.1.0 was released.

My understanding after looking at the log message is that the library is working well with a token returned from an endpoint using the 2017 API but something wrong is happening as soon as it starts interacting with an endpoint the 2019 API.

Can you enable the logs and provide it so that I understand what the error message is when it starts interacting with 2019 API? https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/identity/identity#logging

KarishmaGhiya avatar Aug 04 '22 17:08 KarishmaGhiya

Hi, we're sending this friendly reminder because we haven't heard back from you in a while. We need more information about this issue to help address it. Please be sure to give us your input within the next 7 days. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

msftbot[bot] avatar Aug 16 '22 20:08 msftbot[bot]

This is still an issue

dhensby avatar Aug 16 '22 20:08 dhensby

@dhensby @Delapouite Can you provide us with more logs so that we understand what the error message is when it starts interacting with 2019 API? https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/identity/identity#logging

KarishmaGhiya avatar Aug 16 '22 22:08 KarishmaGhiya

I don't have the capacity at the moment to run this in any of my environments with the bad release. I can confirm the problem is not present in 2.0.5 and it is present in 2.1.0.

Sorry I can't be of more help in the immediate term on this.

dhensby avatar Aug 16 '22 22:08 dhensby

Hi, we're sending this friendly reminder because we haven't heard back from you in a while. We need more information about this issue to help address it. Please be sure to give us your input within the next 7 days. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

msftbot[bot] avatar Aug 24 '22 02:08 msftbot[bot]

@msftbot - what info are you waiting on?

dhensby avatar Aug 24 '22 12:08 dhensby

I ran a test over the weekend. Just a simple timer triggered Function that writes to Cosmos DB. I had three versions of this Function App: "@azure/identity": "^2.0.5", "@azure/identity": "^2.1.0", and "@azure/identity": "^3.0.0-beta.1"

const { DefaultAzureCredential } = require("@azure/identity");
const { CosmosClient } = require("@azure/cosmos")

const client = new CosmosClient({
    endpoint: process.env["CosmosDb__accountEndpoint"],
    aadCredentials: new DefaultAzureCredential(),
  });
const database = client.database("a")
const container = database.container("a")

module.exports = async function (context, myTimer) {
    var timeStamp = new Date().toISOString();

    const id = "1";
    const partitionKey = "205";
    const res = await container.items.upsert({
                    id,
                    partitionKey,
                    timeStamp});
};

v2.1.0 as well as v3.0.0-beta.1 failed after ~24h. Not so v2.0.5. It still runs fine after more than 48h.

mfehlhaber avatar Aug 29 '22 07:08 mfehlhaber

Just lost 2 days troubleshooting this 😔

As reported by multiple people above, on version 2.1.0 sometimes the expiresOnTimestamp is NaN, which caused our internal refresh logic to fail and never refresh it, so after 24h the token became invalid and we were still trying to use it.

For context, we're using App Service with user-assigned managed identity.

goenning avatar Aug 30 '22 15:08 goenning

I have the same problem. We use App Service with Managed Identities.

Enabling logs using AZURE_LOG_LEVEL=verbose, I notice that the token is considered expiring on "NaN". As well, I can see in logs that the AppServiceMSI 2019. In appServiceMsiApp2019.ts, I think there's something wrong here: return Date.parse(requestBody.expires_on! as string); because "expires_on" (got from response payload) is an stringified epoch, this leads to NaN. so something like this could be more correct: return new Date(requestBody.expires_on);

BenLegra avatar Sep 06 '22 14:09 BenLegra

Hi @KarishmaGhiya, I confirm that this line in appServiceMsiApp2019.ts is incorrect (here) : return Date.parse(requestBody.expires_on! as string); should be : return parseInt(requestBody.expires_on!) * 1000;

BenLegra avatar Sep 08 '22 12:09 BenLegra

Hi, we're sending this friendly reminder because we haven't heard back from you in a while. We need more information about this issue to help address it. Please be sure to give us your input within the next 7 days. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

msftbot[bot] avatar Sep 15 '22 20:09 msftbot[bot]

Looks that in 2019-08-01 version the expires_on has been changed to a string of UNIX epoch seconds. https://github.com/Azure/azure-sdk-for-python/issues/11346

jeremymeng avatar Sep 15 '22 21:09 jeremymeng

Hi all, thank you very much for your patience on this fix. I have a PR (#23232) that I believe will resolve this issue, though I need to do manual testing to make sure it works in real environments.

xirzec avatar Sep 15 '22 22:09 xirzec

Hi again patient friends, I would like to update this thread with the news that my fix was released as part of @azure/identity 3.0.0 yesterday. 🎉

I believe this should hopefully resolve the issue, so in my optimism I am going to resolve but please let me know if you are still having issues with token expiration!

xirzec avatar Sep 20 '22 19:09 xirzec

I had that error when cached object client = new SecretClient and reused for a long time.

KEMBL avatar Dec 02 '22 14:12 KEMBL

@KEMBL Are you still noticing issues with token expiration?

KarishmaGhiya avatar Dec 02 '22 20:12 KarishmaGhiya