
[BUG] CosmosDBAccountResource.RegenerateKeyAsync throws exception on 202 status code

Open baryoloraul opened this issue 1 year ago • 1 comments

Library name and version

Azure.ResourceManager.CosmosDB 1.3.2

Describe the bug

Calling CosmosDBAccountResource.RegenerateKeyAsync can produce a response with an HTTP status code of 202, which results in a RequestFailedException being thrown. From the Azure portal, the operation appears to succeed; the SDK just incorrectly throws an exception.

2024-05-13 18:21:40.199493: Cosmos DB Account RegenerateKeyAsync exception (Account: REDACTED, Key Type: secondaryReadonly). ExceptionType: RequestFailedException, ExceptionMessage: Service request failed.
2024-05-13 18:21:40.199825: Status: 202 (Accepted)
2024-05-13 18:21:40.200032:
2024-05-13 18:21:40.200114: Content:
2024-05-13 18:21:40.200404: {"status":"Dequeued"}
2024-05-13 18:21:40.200604:
2024-05-13 18:21:40.200678: Headers:
2024-05-13 18:21:40.200852: Cache-Control: no-store, no-cache
2024-05-13 18:21:40.201027: Pragma: no-cache
2024-05-13 18:21:40.201258: Location: REDACTED
2024-05-13 18:21:40.201346: x-ms-request-id: a7d25542-9579-44be-882b-4fc4a265c5be
2024-05-13 18:21:40.201571: Strict-Transport-Security: REDACTED
2024-05-13 18:21:40.201653: x-ms-gatewayversion: REDACTED
2024-05-13 18:21:40.201835: x-ms-ratelimit-remaining-subscription-reads: REDACTED
2024-05-13 18:21:40.201910: x-ms-correlation-request-id: REDACTED
2024-05-13 18:21:40.202407: x-ms-routing-request-id: REDACTED
2024-05-13 18:21:40.202423: X-Content-Type-Options: REDACTED
2024-05-13 18:21:40.202428: X-Cache: REDACTED
2024-05-13 18:21:40.202432: X-MSEdge-Ref: REDACTED
2024-05-13 18:21:40.202435: Date: Mon, 13 May 2024 18:21:39 GMT
2024-05-13 18:21:40.202439: Content-Length: 21
2024-05-13 18:21:40.202442: Content-Type: application/json

Expected behavior

This should be treated as a long-running operation and should not result in a RequestFailedException.

Actual behavior

Per the documentation, this API can return either 200 OK or 202 Accepted. In the cases where it returns 202 Accepted, the SDK operation results in a RequestFailedException.

Reproduction Steps

Use the SDK to regenerate account keys; when the operation is processed asynchronously, this error shows up. A minimal sketch of the call pattern follows below.
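For reference, a minimal sketch of that call pattern (the identifiers are placeholders, not values from this report); the catch filter only illustrates how the 202 surfaces through RequestFailedException.Status:

using Azure;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.CosmosDB;
using Azure.ResourceManager.CosmosDB.Models;

// Sketch only: subscription, resource group, and account names are placeholders.
var armClient = new ArmClient(new DefaultAzureCredential());
var accountId = CosmosDBAccountResource.CreateResourceIdentifier(
    "<subscription-id>", "<resource-group>", "<cosmos-account-name>");
CosmosDBAccountResource account = armClient.GetCosmosDBAccountResource(accountId);

try
{
    // WaitUntil.Completed is supposed to poll the long-running operation to completion.
    await account.RegenerateKeyAsync(
        WaitUntil.Completed,
        new CosmosDBAccountRegenerateKeyContent(CosmosDBAccountKeyKind.SecondaryReadonly));
}
catch (RequestFailedException ex) when (ex.Status == 202)
{
    // The reported failure mode: the service accepted the request (202),
    // but the SDK surfaced it as an error instead of polling it to completion.
    Console.WriteLine($"202 Accepted surfaced as an exception: {ex.Message}");
}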

Environment

No response

baryoloraul avatar May 16 '24 15:05 baryoloraul

Thank you for your feedback. Tagging and routing to the team member best able to assist.

jsquire avatar May 16 '24 15:05 jsquire

@baryoloraul Could you share the code you use to invoke RegenerateKeyAsync() and the detailed error messages?

I cannot reproduce this error with Azure.ResourceManager.CosmosDB version 1.3.2 and Azure.Identity version 1.12.0-beta.2. Here is my code (REDACTED):

using System.Diagnostics.Tracing;
using Azure;
using Azure.Core.Diagnostics;
using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.CosmosDB;
using Azure.ResourceManager.CosmosDB.Models;
using Azure.ResourceManager.Resources;

using AzureEventSourceListener consoleListener = AzureEventSourceListener.CreateConsoleLogger(EventLevel.Informational);
var client = new ArmClient(new InteractiveBrowserCredential(), "{your subscription id}", new ArmClientOptions()
{
    Diagnostics = { IsLoggingContentEnabled = true }
});

ResourceGroupResource rg = client.GetDefaultSubscription().GetResourceGroup("{your resource group}");
CosmosDBAccountResource account = rg.GetCosmosDBAccount("{your cosmos account}");
await account.RegenerateKeyAsync(WaitUntil.Completed, new CosmosDBAccountRegenerateKeyContent(CosmosDBAccountKeyKind.Secondary));

Here is the key traffic from the logs:

[Informational] Azure-Core: Request [xxxxxxxx] POST https://management.azure.com/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.DocumentDB/databaseAccounts/xxxxxx/regenerateKey?api-version=2022-11-15
[Informational] Azure-Core: Response [xxxxxxx] 202 Accepted (01.6s)
......
[Informational] Azure-Core: Request [yyyyy] GET https://management.azure.com/subscriptions/xxx/providers/Microsoft.DocumentDB/locations/eastus2/operationsStatus/xxxxxxxxxxx?api-version=2022-11-15
[Informational] Azure-Core: Response [yyyyy] 200 Ok (00.6s)
......
[Informational] Azure-Core: Request [zzzzz] GET https://management.azure.com/subscriptions/xxx/resourceGroups/xxxx/providers/Microsoft.DocumentDB/databaseAccounts/xxxxxxxx/regenerateKey/operationResults/xxxxxxxxx?api-version=2022-11-15
[Informational] Azure-Core: Response [zzzzz] 200 Ok (00.6s)

As the logs show, the polling succeeds.

archerzz avatar May 20 '24 15:05 archerzz

Hi @baryoloraul. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

github-actions[bot] avatar May 22 '24 03:05 github-actions[bot]

Hi @baryoloraul, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

github-actions[bot] avatar May 29 '24 03:05 github-actions[bot]

Hi @archerzz, the error details in the bug description are all I have. Below is the code I am using to rotate the keys:

var cosmosDbAccountId = CosmosDBAccountResource.CreateResourceIdentifier(subscriptionId, resourceGroupName, cosmosDbAccountName);
var cosmosDbAccount = armClient.GetCosmosDBAccountResource(cosmosDbAccountId);
var operationResult = await cosmosDbAccount.RegenerateKeyAsync(WaitUntil.Completed, new CosmosDBAccountRegenerateKeyContent(CosmosDBAccountKeyKind.SecondaryReadonly), cancellationToken).ConfigureAwait(false);

Worth mentioning that this error does not show up every time I try to rotate the keys; I haven't found a pattern yet. Retrying the same failed operation sometimes succeeds.

I am running this on .NET 6 on CBL-Mariner 2.0.
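Not part of the original report, but as a stopgap one could wrap the call in a retry that treats this specific failure as transient. A minimal sketch, assuming RequestFailedException.Status carries the 202 and that regenerating the same key again after a spurious failure is acceptable:

using Azure;
using Azure.ResourceManager.CosmosDB;
using Azure.ResourceManager.CosmosDB.Models;

// Sketch only: retries when the SDK surfaces the 202 Accepted as a failure.
// The attempt count and delay are illustrative values, not recommendations.
async Task RegenerateKeyWithRetryAsync(CosmosDBAccountResource account, CancellationToken cancellationToken)
{
    const int maxAttempts = 3;
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            await account.RegenerateKeyAsync(
                WaitUntil.Completed,
                new CosmosDBAccountRegenerateKeyContent(CosmosDBAccountKeyKind.SecondaryReadonly),
                cancellationToken).ConfigureAwait(false);
            return;
        }
        catch (RequestFailedException ex) when (ex.Status == 202 && attempt < maxAttempts)
        {
            // The service accepted the request but the SDK reported a failure;
            // back off briefly and try again, since retries were observed to succeed.
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken).ConfigureAwait(false);
        }
    }
}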

baryoloraul avatar May 30 '24 14:05 baryoloraul

I compared our test recordings and found there could be a status problem on the backend.

  • See our test recording of regenerating keys: https://github.com/Azure/azure-sdk-assets/blob/6f2e7f0c2ec8320d22226026a9055ccc9f14749a/net/sdk/cosmosdb/Azure.ResourceManager.CosmosDB/tests/SessionRecords/DatabaseAccountTests/DatabaseAccountListKeysAndRegenerateKeysTest.json#L866 It returns "status": "Enqueued" for the 1st request.
  • But from the error diagnostics given in this issue, the initial response body is {"status":"Dequeued"}.
  • I checked the test recordings and found that normally the status should be Enqueued, then Dequeued, then Succeeded. So I'm wondering if something is wrong on the backend; that could also explain why this issue is intermittent. (A manual polling sketch follows this list for illustration.)
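For illustration only, a hypothetical manual poll of the operation-status URL (for example, the value of the Location header above) that treats Enqueued and Dequeued as in-progress states. This is not the SDK's internal LRO logic, and the in-progress/terminal status lists are assumptions:

using System.Net.Http.Headers;
using System.Text.Json;
using Azure.Core;
using Azure.Identity;

// Sketch only: polls an ARM operation-status URL until a status that is not
// one of the assumed in-progress values is returned.
async Task<string> PollOperationStatusAsync(Uri operationStatusUrl, CancellationToken ct)
{
    var credential = new DefaultAzureCredential();
    AccessToken token = await credential.GetTokenAsync(
        new TokenRequestContext(new[] { "https://management.azure.com/.default" }), ct);

    using var http = new HttpClient();
    http.DefaultRequestHeaders.Authorization =
        new AuthenticationHeaderValue("Bearer", token.Token);

    while (true)
    {
        string body = await http.GetStringAsync(operationStatusUrl, ct);
        using JsonDocument doc = JsonDocument.Parse(body);
        string status = doc.RootElement.GetProperty("status").GetString() ?? "";

        // "Enqueued" and "Dequeued" come from the recordings above; the other
        // in-progress values are included defensively as assumptions.
        if (status is not ("Enqueued" or "Dequeued" or "Running" or "InProgress"))
            return status;

        await Task.Delay(TimeSpan.FromSeconds(5), ct);
    }
}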

Hi Cosmos team, can you check x-ms-request-id: a7d25542-9579-44be-882b-4fc4a265c5be to see if there is something wrong on the backend? Thanks.

archerzz avatar May 31 '24 11:05 archerzz

@baryoloraul Is it possible for you to enable diagnostics in your code, like below?

using System.Diagnostics.Tracing;
using Azure.Core.Diagnostics;
using Azure.Identity;
using Azure.ResourceManager;

// or use another listener to collect the logs
using AzureEventSourceListener consoleListener = AzureEventSourceListener.CreateConsoleLogger(EventLevel.Informational);

var client = new ArmClient(new DefaultAzureCredential(), "xxxx", new ArmClientOptions()
{
    Diagnostics = { IsLoggingContentEnabled = true }
});

We cannot reproduce this error, and after analyzing the existing error messages there is no clue about what went wrong. Detailed logs would be greatly appreciated. Thanks.
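If console logging is not practical in your environment (for example, a service on CBL-Mariner), a possible sketch of a custom listener that writes the same events to a file; the file path and formatting are placeholders:

using System.Diagnostics.Tracing;
using Azure.Core.Diagnostics;

// Sketch only: writes Azure SDK event-source logs to a file instead of the console.
var sync = new object();
using var logFile = new StreamWriter("azure-sdk.log", append: true) { AutoFlush = true };
using var listener = new AzureEventSourceListener(
    (args, message) =>
    {
        // Serialize writes, since events can arrive from multiple threads.
        lock (sync) logFile.WriteLine($"[{args.Level}] {args.EventSource.Name}: {message}");
    },
    EventLevel.Informational);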

archerzz avatar Jun 03 '24 07:06 archerzz

Hi @baryoloraul. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

github-actions[bot] avatar Jun 03 '24 07:06 github-actions[bot]

Hi @baryoloraul, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

github-actions[bot] avatar Jun 10 '24 09:06 github-actions[bot]

Hi @baryoloraul, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

github-actions[bot] avatar Jun 17 '24 15:06 github-actions[bot]