azure-cli
Application Gateway - CanceledAndSupersededDueToAnotherOperation issue
We have an Application Gateway that is shared by several teams. We have set up several CI/CD pipelines (using the az CLI) that deploy our resources and create our entrypoints for the Application Gateway.
We have started to notice that when we run several commands against the gateway (at the same time), we get the following error:
ERROR: (Canceled) Operation was canceled.
Code: Canceled
Message: Operation was canceled.
Exception Details: (CanceledAndSupersededDueToAnotherOperation) Operation PutApplicationGatewayOperation (6cc97f54-b83d-46cd-83eb-be3b71a87b39) was canceled and superseded by operation PutApplicationGatewayOperation (f1138d5e-3e2c-4592-b178-b9655c5e1033).
Code: CanceledAndSupersededDueToAnotherOperation
Message: Operation PutApplicationGatewayOperation (6cc97f54-b83d-46cd-83eb-be3b71a87b39) was canceled and superseded by operation PutApplicationGatewayOperation (f1138d5e-3e2c-4592-b178-b9655c5e1033).
Returncode: 1
We've tested this, and the first request does indeed get cancelled, which results in a failed pipeline and no entrypoint being created for the Application Gateway. This keeps us from running our pipelines at the same time, since doing so often ends in an error. We've been running these pipelines for the last few years, but haven't noticed this behaviour before.
The az cli commands that we're using are all part of the az network application-gateway endpoints, for example:
- az network application-gateway address-pool create
- az network application-gateway probe create
I would expect that I would be able to add/update several listeners/backend pools/etc for the gateway at the same time, without the first operation being cancelled.
So the question we have is:
Did something change with these calls? Is this the expected behaviour? And could someone explain why this is/should be the case?
I faced the same issue. In my case, I am using Terraform with a null_resource:
resource "null_resource" "external_appgw_assign_identity" {
  count = var.kv_certificates_integration ? 1 : 0

  # A new external application gateway requires re-provisioning
  triggers = {
    external_application_gatewy_id = azurerm_application_gateway.external_appgw.id
  }

  provisioner "local-exec" {
    command = "az network application-gateway identity assign --gateway-name ${local.external_gateway_name} --resource-group ${local.external_gateway_rg_name} --identity ${local.external_agic_identity_id} --subscription ${data.azurerm_subscription.primary.subscription_id}"
  }
}
But it is odd, because I have deployed my Terraform module multiple times and today is the first time that I have hit this issue...
Then I deployed an AKS cluster from scratch with my Terraform module, and the deployment completed without errors.
Should I insert some kind of delay (like time_sleep) to avoid the issue?
Thank you for your feedback. This has been routed to the support team for assistance.
It seems that the service side has made some adjustments to how it handles concurrent operations against Application Gateway. Thanks.
@necusjz Thank you for your comment.
Please advise how to handle these changes. I cannot find a specific locking mechanism to lock the Application Gateway when another operation is being handled (as for example, creating a listener).
Hope to hear from you!
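In the meantime, one client-side option is to serialize the az calls yourself. The following is only a minimal sketch: it assumes all pipelines run on the same agent or can see the same filesystem path, and the `with_lock` helper and the lock path are hypothetical names, not part of the Azure CLI. It does nothing for pipelines running on separate machines.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Azure CLI): run a command while holding
# a lock directory, so that only one caller at a time touches the gateway.
# Assumption: every pipeline can see the same filesystem path.

# Usage: with_lock LOCK_DIR command [args...]
with_lock() {
    lock_dir=$1
    shift
    # mkdir is atomic: only one caller can create the directory at a time.
    until mkdir "$lock_dir" 2>/dev/null; do
        sleep "${LOCK_POLL_SECONDS:-5}"
    done
    rc=0
    "$@" || rc=$?
    # Release the lock whether or not the command succeeded.
    rmdir "$lock_dir"
    return "$rc"
}

# Example (placeholders as used elsewhere in this thread):
# with_lock /tmp/appgw.lock az network application-gateway address-pool create -g {} --gateway-name {} -n {}
```

Note that if the script is killed while the command is running, the lock directory is left behind and has to be removed by hand.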
@LotteVoorhorst Let's wait for the feedback from the service side. : )
From the CLI side, we can append the --no-wait argument, like:
az network application-gateway probe create -n {} -g {} --gateway-name {} --path {} --protocol {} --host {} --no-wait
@necusjz Can you confirm that this workaround helps?
@ddran-averydennison FYI: I have simulated the scenario of creating probes concurrently on my local machine and it worked.
Hi, we're sending this friendly reminder because we haven't heard back from you in a while. We need more information about this issue to help address it. Please be sure to give us your input within the next 7 days. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!
I've tried the --no-wait option, but unfortunately I'm getting the following error when concurrently creating domain entrypoints:
Invoke-Executable az network application-gateway rule create --gateway-name {gateway-name} --name {name} --resource-group {resourcegroup} --http-listener {name} --rule-type Basic --redirect-config {redirect-config} --no-wait
ERROR: (RetryableError) A retryable error occurred.
Code: RetryableError
Message: A retryable error occurred.
Exception Details: (RetryableErrorDueToAnotherOperation) Operation PutApplicationGatewayOperation (8ce638ea-c314-4d26-a304-d9903d49e5cd) is updating resource {resource-id}. The call can be retried in 12 seconds.
Code: RetryableErrorDueToAnotherOperation
Message: Operation PutApplicationGatewayOperation (8ce638ea-c314-4d26-a304-d9903d49e5cd) is updating resource {resource-id}. The call can be retried in 12 seconds.
Returncode: 1
It's a different error, but still the same problem. This should not happen when using the 'no-wait' option, correct?
Please advise how to handle.
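Since the error message itself says the call can be retried after a number of seconds, one possible stopgap is a retry wrapper around the az command. This is only a sketch: `retry_on_busy`, `parse_retry_after` and `DEFAULT_RETRY_DELAY` are hypothetical names, and it assumes the error text keeps the "can be retried in N seconds" wording shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the Azure CLI): retry a command when its
# error output contains RetryableErrorDueToAnotherOperation, waiting for the
# number of seconds suggested in the message.

# Print the suggested delay from an error message; print nothing if absent.
parse_retry_after() {
    printf '%s\n' "$1" |
        sed -n 's/.*can be retried in \([0-9][0-9]*\) seconds.*/\1/p' |
        head -n 1
}

# Usage: retry_on_busy MAX_ATTEMPTS command [args...]
retry_on_busy() {
    max_attempts=$1
    shift
    attempt=1
    while :; do
        # Capture stdout and stderr together so the error text can be inspected.
        if output=$("$@" 2>&1); then
            printf '%s\n' "$output"
            return 0
        fi
        case $output in
            *RetryableErrorDueToAnotherOperation*) ;;      # worth retrying
            *) printf '%s\n' "$output" >&2; return 1 ;;    # any other error
        esac
        if [ "$attempt" -ge "$max_attempts" ]; then
            printf '%s\n' "$output" >&2
            return 1
        fi
        delay=$(parse_retry_after "$output")
        # Fall back to a fixed delay when the message carries no hint.
        sleep "${delay:-${DEFAULT_RETRY_DELAY:-15}}"
        attempt=$((attempt + 1))
    done
}

# Example (placeholders as in the command above):
# retry_on_busy 5 az network application-gateway rule create --gateway-name {gateway-name} --name {name} --resource-group {resourcegroup} --http-listener {name} --rule-type Basic --redirect-config {redirect-config}
```

The wrapper only handles the retryable error. A call that was already accepted and is later cancelled with CanceledAndSupersededDueToAnotherOperation is not retried here.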
@HemantErappa, it seems that the service side has made some adjustments to how it handles concurrent operations against Application Gateway. Could you give us a hand? Thanks!
Any updates on this issue?
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @aznetsuppgithub.
Issue Details
| Author: | LotteVoorhorst |
|---|---|
| Assignees: | necusjz, kairu-ms |
| Labels: | |
| Milestone: | Backlog |
@yonzhan @necusjz @HemantErappa - Do you have any update on when this may be looked into on the service side at all?
I am also seeing the same issue. Has anyone been able to resolve this?
Updates:
I have contacted the service team. "CanceledAndSupersededDueToAnotherOperation" means that if a customer performs multiple PUT operations in quick succession on the same appgw resource, the previous one gets cancelled and superseded by the new one.
So, it's expected and by design. As a workaround, we can add some sleeps so that the previous PUT call completes before the next one is sent.
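Rather than sleeping for a fixed time, a script could poll the gateway until its previous update has finished. The following is a minimal sketch of that idea: `wait_for_gateway` and the `AZ` override variable are hypothetical names, and it assumes that a provisioningState of Succeeded (as reported by az network application-gateway show) means the previous PUT has completed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Azure CLI): block until the gateway
# reports provisioningState "Succeeded", or give up after a timeout.

# AZ can be overridden (for example in tests); it defaults to the real CLI.
AZ=${AZ:-az}

# Usage: wait_for_gateway RESOURCE_GROUP GATEWAY_NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_gateway() {
    rg=$1
    gw=$2
    timeout=${3:-600}
    interval=${4:-10}
    waited=0
    while :; do
        # Treat a failed query as "not ready yet" rather than aborting.
        state=$($AZ network application-gateway show \
            --resource-group "$rg" --name "$gw" \
            --query provisioningState --output tsv 2>/dev/null) || state=""
        if [ "$state" = "Succeeded" ]; then
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Example (placeholders as used elsewhere in this thread):
# wait_for_gateway {resourcegroup} {gateway-name} && az network application-gateway probe create -n {} -g {} --gateway-name {} --path {} --protocol {}
```

This narrows the window but does not close it: two independent pipelines can both see Succeeded and then both send a PUT, so it works best within a single pipeline that issues its commands one after another.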
I just came across this. I still believe there is an issue. I am trying to create an application gateway using the cli and it runs for about 2 minutes and then gets cancelled. I am running it manually and there are no other operations being sent. When looking at the deployment details in the resource group I find I get the same error posted above. This was a single create request that generated this error.
"statusMessage": "{\"status\":\"Canceled\",\"error\":{\"code\":\"ResourceDeploymentFailure\",\"message\":\"The resource operation completed with terminal provisioning state 'Canceled'.\",\"details\":[{\"code\":\"Canceled\",\"message\":\"Operation was canceled.\",\"details\":[{\"code\":\"CanceledAndSupersededDueToAnotherOperation\",\"message\":\"Operation PutApplicationGatewayOperation (c274d748-3495-4a91-b6d7-0b9d76fc4102) was canceled and superseded by operation PatchApplicationGatewayOperation (236ea644-bb1c-43b8-b915-cd6beb13a31b).\"}]}]}}",
I'm also still experiencing this. We're using Terraform and not doing anything fancy at all. It appears to happen during a destroy operation, and it blocks the destroy.
Why are these operations not just queued? That seems like the logical way to handle it, rather than cancelling.