
Delete AKS cluster failed

Open serbrech opened this issue 4 years ago • 6 comments

I first thought we had a bug in AKS for old clusters, but I could delete the cluster just fine in the portal or CLI.

I think you're just not including an api-version in the delete command for some reason.

{
  "error": {
    "code": "NoRegisteredProviderFound",
    "message": "No registered resource provider found for location 'westus2' and API version 'MISSING' for type 'managedClusters'. The supported api-versions are '2017-08-31, 2018-03-31, 2018-08-01-preview, 2019-02-01, 2019-04-01, 2019-06-01, 2019-08-01, 2019-10-01, 2019-11-01, 2020-01-01, 2020-02-01'. The supported locations are 'eastus, westeurope, francecentral, centralus, canadacentral, canadaeast, uksouth, westus, westus2, australiaeast, northeurope, japaneast, japanwest, eastus2, southcentralus, northcentralus, southeastasia, australiasoutheast, ukwest, southindia, centralindia, eastasia, koreasouth, koreacentral, southafricanorth, brazilsouth, germanynorth, switzerlandnorth, switzerlandwest, germanywestcentral, uaenorth, norwayeast, norwaywest'."
  }
}

serbrech avatar Mar 05 '20 18:03 serbrech

I tried to figure out where I could track this issue down, but I'll wait for you to see this and point me in the right direction. A drawback of a generated codebase: it's hard to figure things out ;)

serbrech avatar Mar 05 '20 22:03 serbrech

Interesting! thanks for raising this

@stuartleeks do you have a good starter for 10 on this one?

lawrencegripper avatar Mar 06 '20 10:03 lawrencegripper

Flagging stale issue. Actions will close this issue in the next 5 days unless action is taken.

github-actions[bot] avatar Apr 05 '20 11:04 github-actions[bot]

Still need to look into this one. Things are a bit hectic without childcare at the moment, so I'm behind on stuff.

lawrencegripper avatar Apr 12 '20 21:04 lawrencegripper

So looks like this isn't directly related to the auto-gen code. Have a working theory I'm looking at.

Currently the ARMClient calls the Providers API to get back all the supported API versions. It then rather simplistically builds up a map of RP->Version or, if it can't find a version, uses "MISSING" as the API version. My guess is that something went wrong while populating the provider dictionary, or a race occurred and the dictionary was still being populated when the delete was called.

https://github.com/lawrencegripper/azbrowse/blob/dd441349488cf7ba91452012781be5dccc8c9548/pkg/armclient/armclient.go#L207-L213

https://github.com/lawrencegripper/azbrowse/blob/dd441349488cf7ba91452012781be5dccc8c9548/pkg/armclient/armclient.go#L242-L252
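
To make the theory concrete, here is a minimal sketch of the lookup pattern described above. It is not the code at the linked lines: `providerType`, `buildVersionMap` and `lookupVersion` are hypothetical names, and picking the first listed version per type is an assumption about what "simplistically" means.

```go
package main

import "strings"

// providerType is a hypothetical, trimmed-down stand-in for one resource
// type entry returned by the Providers API.
type providerType struct {
	Namespace    string   // e.g. "Microsoft.ContainerService"
	ResourceType string   // e.g. "managedClusters"
	APIVersions  []string // versions as listed by the provider
}

// buildVersionMap naively records one version per resource type.
// Assumption: the first listed version is taken, with no per-location check.
func buildVersionMap(types []providerType) map[string]string {
	versions := make(map[string]string, len(types))
	for _, t := range types {
		if len(t.APIVersions) == 0 {
			continue // nothing usable for this type
		}
		key := strings.ToLower(t.Namespace + "/" + t.ResourceType)
		versions[key] = t.APIVersions[0]
	}
	return versions
}

// lookupVersion mirrors the fallback described above: an unknown type
// yields the literal "MISSING", which the caller would then send as the
// api-version on the request.
func lookupVersion(versions map[string]string, resourceType string) string {
	if v, ok := versions[strings.ToLower(resourceType)]; ok {
		return v
	}
	return "MISSING"
}
```

With this shape, calling `lookupVersion` on an empty map (for example, before the Providers call has returned) gives "MISSING", which is the api-version reported in the error above.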

A few things we could do better here:

  1. Better error handling when an API version is missing: show a status message, because returning "MISSING" is the wrong thing to do here.
  2. Check that the API Version is available in the location of the resource. It's possible that an API version may be picked which isn't available in a region as the cache is built up simplistically.
  3. Check that population of the Provider dictionary has been completed before allowing calls.
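
As a rough sketch of how the three points could fit together (an illustration, not a proposed patch): every name below (`versionCache`, `versionInfo`, `ErrNotReady`, `ErrNoAPIVersion`, `normaliseLocation`) is hypothetical, and the location normalisation assumes the provider data may use display names such as "West US 2" while resources use "westus2".

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var (
	// ErrNotReady: the provider data has not finished loading yet (point 3).
	ErrNotReady = errors.New("provider API versions not loaded yet")
	// ErrNoAPIVersion: no usable version, returned instead of "MISSING" (point 1).
	ErrNoAPIVersion = errors.New("no api-version known for resource type")
)

// versionInfo holds the chosen version and the locations it is offered in.
type versionInfo struct {
	Version   string
	Locations map[string]bool // keys normalised with normaliseLocation
}

type versionCache struct {
	ready    chan struct{} // closed once populate has finished
	versions map[string]versionInfo
}

func newVersionCache() *versionCache {
	return &versionCache{ready: make(chan struct{})}
}

// populate stores the data and marks the cache ready. Call it exactly once;
// the write to versions happens before the close, so readers that observe
// the closed channel also observe the map.
func (c *versionCache) populate(v map[string]versionInfo) {
	c.versions = v
	close(c.ready)
}

// normaliseLocation lower-cases and strips spaces ("West US 2" -> "westus2").
func normaliseLocation(s string) string {
	return strings.ToLower(strings.ReplaceAll(s, " ", ""))
}

// lookup returns an error rather than a placeholder version, so the caller
// can show a status message instead of sending a bad request.
func (c *versionCache) lookup(resourceType, location string) (string, error) {
	select {
	case <-c.ready:
	default:
		return "", ErrNotReady
	}
	info, ok := c.versions[strings.ToLower(resourceType)]
	if !ok {
		return "", fmt.Errorf("%w: %s", ErrNoAPIVersion, resourceType)
	}
	// Point 2: only use the version if it is available in this location.
	if !info.Locations[normaliseLocation(location)] {
		return "", fmt.Errorf("%w: %s in %s", ErrNoAPIVersion, resourceType, location)
	}
	return info.Version, nil
}
```

A caller would check the error before building the delete request and, on `ErrNotReady`, either retry or tell the user that provider data is still loading. Blocking until ready (with a timeout) would be an alternative to failing fast.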

lawrencegripper avatar Apr 14 '20 11:04 lawrencegripper

Had a quick test and with the current version I can delete an AKS cluster successfully. This likely points to the race condition: the provider call takes too long, so the version lookup fails.

lawrencegripper avatar Apr 14 '20 11:04 lawrencegripper