terraform-provider-azurerm
`orchestrator_version` for `azurerm_kubernetes_cluster_node_pool` is empty.
Is there an existing issue for this?
- [X] I have searched the existing issues
Community Note
- Please vote on this issue by adding a :thumbsup: reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Terraform Version
1.2.4
AzureRM Provider Version
3.16.0
Affected Resource(s)/Data Source(s)
azurerm_kubernetes_cluster_node_pool
Terraform Configuration Files
resource "azurerm_kubernetes_cluster" "aks-cluster" {
...
default_node_pool {
name = "ap${replace(lower(var.CLUSTER_NAME), "/[^a-z0-9]/", "")}"
node_count = var.NODE_COUNT
vm_size = var.NODE_SIZE
enable_auto_scaling = false
orchestrator_version = var.K8S_VERSION # <--- this is affected!
vnet_subnet_id = azurerm_subnet.subnet.id
tags = var.TAGS
}
...
}
resource "azurerm_kubernetes_cluster_node_pool" "additional_pool" {
for_each = var.ADDITIONAL_POOLS
...
name = each.value.name
vm_size = each.value.nodeSize
node_count = each.value.nodeCount
node_labels = each.value.nodeLabels
os_type = "Linux"
orchestrator_version = var.K8S_VERSION # <--- this is affected!
tags = var.TAGS
...
}
Debug Output/Panic Output
`terraform plan` shows 2 changes for the given configuration above.
Expected Behaviour
When I import an `azurerm_kubernetes_cluster_node_pool` or an `azurerm_kubernetes_cluster` (the default node pool is also affected), `orchestrator_version` should be set to the Kubernetes version currently in use.
Actual Behaviour
When I import an `azurerm_kubernetes_cluster_node_pool` or an `azurerm_kubernetes_cluster` (the default node pool is also affected), the `orchestrator_version` (Kubernetes version) in the state file is empty.
Steps to Reproduce
- Run `terraform import "..azurerm_kubernetes_cluster_node_pool.mypool" /ID..../`
- Check `orchestrator_version` in the state file: `"orchestrator_version" = ""`
Important Factoids
Germany
References
We already set the `orchestrator_version` of the node pools in our previous Terraform configuration. Since I upgraded the azurerm provider to version 3, Terraform plans a change for this attribute because it is empty in the state file. I tried adding the version number to the state file manually, but Terraform's refresh removes it again, so the plan stays the same.
I updated the azurerm provider from 3.11.0 to 3.16.0, and Terraform wants to add the `orchestrator_version` to the `default_node_pool`. Running `terraform show` beforehand displays the `orchestrator_version`, and it is the same value that Terraform wants to add. Strange...
Hitting the same issue: when upgrading to azurerm 3.15.1, the plan shows that it wants to remove the value and then add it again (which is not allowed). I initially thought the Azure API might be returning null for `orchestrator_version` when the pool is Spot, but the JSON view on the portal still shows the correct value. The older azurerm 3.6.0 still works, and going by the comment above, 3.11.0 should also be fine.
```text
Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the
last "terraform apply" which may have affected this plan:

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spot"] has changed
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spot"
      - orchestrator_version = "1.21.9" -> null
        tags                 = {}
        # (25 unchanged attributes hidden)
        # (1 unchanged block hidden)
    }

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spotalt"] has changed
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spotalt"
      - orchestrator_version = "1.21.9" -> null
        tags                 = {}
        # (25 unchanged attributes hidden)
    }

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spotcompute"] has changed
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spotcompute"
      - orchestrator_version = "1.21.9" -> null
        tags                 = {}
        # (25 unchanged attributes hidden)
    }

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spotmemory"] has changed
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spotmemory"
      - orchestrator_version = "1.21.9" -> null
        tags                 = {}
        # (25 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the
relevant attributes using ignore_changes, the following plan may include
actions to undo or respond to these changes.

─────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spot"] will be updated in-place
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spot"
      + orchestrator_version = "1.21.9"
        tags                 = {}
        # (25 unchanged attributes hidden)
        # (1 unchanged block hidden)
    }

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spotalt"] will be updated in-place
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spotalt"
      + orchestrator_version = "1.21.9"
        tags                 = {}
        # (25 unchanged attributes hidden)
    }

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spotcompute"] will be updated in-place
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spotcompute"
      + orchestrator_version = "1.21.9"
        tags                 = {}
        # (25 unchanged attributes hidden)
    }

  # module.node-pools.azurerm_kubernetes_cluster_node_pool.spot["spotmemory"] will be updated in-place
  ~ resource "azurerm_kubernetes_cluster_node_pool" "spot" {
        id                   = "<redacted>"
        name                 = "spotmemory"
      + orchestrator_version = "1.21.9"
        tags                 = {}
        # (25 unchanged attributes hidden)
    }

Plan: 0 to add, 4 to change, 0 to destroy.
```
```text
Terraform v1.2.6
on linux_amd64
+ provider registry.terraform.io/hashicorp/azurerm v3.15.1
+ provider registry.terraform.io/hashicorp/random v3.3.2
+ provider registry.terraform.io/hashicorp/tls v4.0.1
```
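Terraform's "Objects have changed outside of Terraform" note above already hints at `ignore_changes`. As a possible stop-gap until a fixed provider version is in use (a minimal sketch with illustrative names and values, not something applied in this thread), drift on this one attribute could be suppressed:

```hcl
resource "azurerm_kubernetes_cluster_node_pool" "spot" {
  name                  = "spot"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.example.id # illustrative reference
  vm_size               = "Standard_DS2_v2"
  node_count            = 1

  lifecycle {
    # Stop-gap only: hides the spurious "remove then re-add" diff on
    # orchestrator_version caused by the API omitting the field.
    # Remove once a fixed provider version is in use.
    ignore_changes = [orchestrator_version]
  }
}
```

The trade-off is that legitimate version changes are also ignored while the block is in place.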
Did some additional testing: the breaking change was introduced in azurerm 3.12.0; 3.11.0 is the last version that works correctly.
I was validating whether azurerm 3.18 still had the issue, and I can no longer reproduce it in any version. The only change since then is that AKS was upgraded to 1.23.8.
I believe this is a peculiarity of the Azure API, not of the provider.
Troubleshooting
In https://github.com/hashicorp/terraform-provider-azurerm/pull/17084, I wanted to introduce support for version aliases (which let us omit the patch version). Since they were supported only in a newer Azure API, I migrated the provider from `2022-01-02-preview` to `2022-03-02-preview`:
- In `2022-01-02-preview`, only `orchestratorVersion` is returned.
- In `2022-03-02-preview`, both `orchestratorVersion` and `currentOrchestratorVersion` are present.
The fields are described in detail here: https://docs.microsoft.com/en-us/rest/api/aks/managed-clusters/create-or-update?tabs=HTTP.

Basically, `orchestratorVersion` matches the version you supply through an API call (it can be x.y or x.y.z), whereas `currentOrchestratorVersion` is always the actual version running in the cluster (x.y.z).
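To make the distinction concrete, here is a minimal sketch with illustrative names and values (my example, not from this thread): the configured value is sent as `orchestratorVersion`, while AKS reports the resolved patch release as `currentOrchestratorVersion`.

```hcl
resource "azurerm_kubernetes_cluster_node_pool" "example" {
  name                  = "example"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.example.id # illustrative reference
  vm_size               = "Standard_DS2_v2"
  node_count            = 1

  # Sent to the API as orchestratorVersion. With alias support, "1.23"
  # (x.y) is accepted; AKS resolves it to a concrete patch release and
  # reports that as currentOrchestratorVersion (e.g. "1.23.5").
  orchestrator_version = "1.23"
}
```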
After I saw your issue and #17518, I got curious about what goes wrong here, so I ran some tests with different provider versions with debug logging enabled.
When you run Terraform commands against the same API version that was used during node pool creation, everything is fine.
When you create a node pool against `2022-01-02-preview` (provider < 3.12.0), further GET calls return this (irrelevant fields omitted for brevity):
```text
2022-08-25T18:35:35.220+0200 [DEBUG] provider.terraform-provider-azurerm_v3.11.0_x5: AzureRM Request:
GET /subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster?api-version=2022-01-02-preview HTTP/1.1
Host: management.azure.com

2022-08-25T18:35:35.643+0200 [DEBUG] provider.terraform-provider-azurerm_v3.11.0_x5: AzureRM Response for https://management.azure.com/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster?api-version=2022-01-02-preview:
{
  "orchestratorVersion": "1.23.5",
}: timestamp=2022-08-25T18:35:35.642+0200
```
Then, when you do a GET call against `2022-03-02-preview` (provider >= 3.12.0), it returns this:
```text
2022-08-25T18:48:20.258+0200 [DEBUG] provider.terraform-provider-azurerm_v3.12.0_x5: AzureRM Request:
GET /subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster?api-version=2022-03-02-preview HTTP/1.1

2022-08-25T18:48:20.732+0200 [DEBUG] provider.terraform-provider-azurerm_v3.12.0_x5: AzureRM Response for https://management.azure.com/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster?api-version=2022-03-02-preview:
{
  "currentOrchestratorVersion": "1.23.5",
}: timestamp=2022-08-25T18:48:20.732+0200
```
As you can see, `orchestratorVersion` is absent. That is why `terraform plan`/`apply` shows that it wants to configure `orchestrator_version`:
```text
  # module.aks.module.aks_cluster.azurerm_kubernetes_cluster.main will be updated in-place
  ~ resource "azurerm_kubernetes_cluster" "main" {
        id   = "/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster"
        name = "test-cluster"
        tags = {
            "myTag" = "myValue"
        }
        # (25 unchanged attributes hidden)

      ~ default_node_pool {
            name                 = "system"
          + orchestrator_version = "1.23.5"
            tags                 = {
                "myTag" = "myValue"
            }
            # (19 unchanged attributes hidden)
        }

        # (5 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
```
Now, if you run `terraform apply`, it will make a PUT call against the new API:
```text
2022-08-25T18:48:38.983+0200 [DEBUG] provider.terraform-provider-azurerm_v3.12.0_x5: AzureRM Request:
PUT /subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster/agentPools/system?api-version=2022-03-02-preview HTTP/1.1

{"properties":{"availabilityZones":["1"],"count":1,"enableAutoScaling":false,"enableFIPS":false,"enableNodePublicIP":false,"kubeletDiskType":"OS","maxPods":110,"mode":"System","nodeLabels":{},"nodeTaints":[],"orchestratorVersion":"1.23.5","osDiskSizeGB":86,"osDiskType":"Ephemeral","osType":"Linux","tags":{"myTag":"myValue"},"type":"VirtualMachineScaleSets","upgradeSettings":{},"vmSize":"Standard_DS2_v2"}}: timestamp=2022-08-25T18:48:38.983+0200
```
And the response will be:
```text
2022-08-25T18:48:40.494+0200 [DEBUG] provider.terraform-provider-azurerm_v3.12.0_x5: AzureRM Response for https://management.azure.com/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster/agentPools/system?api-version=2022-03-02-preview:
{
  "orchestratorVersion": "1.23.5",
  "currentOrchestratorVersion": "1.23.5",
}: timestamp=2022-08-25T18:48:40.494+0200
```
After that, the Azure API starts returning both fields:
```text
2022-08-25T18:49:10.650+0200 [DEBUG] provider.terraform-provider-azurerm_v3.12.0_x5: AzureRM Request:
GET /subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster?api-version=2022-03-02-preview HTTP/1.1

2022-08-25T18:49:11.016+0200 [DEBUG] provider.terraform-provider-azurerm_v3.12.0_x5: AzureRM Response for https://management.azure.com/subscriptions/64842ced-4781-416f-81ff-482b7f562581/resourceGroups/aks-rg/providers/Microsoft.ContainerService/managedClusters/test-cluster?api-version=2022-03-02-preview:
HTTP/2.0 200 OK
{
  "orchestratorVersion": "1.23.5",
  "currentOrchestratorVersion": "1.23.5",
}: timestamp=2022-08-25T18:49:11.015+0200
```
Workaround 1
- `terraform apply`:
  - With non-Spot node pools, AKS accepts the change, does nothing, the Terraform state gets updated, and everything is fine.
  - Spot node pools are trickier. Historically, they were not allowed to be upgraded, and this was enforced not only in the Azure API but also in the Terraform code. Due to the latter, I believe Terraform will not let you apply the change. Upgrades have been supported on the Azure side since June, so I prepared a tiny patch that lifts the restriction: https://github.com/hashicorp/terraform-provider-azurerm/pull/18124. Without the patch, you will have to destroy the Spot node pool first, upgrade the provider version, and then create the pool again (a sketch of such a pool follows this list).
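For reference, a Spot node pool in this provider is the one declared with `priority = "Spot"`; a minimal sketch with illustrative names and values (my example, not from this thread):

```hcl
resource "azurerm_kubernetes_cluster_node_pool" "spot" {
  name                  = "spot"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.example.id # illustrative reference
  vm_size               = "Standard_DS2_v2"
  node_count            = 1

  priority        = "Spot"   # marks this pool as a Spot pool
  eviction_policy = "Delete" # only settable when priority is "Spot"
  spot_max_price  = -1       # -1 = pay up to the current on-demand price

  orchestrator_version = "1.21.9" # the attribute affected by this issue
}
```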
Workaround 2
UPD: this PR https://github.com/hashicorp/terraform-provider-azurerm/pull/18130 will make the provider fall back to `currentOrchestratorVersion` if `orchestratorVersion` is missing.
I am also seeing this on AzureRM 3.23.0. When I upgrade my AKS cluster in my Terraform, I see the same behavior.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.