terraform-provider-vcd
Demonstrate how a TKG cluster can be upgraded using TF 3.12.0
@adambarreiro
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Description
It would be nice to have an example of how a TKG cluster upgrade from TKG v2.2.0 to TKG v2.4.0 works.
I am reading the current documentation and this bit is particularly confusing:
"Upgrading CSE version with cse_version is not supported as this operation would require human intervention, as stated in the official documentation."
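Does it mean that you can upgrade the TKG cluster using "supported_upgrades" while keeping the Kubernetes components at the version currently installed in the TKG cluster?
But if you want to have the latest K8s components, you must first upgrade the components (https://docs.vmware.com/en/VMware-Cloud-Director-Container-Service-Extension/4.1/VMware-Cloud-Director-Container-Service-Extension-Using-Tenant-4.1/GUID-092C40B4-D0BA-4B90-813F-D36929F2F395.html) and then use supported_upgrades?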
Would be great to see how the upgrade works in practice. Thanks
New or Affected Resource(s)
vcd_cse_kubernetes_cluster
Terraform Configuration (if it applies)
NA
References
https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/cse_kubernetes_cluster#updating
Hi @returntrip,
Does it mean that you can upgrade the TKG cluster using "supported_upgrades" while keeping the Kubernetes Components to the version currently installed in the TKG cluster?
Yes. The OVA names that appear there can be used to upgrade the cluster.
But if you want to have the latest K8s components, you must first upgrade the components and then use supported_upgrades?
Also yes. The components mentioned in that documentation page (CAPVCD, CSI, CPI versions) are calculated during the cluster creation phase with the cse_version argument (as these are obtained from the CSE configuration). At the moment, cse_version cannot be changed (for example, from 4.1.0 to 4.2.0) as it requires this manual step.
In other words, these components only need to be updated if you created the cluster with CSE 4.1.X and you recently updated to 4.2.X. So, if you are not upgrading CSE in VCD, but only your cluster, you can use supported_upgrades for that and no other step is required.
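As a rough illustration of the OVA-only path described above, the following is a minimal sketch rather than an official example. It assumes the configuration already contains a cluster resource named `my_cluster` and a catalog data source named `tkg_catalog`; both names, the `upgrade_ova` label, and the placeholder OVA name are assumptions to replace with your own values.

```hcl
# Sketch only: assumes vcd_cse_kubernetes_cluster.my_cluster and
# data.vcd_catalog.tkg_catalog are already defined elsewhere.

# 1. Expose the OVA names the cluster reports as valid upgrade targets.
output "supported_upgrades" {
  value = vcd_cse_kubernetes_cluster.my_cluster.supported_upgrades
}

# 2. Look up one of those OVAs by name (the name below is a placeholder).
data "vcd_catalog_vapp_template" "upgrade_ova" {
  org        = data.vcd_catalog.tkg_catalog.org
  catalog_id = data.vcd_catalog.tkg_catalog.id
  name       = "<one of the names listed in supported_upgrades>"
}

# 3. In the existing cluster resource, change only the template reference:
#
#      kubernetes_template_id = data.vcd_catalog_vapp_template.upgrade_ova.id
#
#    and leave cse_version exactly as it was at creation time.
```

Running `terraform plan` after step 3 should then show `kubernetes_template_id` as the only in-place change on the cluster.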
Thanks for the quick reply. It would be great to see an example of the end-to-end process to upgrade a TKG cluster and its K8s components from, say, CSE 4.1.1 to CSE 4.2.0. It would be especially good to see a TF example code template.
I am trying to upgrade from 'Ubuntu 20.04 and Kubernetes v1.26.8+vmware.1' to 'Ubuntu 20.04 and Kubernetes v1.27.5+vmware.1' which are both TKG 2.4.0. But I get this error:
│ Error: Kubernetes cluster update failed: cannot perform an OVA change as the new one 'Ubuntu 20.04 and Kubernetes v1.27.5+vmware.1' has an older TKG/Kubernetes version (v2.4.0/v1.27.5+vmware.1)
│
│ with vcd_cse_kubernetes_cluster.my_cluster,
│ on main.tf line 73, in resource "vcd_cse_kubernetes_cluster" "my_cluster":
│ 73: resource "vcd_cse_kubernetes_cluster" "my_cluster" {
According to the GUI upgrade dialog, this upgrade path is possible (and logically it should be, as K8s v1.27.5 is newer than v1.26.8). Am I doing something wrong?
Hi @returntrip, thanks for spotting that. It is, unfortunately, a bug. I'll start preparing and testing a fix. Could you please file a separate Issue in this repo for it? I will reference it in the fix.
Thank you very much for the feedback.
As for the upgrade from CSE 4.1.1 to CSE 4.2.0, I'm working on that.
Update: What I wanted to do is basically unsupported :). I was trying to jump two minor versions (i.e. 1.26 to 1.28). The relevant point is that the error should say something like: "Unsupported version. K8s version is too new".
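To make the "two minor versions" point concrete, here is a small sketch that computes the minor-version jump from two OVA names before attempting an upgrade. This is not the provider's own validation logic; the local names and the assumption that OVA names embed the Kubernetes version as `vX.Y.Z` are illustrative only.

```hcl
# Sketch only: a pre-flight check written with plain Terraform functions.
locals {
  current_ova = "Ubuntu 20.04 and Kubernetes v1.26.8+vmware.1"
  target_ova  = "Ubuntu 22.04 and Kubernetes v1.28.4+vmware.1"

  # regex() with unnamed capture groups returns a list of captured strings:
  # [major, minor, patch]
  current_k8s = regex("v(\\d+)\\.(\\d+)\\.(\\d+)", local.current_ova)
  target_k8s  = regex("v(\\d+)\\.(\\d+)\\.(\\d+)", local.target_ova)

  # Difference between the minor components (index 1 of each list).
  minor_jump = tonumber(local.target_k8s[1]) - tonumber(local.current_k8s[1])
}

output "minor_jump" {
  value = local.minor_jump
}

output "skips_a_minor_version" {
  # true when the target is more than one minor version ahead
  value = local.minor_jump > 1
}
```

With the two names above, `minor_jump` evaluates to 2 and `skips_a_minor_version` to true, which matches the unsupported 1.26 to 1.28 attempt.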
Hi @adambarreiro, I have done some quick testing (using TFV 3.12.1) for this:
- Pre-upgrade K8s version:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
node-pool-1-7df854c59x66c5z-gg62n Ready <none> 18d v1.26.8+vmware.1
testtf-control-plane-node-pool-x8zp8 Ready control-plane 18d v1.26.8+vmware.1
- Pre-upgrade component versions:
CAPVCD Version: v1.2.0
Projector Version: 0.7.0
Cluster Resource Set Bindings: csi-controller-crs-cm, csi-node-crs-cm, csi-driver-crs-cm, cloud-director-crs-cm
CPI (Cloud Provider Interface): cloud-controller-manager 1.5.0
CSI (Container Storage Interface): cloud-director-named-disk-csi-driver 1.5.0
- Upgraded the components following the CSE documentation. New versions:
CAPVCD Version: v1.3.0
Projector Version: 0.7.0
Cluster Resource Set Bindings: csi-controller-crs-cm, csi-node-crs-cm, csi-driver-crs-cm, cloud-director-crs-cm
CPI (Cloud Provider Interface): cloud-controller-manager 1.6.0
CSI (Container Storage Interface): cloud-director-named-disk-csi-driver 1.6.0
- Modified the Terraform configuration to: a) upgrade to TKG 2.4.0 k8s v1.28.4:
# Fetch a valid Kubernetes template OVA. If it's not valid, cluster creation will fail.
data "vcd_catalog_vapp_template" "tkg_ova" {
  org        = data.vcd_catalog.tkg_catalog.org
  catalog_id = data.vcd_catalog.tkg_catalog.id
  name       = "Ubuntu 22.04 and Kubernetes v1.28.4+vmware.1"
}
b) kept the CSE version at 4.2.0 (this cannot be changed as per the TFV docs, and even if I tried, the cluster would be destroyed)
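Since an accidental change to the CSE version would, going by the comment above, plan a destroy of the cluster, one possible safeguard is Terraform's generic `prevent_destroy` lifecycle setting. The fragment below is a sketch, not something the provider documentation prescribes; the resource label `my_cluster` is assumed and all other arguments are elided.

```hcl
resource "vcd_cse_kubernetes_cluster" "my_cluster" {
  # ... existing arguments stay unchanged (cse_version, name,
  # kubernetes_template_id, and so on) ...

  lifecycle {
    # Makes Terraform reject any plan that would destroy this resource,
    # for example one triggered by editing an argument that forces replacement.
    prevent_destroy = true
  }
}
```

With this in place, a plan that needs to replace the cluster fails with an error instead of proceeding, which gives a chance to revert the offending change.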
- terraform plan looks good:
data.vcd_org_vdc.vdc: Reading...
data.vcd_vm_sizing_policy.tkg_small: Reading...
data.vcd_catalog.tkg_catalog: Reading...
vcd_api_token.token: Refreshing state... [id=urn:vcloud:token:ce0a1d1c-db68-4af2-8b2b-5631a1bd9eb0]
data.vcd_vm_sizing_policy.tkg_small: Read complete after 0s [id=urn:vcloud:vdcComputePolicy:b7a2da5b-fb02-499d-b7c4-716b38a74655]
data.vcd_catalog.tkg_catalog: Read complete after 2s [id=urn:vcloud:catalog:65883467-0411-499b-bb2b-1b3da10c0264]
data.vcd_catalog_vapp_template.tkg_ova: Reading...
data.vcd_org_vdc.vdc: Read complete after 3s [id=urn:vcloud:vdc:95ba0bf5-3cc3-4bc9-89d8-7afa98f0ceaa]
data.vcd_nsxt_edgegateway.egw: Reading...
data.vcd_storage_profile.sp: Reading...
data.vcd_catalog_vapp_template.tkg_ova: Read complete after 2s [id=urn:vcloud:vapptemplate:a953b693-9948-49c2-9d4e-304f6e26f715]
data.vcd_storage_profile.sp: Read complete after 1s [id=urn:vcloud:vdcstorageProfile:eac68eeb-548e-409a-a85b-9ddb98c944be]
data.vcd_nsxt_edgegateway.egw: Read complete after 1s [id=urn:vcloud:gateway:d395f288-1dfa-419b-ab97-b0c546d61586]
data.vcd_network_routed_v2.routed: Reading...
data.vcd_network_routed_v2.routed: Read complete after 1s [id=urn:vcloud:network:2ef56ce0-0ea9-4078-ba7e-0a6dab9afd3e]
vcd_cse_kubernetes_cluster.my_cluster: Refreshing state... [id=urn:vcloud:entity:vmware:capvcdCluster:40440734-4c4a-42f3-a1a0-8538bd0e3bdf]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# vcd_cse_kubernetes_cluster.my_cluster will be updated in-place
~ resource "vcd_cse_kubernetes_cluster" "my_cluster" {
id = "urn:vcloud:entity:vmware:capvcdCluster:40440734-4c4a-42f3-a1a0-8538bd0e3bdf"
~ kubernetes_template_id = "urn:vcloud:vapptemplate:d221ce24-1f2b-4a92-aef0-245d8606d7e7" -> "urn:vcloud:vapptemplate:a953b693-9948-49c2-9d4e-304f6e26f715"
name = "testtf"
# (21 unchanged attributes hidden)
# (3 unchanged blocks hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
- terraform apply: no joy, it fails with "cannot perform an OVA change as the new one 'Ubuntu 22.04 and Kubernetes v1.28.4+vmware.1' has an older TKG/Kubernetes version (v2.5.0/v1.28.4+vmware.1)":
data.vcd_catalog.tkg_catalog: Reading...
data.vcd_vm_sizing_policy.tkg_small: Reading...
data.vcd_org_vdc.vdc: Reading...
vcd_api_token.token: Refreshing state... [id=urn:vcloud:token:ce0a1d1c-db68-4af2-8b2b-5631a1bd9eb0]
data.vcd_vm_sizing_policy.tkg_small: Read complete after 0s [id=urn:vcloud:vdcComputePolicy:b7a2da5b-fb02-499d-b7c4-716b38a74655]
data.vcd_catalog.tkg_catalog: Read complete after 1s [id=urn:vcloud:catalog:65883467-0411-499b-bb2b-1b3da10c0264]
data.vcd_catalog_vapp_template.tkg_ova: Reading...
data.vcd_org_vdc.vdc: Read complete after 3s [id=urn:vcloud:vdc:95ba0bf5-3cc3-4bc9-89d8-7afa98f0ceaa]
data.vcd_nsxt_edgegateway.egw: Reading...
data.vcd_storage_profile.sp: Reading...
data.vcd_catalog_vapp_template.tkg_ova: Read complete after 2s [id=urn:vcloud:vapptemplate:a953b693-9948-49c2-9d4e-304f6e26f715]
data.vcd_storage_profile.sp: Read complete after 1s [id=urn:vcloud:vdcstorageProfile:eac68eeb-548e-409a-a85b-9ddb98c944be]
data.vcd_nsxt_edgegateway.egw: Read complete after 1s [id=urn:vcloud:gateway:d395f288-1dfa-419b-ab97-b0c546d61586]
data.vcd_network_routed_v2.routed: Reading...
data.vcd_network_routed_v2.routed: Read complete after 1s [id=urn:vcloud:network:2ef56ce0-0ea9-4078-ba7e-0a6dab9afd3e]
vcd_cse_kubernetes_cluster.my_cluster: Refreshing state... [id=urn:vcloud:entity:vmware:capvcdCluster:40440734-4c4a-42f3-a1a0-8538bd0e3bdf]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
# vcd_cse_kubernetes_cluster.my_cluster will be updated in-place
~ resource "vcd_cse_kubernetes_cluster" "my_cluster" {
id = "urn:vcloud:entity:vmware:capvcdCluster:40440734-4c4a-42f3-a1a0-8538bd0e3bdf"
~ kubernetes_template_id = "urn:vcloud:vapptemplate:d221ce24-1f2b-4a92-aef0-245d8606d7e7" -> "urn:vcloud:vapptemplate:a953b693-9948-49c2-9d4e-304f6e26f715"
name = "testtf"
# (21 unchanged attributes hidden)
# (3 unchanged blocks hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
vcd_cse_kubernetes_cluster.my_cluster: Modifying... [id=urn:vcloud:entity:vmware:capvcdCluster:40440734-4c4a-42f3-a1a0-8538bd0e3bdf]
╷
│ Error: Kubernetes cluster update failed: cannot perform an OVA change as the new one 'Ubuntu 22.04 and Kubernetes v1.28.4+vmware.1' has an older TKG/Kubernetes version (v2.5.0/v1.28.4+vmware.1)
│
│ with vcd_cse_kubernetes_cluster.my_cluster,
│ on main.tf line 62, in resource "vcd_cse_kubernetes_cluster" "my_cluster":
│ 62: resource "vcd_cse_kubernetes_cluster" "my_cluster" {
│