cluster-api-provider-cloud-director
CAPVCD does not reconcile VMs that have been manually powered off/deleted in VCD
Describe the bug
If a tenant accidentally powers off or deletes a VM from Cloud Director, I would expect CAPVCD to reconcile this by powering the VM back on if it still exists, or by creating a new VM if it has been deleted.
However, we observed that no attempt whatsoever was made to reconcile the VMs.
CAPVCD sees that the node is unavailable, but there are no logs in the controller or in Cloud Director indicating that it is trying to fix the state:
jamesm@LAPTOP-MK0G58QB:~/tce-config-files$ kubectl get machinedeployment demo-work-pool-1
NAME               CLUSTER   REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE       AGE   VERSION
demo-work-pool-1   demo      2          1       2         1             ScalingUp   24h   v1.21.14+vmware.2
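For anyone else hitting this, the state of the underlying Cluster API Machine objects can be inspected from the management cluster. The commands below are only a suggested starting point and assume the cluster is named demo, as in the output above; the CAPVCD controller namespace and deployment name vary by installation and are shown here as assumptions.

# List the Machines belonging to the demo cluster and their current phases
kubectl get machines -l cluster.x-k8s.io/cluster-name=demo

# Inspect the conditions on the Machine backing the powered-off VM
# (replace <machine-name> with a name from the previous command)
kubectl describe machine <machine-name>

# Tail the CAPVCD controller logs; the namespace and deployment name here
# are assumptions, so adjust them to match your installation
kubectl logs -n capvcd-system deploy/capvcd-controller-manager -f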
Reproduction steps
- Deploy a CAPVCD workload cluster.
- Power off one of the worker nodes in the cluster from the Cloud Director UI.
- Observe that the worker node is left powered off until it is manually powered on again.
Expected behavior
- Deploy a CAPVCD workload cluster.
- Power off one of the worker nodes in the cluster from the Cloud Director UI.
- CAPVCD detects that the node is powered off and powers it back on without manual intervention. The same behaviour is expected if the node is deleted manually: CAPVCD should create a replacement VM (a possible interim workaround is sketched below).
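Until CAPVCD itself remediates powered-off or deleted VMs, a possible interim workaround (not part of the original report) is a standard Cluster API MachineHealthCheck, which deletes an unhealthy Machine so the MachineDeployment creates a replacement, rather than powering the existing VM back on. The sketch below assumes the cluster is named demo in the default namespace, that the worker Machines carry the standard cluster.x-k8s.io/deployment-name label for demo-work-pool-1, and that the management cluster serves the cluster.x-k8s.io/v1beta1 API; the name, timeouts and maxUnhealthy value are placeholders to adjust.

# Sketch only: ask Cluster API to remediate worker Machines whose Nodes
# stop reporting Ready. All names and timeouts below are assumptions.
kubectl apply -f - <<EOF
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: demo-work-pool-1-unhealthy-5m
  namespace: default
spec:
  clusterName: demo
  # Do not remediate if more than 40% of the selected Machines are unhealthy
  maxUnhealthy: 40%
  # Select the Machines created by the demo-work-pool-1 MachineDeployment
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: demo-work-pool-1
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
EOF

Note that this replaces the Machine (and its VM) instead of powering it on, so it is a stopgap rather than the reconciliation behaviour requested above.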
Additional context
No response
This issue will be fixed in the next release of CAPVCD.