unhealthyMachineTimeout not working when VM is powered off (VM not deleted from disk)
What happened:
I created an EKS Anywhere cluster with the following configuration:
- unhealthyMachineTimeout set to 30 seconds (the minimum value) in the worker node section of the cluster config file [1]
- Autoscaling configuration enabled for worker nodes in the cluster config file
- Cluster Autoscaler curated package installed on the cluster
After the cluster was created, I tested two scenarios:
- Scenario 1: In the VMware vSphere console, select one of the worker nodes, right-click, and choose Power Off.
- Scenario 2: Select one of the worker nodes, right-click > Power Off, then right-click again > Delete from Disk.
Scenario 1 fails every time. No new node is created within the configured timeout, and the capv pod logs show no event indicating that the node is unhealthy for 4-5 minutes. After that, either the node is deleted and a new node is provisioned, or the node is powered back on.
Scenario 2 works every time. After the node is deleted, a new node is provisioned within 30 seconds.
[1] https://anywhere.eks.amazonaws.com/docs/getting-started/optional/healthchecks/#machinehealthcheckunhealthymachinetimeout-optional
What you expected to happen:
For scenario 1, capv should respect the unhealthyMachineTimeout value of 30 seconds. When unhealthyMachineTimeout is set to 5 minutes, capv takes around 20-40 minutes to detect that the node is powered off or not ready.
I am not sure whether we need something like the node termination handler that Amazon EKS in the cloud has.
How to reproduce it (as minimally and precisely as possible):
- Configure the worker node section of the cluster config file as follows:
workerNodeGroupConfigurations:
- count: 1
  machineGroupRef:
    kind: VSphereMachineConfig
    name: demo-mgmt
  name: md-0
  autoscalingConfiguration:
    minCount: 1
    maxCount: 5
  machineHealthCheck:
    unhealthyMachineTimeout: 30s
    maxUnhealthy: 100%
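For context, my understanding is that the machineHealthCheck settings above end up in a Cluster API MachineHealthCheck object for the worker node group, roughly like the sketch below. This is an assumption on my part and not output copied from the cluster: the object name, namespace, cluster name, and selector label value are hypothetical. Only the 30s timeout and the 100% maxUnhealthy come from my config.

```yaml
# Sketch only: assumed shape of the generated MachineHealthCheck,
# not copied from the cluster.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: demo-mgmt-md-0-worker-unhealthy   # hypothetical name
  namespace: eksa-system                  # assumed namespace
spec:
  clusterName: demo-mgmt                  # assumed cluster name
  maxUnhealthy: 100%                      # from the cluster config above
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: demo-mgmt-md-0   # hypothetical label value
  unhealthyConditions:
  # Each timeout is measured from the moment the Node condition entered
  # this state, not from the moment the VM was powered off.
  - type: Ready
    status: Unknown
    timeout: 30s                          # unhealthyMachineTimeout
  - type: Ready
    status: "False"
    timeout: 30s                          # unhealthyMachineTimeout
```

If this is roughly what gets created, the 30 seconds would only start counting once the workload cluster marks the Node's Ready condition as Unknown or False. That could account for some extra delay in scenario 1, but not the 4-5 minutes I am seeing.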
Anything else we need to know?:
Environment: EKSA with vSphere
- EKS Anywhere Release: 0.20
Version: v0.20.4
Release Manifest URL: https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml
Bundle Manifest URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/74/manifest.yaml
- EKS Distro Release: not sure