
[bug] script allows passing incorrect kubelet arguments

Open • AlexNabokikh opened this issue 9 months ago • 1 comment

Hey!

Issue

Recently, the company I work for had an incident caused by incorrect arguments being passed to the kubelet via --kubelet-extra-args in our EKS Terraform configuration. The Terraform provider passes these arguments to bootstrap.sh, and the script accepts incorrect arguments without checking afterwards whether the kubelet has started. Nodes with incorrect kubelet arguments cannot start the kubelet and therefore cannot join the cluster. Despite this, EKS does not consider such nodes unhealthy.

Proposed solution

Add checks to determine whether the kubelet has started or not.

Very roughly:

if systemctl is-active --quiet kubelet; then
  log "INFO: kubelet service is active and running."
else
  log "ERROR: kubelet service failed to start."
  exit 1  # Exit if kubelet did not start successfully
fi
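
A single check right after startup may pass briefly if systemd restarts a failing kubelet, so a slightly more defensive variant could poll the unit and require several consecutive "active" results. This is only a sketch under assumptions: `wait_for_unit_stable` and the `IS_ACTIVE_CMD` override are hypothetical names, not part of bootstrap.sh, and the default attempt counts and delays are arbitrary.

```shell
# Hypothetical helper (not part of bootstrap.sh): succeed only if the unit is
# reported active on `required` consecutive polls within `attempts` polls.
# Arguments: unit name, max polls, seconds between polls, required streak.
wait_for_unit_stable() {
  local unit="$1" attempts="${2:-20}" delay="${3:-3}" required="${4:-3}"
  local streak=0 i=0
  while [ "$i" -lt "$attempts" ]; do
    i=$((i + 1))
    # IS_ACTIVE_CMD is an assumed hook so the check can be stubbed in tests;
    # by default the real `systemctl is-active --quiet` is used.
    if ${IS_ACTIVE_CMD:-systemctl is-active --quiet} "$unit"; then
      streak=$((streak + 1))
      if [ "$streak" -ge "$required" ]; then
        return 0
      fi
    else
      # Any non-active result resets the streak (e.g. a crash/restart cycle).
      streak=0
    fi
    sleep "$delay"
  done
  return 1
}

# Possible use after the kubelet has been started:
#   if ! wait_for_unit_stable kubelet; then
#     echo "ERROR: kubelet service did not stay active." >&2
#     exit 1
#   fi
```

Requiring a streak rather than one success is a design choice meant to catch a kubelet that starts, exits on bad flags, and is restarted in a loop; the right thresholds would depend on the unit's restart settings.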

AlexNabokikh • May 10 '24 07:05