amazon-eks-custom-amis

Autoscaling for EKS AL2 shows EC2 health issue: Terminates instance on a weekly basis.

Open ygoodmn opened this issue 4 years ago • 2 comments

What happened: I used the AL2 image to create an EKS instance and added it to an EKS Auto Scaling group through a launch configuration. On a weekly basis, the Auto Scaling group terminates the instance with the following error: "<DATE> an instance was taken out of service in response to an EC2 health check indicating it has been terminated or stopped."

AWS Support states that the instance is running: "I have confirmed that the shutdown -h command (Client.InstanceInitiatedShutdown) was called within the instances, which is the reason they were stopped."
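The stop reason quoted by support is also exposed in the `StateReason` field of an EC2 `DescribeInstances` response, so it may be possible to confirm it without a support case. Below is a minimal sketch, assuming a response dict shaped like the one boto3's `describe_instances()` returns. The helper name `self_initiated_shutdowns` and the sample instance IDs are made up for illustration, and the sample is hand-written, not real output.

```python
# Reason code quoted by AWS Support in this issue.
SHUTDOWN_CODE = "Client.InstanceInitiatedShutdown"


def self_initiated_shutdowns(response):
    """Return IDs of instances whose StateReason says the OS shut itself down.

    `response` is assumed to have the shape of a DescribeInstances result:
    {"Reservations": [{"Instances": [{"InstanceId": ..., "StateReason": {...}}]}]}
    """
    ids = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # Running instances usually have no StateReason, so default to {}.
            reason = instance.get("StateReason") or {}
            if reason.get("Code") == SHUTDOWN_CODE:
                ids.append(instance["InstanceId"])
    return ids


# Hand-written sample with hypothetical instance IDs.
sample = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-0example1",
                    "StateReason": {"Code": "Client.InstanceInitiatedShutdown"},
                },
                {
                    "InstanceId": "i-0example2",
                    "StateReason": {"Code": "Client.UserInitiatedShutdown"},
                },
                {"InstanceId": "i-0example3"},
            ]
        }
    ]
}

print(self_initiated_shutdowns(sample))
```

With real data, the dict would come from `boto3.client("ec2").describe_instances(...)`. Terminated instances are only listed for a short while, so the check has to run soon after the event.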

What you expected to happen: The system continues to run.

How to reproduce it (as minimally and precisely as possible): Wait one week; the instance is then terminated by the Auto Scaling group.

Anything else we need to know?: I enabled termination protection, but the Auto Scaling group bypasses it.

Environment:

  • OS: Amazon Linux 2 dracut-033-535.amzn2.1.4
  • OS Version:
  • EKS Version: 1.18
  • Packer Version: 1.7.2

ygoodmn, Aug 25 '21 11:08

I ran into this issue as well and traced it to this line: https://github.com/aws-samples/amazon-eks-custom-amis/blob/main/scripts/al2/cis-benchmark.sh#L359

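This line makes the instance automatically shut itself down if the audit logs become "too full", which the Auto Scaling group then reports as a failed EC2 health check and terminates. The line after it makes the instance keep audit logs perpetually, which fills up disk space as well.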
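To check whether a built AMI carries these settings, a script along the following lines could scan the contents of `/etc/audit/auditd.conf`. This is only a sketch under stated assumptions: the script lines are not reproduced here, and it assumes they set `admin_space_left_action = halt` and `max_log_file_action = keep_logs`, which are real `auditd.conf` options and match the behavior described above. The function names and the sample text are hypothetical.

```python
# Settings assumed (not confirmed from the script) to cause the behavior:
# halt the machine when audit log space runs low, and never rotate logs away.
RISKY_VALUES = {
    "admin_space_left_action": "halt",
    "max_log_file_action": "keep_logs",
}


def parse_auditd_conf(text):
    """Parse 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Lowercased so the comparison below ignores case.
        settings[key.strip().lower()] = value.strip().lower()
    return settings


def risky_auditd_settings(text):
    """Return (key, value) pairs that match the assumed risky settings."""
    settings = parse_auditd_conf(text)
    return [
        (key, value)
        for key, value in RISKY_VALUES.items()
        if settings.get(key) == value
    ]


# Hand-written sample, not copied from a real node.
SAMPLE_CONF = """
# sample auditd.conf fragment
max_log_file = 8
max_log_file_action = keep_logs
space_left_action = email
admin_space_left_action = halt
"""

for key, value in risky_auditd_settings(SAMPLE_CONF):
    print(f"{key} = {value}")
```

On a node, the text would come from reading `/etc/audit/auditd.conf` (typically as root). An empty result means neither of the two assumed settings is present.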

voidlily, Aug 31 '21 21:08

@ygoodmn Can you share how you added it through your launch configuration? I seem to have an issue: when I add the built image to EKS 1.18, the node either does not start, or it starts and is immediately stopped and terminated.

mxie1563, Oct 12 '21 00:10

This should be resolved now. Please feel free to open a new issue if you encounter any problems with nodes joining the cluster.

bryantbiggs, Oct 09 '23 16:10