terraform-aws-fck-nat

Improve spot HA by utilising ASG capacity rebalance

Open kieranbrown opened this issue 5 months ago • 1 comments

Capacity Rebalancing helps by proactively trying to replace Spot Instances before they are interrupted. Full docs: https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-capacity-rebalancing.html


Current Behaviour: When a Spot Instance receives its 2-minute interruption warning, nothing happens. The instance is terminated after 2 minutes, and a new instance is started only after the original has been terminated.

New Behaviour: When a Spot Instance receives its 2-minute interruption warning, the ASG immediately provisions a new instance, which will hopefully boot and move the floating EIP before the original is terminated. With this approach, downtime is minimal when Spot Instances are terminated.
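
As a rough illustration of the proposed change, the sketch below enables `capacity_rebalance` on an `aws_autoscaling_group`. The resource name, variable names, and sizing are hypothetical and not taken from this module, and it assumes the referenced launch template already requests Spot capacity. Note that the AWS docs linked above describe the trigger as an EC2 rebalance recommendation, which can arrive before or together with the 2-minute interruption notice.

```hcl
# Hypothetical inputs; names are illustrative, not the module's actual variables.
variable "subnet_id" {
  type = string
}

variable "launch_template_id" {
  type = string
}

resource "aws_autoscaling_group" "nat" {
  name_prefix         = "fck-nat-"
  min_size            = 1
  max_size            = 1
  desired_capacity    = 1
  vpc_zone_identifier = [var.subnet_id]

  # Ask the ASG to launch a replacement when a Spot Instance in the
  # group is flagged as at elevated risk of interruption.
  capacity_rebalance = true

  # Assumption: this launch template is configured to request Spot Instances.
  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}
```

Whether the replacement finishes booting and claims the EIP before the original instance is terminated depends on how early the signal arrives and how long the instance takes to start, so this should reduce downtime rather than guarantee none.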


You can test this using AWS Fault Injection. In the EC2 management console, click Spot Requests in the sidebar, select a Spot request, then click Actions and choose Initiate Interruption.

kieranbrown • Jan 23 '24 12:01