AutoSpotting
ValidationError: To attach 1 instance, please update the AutoScalingGroup sizes appropriately
Github issue
Issue type
Bug Report
Build number
ca81828f91d7cddc597ba750890ad5a5e9b98464 with no changes
Environment
- AWS region: us-east-1
- Type of environment: VPC
Summary
When enabling AutoSpotting for the first time against a few ASGs, I see the following error in the logs:
autoscaling.go:447: ValidationError: AutoScalingGroup foo has min-size=3, max-size=4, and desired-size=4. To attach 1 instance, please update the AutoScalingGroup sizes appropriately.
So far I've seen this twice for different ASGs (one time each for 2 ASGs).
One ASG had autospotting_min_on_demand_number set to 1, but all on-demand instances have been replaced with spot instances. This is concerning, although I'm not sure it's related to this ValidationError. From what I can tell, there were no other autoscaling events going on during this time. The correct number of on-demand instances was maintained after I manually terminated one of the spot instances (AutoSpotting hasn't tried to replace it).
The other ASG seems to have automatically recovered from that error and has the correct number of spot instances (100%). It was undergoing a scaling event at the same time as AutoSpotting, so perhaps this instance of the error is expected.
Even so, does AutoSpotting fail to work when the ASG is at MaxSize?
Steps to reproduce
Not sure yet.
Expected results
No error, correct number of on-demand instances.
Actual results
Error, incorrect number of on-demand instances.
Historically, when the group was at maximum capacity, we swapped the attach/detach actions: we first detached the on-demand instance to lower the group's desired capacity, and then the attach call worked.
For some reason it looks like we're not doing it anymore.
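As a rough illustration of the swapped ordering described above, here is a minimal sketch in Go. It is not AutoSpotting's actual code: the type and function names (groupSizes, action, replacementOrder) are hypothetical, and the function only decides the order of the two steps without calling the AWS API. It assumes the detach step decrements the desired capacity, so that the following attach fits under the maximum again.

```go
package main

// groupSizes holds the three capacity settings of an Auto Scaling group.
// Hypothetical type, used only for this illustration.
type groupSizes struct {
	Min, Max, Desired int
}

// action names one step of replacing an on-demand instance with a spot one.
type action string

const (
	attachSpot     action = "attach-spot"
	detachOnDemand action = "detach-on-demand"
)

// replacementOrder returns the order in which the two steps would run.
// Attaching an instance raises the desired capacity by one, so when the
// group is already at its maximum the attach is rejected with the
// ValidationError from this issue. In that case the on-demand instance is
// detached first (assumed to decrement the desired capacity).
func replacementOrder(g groupSizes) []action {
	if g.Desired >= g.Max {
		return []action{detachOnDemand, attachSpot}
	}
	return []action{attachSpot, detachOnDemand}
}
```

For the group in the log line above (min-size=3, max-size=4, desired-size=4), this sketch would detach first, bringing the desired capacity to 3, and then attach the spot instance. It does not cover the case where min, max and desired are all equal, because detaching first would then drop the group below its minimum.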
Any reason not to temporarily increase the MaxSize when that happens?
We do this for the case when min=max=desired. With swapped detach/attach actions it should not be necessary and I wouldn't interfere with the max unless there's no other way, but I am open to doing it if there's really no other way.
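The min=max=desired case mentioned above could be detected along these lines. This is only a sketch under stated assumptions, not the project's implementation: asgSizes and temporaryMax are hypothetical names, and raising the maximum by exactly one is an assumption made for illustration. The caller would be expected to restore the original maximum once the replacement is done.

```go
package main

// asgSizes holds the capacity settings of an Auto Scaling group.
// Hypothetical type, used only for this illustration.
type asgSizes struct {
	Min, Max, Desired int
}

// temporaryMax reports whether the maximum size would have to be raised
// before a spot instance can be attached, and to which value.
// Only the min == max == desired case needs it: attaching first would
// exceed the maximum, and detaching first would drop below the minimum.
func temporaryMax(g asgSizes) (newMax int, needed bool) {
	if g.Min == g.Max && g.Max == g.Desired {
		// Assumption: one extra slot is enough for a single replacement.
		return g.Max + 1, true
	}
	return g.Max, false
}
```

For a group with min=max=desired=4 this returns 5 and true. For the group in the report (min 3, max 4, desired 4) it returns 4 and false, since swapping the detach and attach calls is enough there.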
I wonder what made this stop working.
This should no longer be an issue. The handling of min/max/desired capacity has been heavily improved, especially in the private fork that powers the release from the AWS Marketplace.