terraform-aws-components
Spacelift Worker Pool ASG may fail to scale due to AMI/instance type mismatch
Describe the Bug
With Spacelift's recent addition of arm64 support for worker pools, the data source filters that select the AMI will sometimes return the arm64 image rather than the x86_64 image. Whenever the arm64 AMI is returned first, the autoscaling group fails to start new instances and generates errors, because the AMI no longer matches the x86_64 instance type.
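The lookup could be pinned to one architecture along the following lines. This is a minimal sketch, assuming the AMI is resolved through an `aws_ami` data source. The variable names, the owner variable, and the `spacelift-*` name pattern are illustrative placeholders, not the component's actual code. `architecture` is a standard EC2 image filter.

```hcl
# Hypothetical inputs; names are illustrative, not taken from the component.
variable "spacelift_ami_owner" {
  type        = string
  description = "AWS account ID that publishes the Spacelift worker AMIs"
}

variable "ami_architecture" {
  type        = string
  default     = "x86_64"
  description = "CPU architecture of the worker AMI: x86_64 or arm64"
}

data "aws_ami" "spacelift_worker" {
  most_recent = true
  owners      = [var.spacelift_ami_owner]

  filter {
    name   = "name"
    values = ["spacelift-*"] # assumed naming pattern
  }

  # Without this filter, most_recent can select an image of either architecture.
  filter {
    name   = "architecture"
    values = [var.ami_architecture]
  }
}
```

Defaulting the architecture to `x86_64` would keep the behavior from before the arm64 release, while still letting users opt in to arm64 explicitly.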
Expected Behavior
Prior to Spacelift's release of the arm64 AMIs, all Spacelift worker pool instances launched were x86_64. One expects the same behavior before and after the release of the arm64 images. In the future, when arm64 support for geodesic and Terraform module releases is more widespread, some may choose to switch to arm64, but the architecture should never flip-flop randomly, because the instance type and the AMI must always match. At present the instance type is set statically in YAML, so the AMI lookup can be forced to x86_64.
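The requirement that the instance type and the AMI always match could also be checked at plan time. The sketch below assumes Terraform 1.2 or later (for `postcondition` blocks) and uses the AWS provider's `aws_ec2_instance_type` data source. All variable and resource names, the default instance type, and the AMI name pattern are hypothetical.

```hcl
# Hypothetical inputs; names and defaults are illustrative only.
variable "worker_instance_type" {
  type    = string
  default = "t3.medium" # an x86_64 instance type, chosen for illustration
}

variable "worker_ami_owner" {
  type        = string
  description = "AWS account ID that publishes the worker AMIs"
}

# Looks up which CPU architectures the chosen instance type supports.
data "aws_ec2_instance_type" "worker" {
  instance_type = var.worker_instance_type
}

data "aws_ami" "worker" {
  most_recent = true
  owners      = [var.worker_ami_owner]

  filter {
    name   = "name"
    values = ["spacelift-*"] # assumed naming pattern
  }

  lifecycle {
    # Fail the plan if the selected AMI cannot run on the instance type.
    postcondition {
      condition     = contains(data.aws_ec2_instance_type.worker.supported_architectures, self.architecture)
      error_message = "The selected AMI architecture is not supported by the configured instance type."
    }
  }
}
```

A guard like this does not fix the lookup itself, but it would surface a mismatch as a plan error rather than as failed launches in the autoscaling group.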
Steps to Reproduce
Steps to reproduce the behavior:
- At the time of writing, all ASG scale-out events in AWS us-east-1 are probably failing because the arm64 AMI is returned first by the filter
- Look at the CloudTrail logs when triggering auto-scaling
- Watch the worker pool. You may see the active and busy counts held at the minimum level while the pending count stays high for an hour or more
Screenshots
Environment (please complete the following information):
- Spacelift on Feb 25th 2023
- Terraform 1.3.8
- us-east-1 AMIs
Additional Context
- https://spacelift.io/changelog/en/arm-private-worker-pools-are-here-2HC4a1tls