Unable to create Cortex cluster if spot instances are not available and min_instances > 0

Open deliahu opened this issue 4 years ago • 0 comments

When creating a cluster that uses 100% spot instances (with or without on_demand_backup) and has min_instances > 0, cluster creation hangs in eksctl if spot instances are not available. The Auto Scaling group event logs show these messages:

At 2020-08-07T01:01:51Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1.

Launching a new EC2 instance. Status Reason: Could not launch Spot Instances. InsufficientInstanceCapacity - There is no Spot capacity available that matches your request. Launching EC2 instance failed.

The cluster configuration that led to this is shown below. It is only an issue if min_instances > 0 and spot instances for the requested instance type are not available in the desired region at the time the cluster is created. If spot instances become unavailable after the cluster is created, there will not be an issue as long as on_demand_backup is set to true.

min_instances: 1
instance_type: g4dn.xlarge
spot: true
spot_config:
  on_demand_base_capacity: 0
  on_demand_percentage_above_base_capacity: 0
  on_demand_backup: true

Since this only happens with min_instances > 0, a workaround would be to start with min_instances: 0, and then increase it after the cluster is running.
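
As a rough sketch of that workaround, the initial configuration could look like the one below. It is the same configuration as above with only min_instances changed, and it assumes no other fields need to differ. Once the cluster is running, min_instances would be raised back to 1 by updating the cluster configuration.

```yaml
# Sketch of the workaround: start with no minimum instances so that
# cluster creation does not wait on spot capacity.
min_instances: 0  # raise to 1 after the cluster is running
instance_type: g4dn.xlarge
spot: true
spot_config:
  on_demand_base_capacity: 0
  on_demand_percentage_above_base_capacity: 0
  on_demand_backup: true
```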

deliahu, Aug 11 '20 22:08