
Choose smaller spots instead of big on-demand when possible

Open liorfranko opened this issue 1 year ago • 9 comments

Version

Karpenter Version: v0.23.0

Kubernetes Version: v1.24.0

Expected Behavior

When a request for multiple pods targets one large spot instance and there is no spot capacity for that size, Karpenter should split the request across smaller spot instances instead of falling back to on-demand.
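To make the requested preference order concrete, here is a minimal sketch in Python. It is not Karpenter code: `Offering`, `candidate_order`, and `choose` are hypothetical names, and the pods-per-node numbers are assumptions for illustration. The idea is to exhaust every spot size, splitting pods across more nodes where needed, before any on-demand option is considered.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    """One launch option: instance size, capacity type, and node count."""
    size: str
    capacity_type: str  # "spot" or "on-demand"
    nodes: int


def candidate_order(pods_per_node, pending_pods):
    """Return offerings in the order this issue asks for (hypothetical helper).

    pods_per_node: list of (size, pods that fit on one node), largest first.
    All spot options come before any on-demand option.
    """
    order = []
    for capacity_type in ("spot", "on-demand"):
        for size, fits in pods_per_node:
            if fits < 1:
                continue  # this size cannot host even one pod
            nodes = math.ceil(pending_pods / fits)
            order.append(Offering(size, capacity_type, nodes))
    return order


def choose(order, unavailable):
    """Pick the first offering whose (size, capacity type) is not unavailable."""
    for offering in order:
        if (offering.size, offering.capacity_type) not in unavailable:
            return offering
    return None


# Assumed fit: 2 pods per 24xlarge, 1 pod per 12xlarge.
sizes = [("24xlarge", 2), ("12xlarge", 1)]
order = candidate_order(sizes, pending_pods=2)
picked = choose(order, unavailable={("24xlarge", "spot")})
print(picked)
```

With spot 24xlarge marked unavailable, this ordering selects two 12xlarge spot nodes, and on-demand is reached only once both spot sizes are unavailable.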

Actual Behavior

During a large scale-up, Karpenter received multiple requests, each to deploy 2 pods on a node that fits a 24xlarge. There was InsufficientInstanceCapacity for 24xlarge, and instead of splitting each request across two 12xlarge spot instances, Karpenter chose a 24xlarge on-demand instance.

Steps to Reproduce the Problem

Run a large scale-up through a provisioner where every 2 pods fit together on a 24xlarge or larger instance, and exclude instance sizes larger than 24xlarge from the provisioner.

Resource Specs and Logs

2023-02-27T17:56:16.918Z	INFO	controller.provisioner	launching node with 2 pods requesting {"cpu":"91705m","memory":"178356Mi","pods":"13"} from types m6idn.24xlarge, m6i.24xlarge, m6in.24xlarge, m6a.24xlarge, m6id.24xlarge and 5 other(s)	{"commit": "5a7faa0-dirty", "provisioner": "ds-infra-network-online-spot"}
2023-02-27T17:56:16.918Z	INFO	controller.provisioner	launching node with 2 pods requesting {"cpu":"91705m","memory":"178356Mi","pods":"13"} from types m6idn.24xlarge, m6i.24xlarge, m6in.24xlarge, m6a.24xlarge, m6id.24xlarge and 5 other(s)	{"commit": "5a7faa0-dirty", "provisioner": "ds-infra-network-online-spot"}

2023-02-27T17:56:21.248Z	DEBUG	controller.provisioner.cloudprovider	removing offering from offerings	{"commit": "5a7faa0-dirty", "provisioner": "ds-infra-network-online-spot", "unavailable-reason": "InsufficientInstanceCapacity", "instance-type": "m6a.24xlarge", "zone": "us-east-1e", "capacity-type": "on-demand", "unavailable-offerings-ttl": "3m0s"}
2023-02-27T17:56:21.248Z	DEBUG	controller.provisioner.cloudprovider	removing offering from offerings	{"commit": "5a7faa0-dirty", "provisioner": "ds-infra-network-online-spot", "unavailable-reason": "InsufficientInstanceCapacity", "instance-type": "m6a.24xlarge", "zone": "us-east-1e", "capacity-type": "on-demand", "unavailable-offerings-ttl": "3m0s"}


2023-02-27T17:56:21.522Z	INFO	controller.provisioner.cloudprovider	launched new instance	{"commit": "5a7faa0-dirty", "provisioner": "ds-infra-network-online-spot", "id": "i-063a1b84f7a1a9173", "hostname": "ip-10-207-10-112.ec2.internal", "instance-type": "m6i.24xlarge", "zone": "us-east-1b", "capacity-type": "on-demand"}
2023-02-27T17:56:21.559Z	INFO	controller.provisioner.cloudprovider	launched new instance	{"commit": "5a7faa0-dirty", "provisioner": "ds-infra-network-online-spot", "id": "i-0d2cc2d44e98f6872", "hostname": "ip-10-207-11-85.ec2.internal", "instance-type": "m6i.24xlarge", "zone": "us-east-1b", "capacity-type": "on-demand"}
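As a rough feasibility check for the split, the sketch below halves the request from the "launching node with 2 pods" log line and compares it with the published m6i.12xlarge size (48 vCPU, 192 GiB). Several things here are assumptions: the two pods are treated as equal in size, the comparison uses raw instance capacity rather than node allocatable, and the logged totals probably include daemonset overhead (the request lists 13 pods), which would be paid on each of the two smaller nodes. The helper names are hypothetical.

```python
import json

# Request copied from the "launching node with 2 pods" log line.
REQUEST = json.loads('{"cpu":"91705m","memory":"178356Mi","pods":"13"}')

# Published m6i.12xlarge size: 48 vCPU, 192 GiB (raw capacity, not allocatable).
M6I_12XLARGE = {"cpu_millicores": 48_000, "memory_mib": 192 * 1024}


def cpu_millicores(quantity):
    """Parse a CPU quantity such as "91705m" or "2" into millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_mib(quantity):
    """Parse a memory quantity in Mi; other suffixes are not handled here."""
    if not quantity.endswith("Mi"):
        raise ValueError(f"unsupported memory quantity: {quantity}")
    return int(quantity[:-2])


def half_request_fits(request, capacity):
    """Assume an even split of the request across two nodes and compare."""
    half_cpu = cpu_millicores(request["cpu"]) / 2
    half_mem = memory_mib(request["memory"]) / 2
    fits = (half_cpu <= capacity["cpu_millicores"]
            and half_mem <= capacity["memory_mib"])
    return half_cpu, half_mem, fits


half_cpu, half_mem, fits = half_request_fits(REQUEST, M6I_12XLARGE)
print(f"per node: {half_cpu}m CPU, {half_mem}Mi memory, fits: {fits}")
```

Under these assumptions each half needs about 45.9 vCPU and 87 GiB, which is under the 12xlarge size, though the CPU margin is small once reserved resources and per-node daemonsets are counted.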

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

liorfranko • Feb 27 '23 18:02