karpenter-provider-aws
Feature request: More flexible requirements for NodePools
Description
What problem are you trying to solve?
I have a generic NodePool that's used by the majority of my workloads. It has fairly loose requirements, except that it puts a floor on the number of CPUs so that my cluster is made up of fewer, larger nodes (instead of many smaller ones):
```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: arm
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["arm64"]
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot", "on-demand"]
        - key: "kubernetes.io/os"
          operator: In
          values: ["linux"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: Gt
          values: ["15"]
  weight: 90
```
However, I want to make an exception to this idea of preferring larger nodes for memory-optimized instances: I'd like to allow a minimum of 7 CPUs specifically for the r instance category, such that an r6g.2xlarge could be used. There doesn't seem to be a good way to express this exception in my existing NodePool; I would have to create another NodePool and update specific workloads to select it instead.
While this works, it would be nice if I could have more flexible node requirements, so I could say something like: "The c and m instance categories should have a minimum of 15 CPUs, but the r instance category can have a minimum of 7 CPUs"
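Purely for illustration, a hypothetical grouped-requirements syntax could look like the following. The `requirementGroups` field below does not exist in Karpenter; the idea is that requirements within a group are AND-ed, and the groups themselves are OR-ed:

```yaml
# Hypothetical syntax, not a real Karpenter field.
# Each group is AND-ed internally; groups are OR-ed together.
requirementGroups:
  - requirements:
      - key: "karpenter.k8s.aws/instance-category"
        operator: In
        values: ["c", "m"]
      - key: "karpenter.k8s.aws/instance-cpu"
        operator: Gt
        values: ["15"]
  - requirements:
      - key: "karpenter.k8s.aws/instance-category"
        operator: In
        values: ["r"]
      - key: "karpenter.k8s.aws/instance-cpu"
        operator: Gt
        values: ["7"]
```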
How important is this feature to you?
It would be nice to have, so I wouldn't have to manage additional NodePools and specific requirements for some workloads.
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
This is pretty interesting! You effectively want a set of sets of requirements to de-dupe the number of NodePools you have, since everything else should be the same. Considering that changing this from a list to a list of lists would be a breaking change, this would probably have to be implemented in a non-breaking way, such as adding a field that indicates which requirements belong to which group.
Are there other benefits besides ease of configuration?
Hi @njtran, sorry for the late response. I am mostly interested in the ease of configuration, but I'm also wary of just making another NodePool with relaxed requirements for memory-optimized instances, due to this note in the Karpenter docs:

> It is recommended to create NodePools that are mutually exclusive. So no Pod should match multiple NodePools. If multiple NodePools are matched, Karpenter will use the NodePool with the highest weight.
So right now, my "highmem" NodePools have a lower weight, and I'm using affinity on memory-intensive workloads to select them. More flexible requirements would allow me to remove this extra NodePool and let Karpenter schedule my memory-intensive workloads as it sees fit, with extra options for memory-optimized instances.
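A minimal sketch of that two-NodePool setup, assuming the second pool is named `highmem` (the name and weight here are illustrative):

```yaml
# Illustrative: a second, lower-weight NodePool that relaxes the
# CPU floor for memory-optimized (r-category) instances only.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: highmem
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["r"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: Gt
          values: ["7"]
  weight: 50  # lower than the generic pool's 90
```

Memory-intensive Pods then have to opt in explicitly, e.g. via node affinity on the `karpenter.sh/nodepool` label with value `highmem`, which is exactly the per-workload bookkeeping this feature request would remove.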
Another way to frame this: currently, requirements is a one-dimensional list of predicates that are AND-ed together (please correct me if my understanding is incorrect). That makes it similar to a Pod nodeSelector. But there is nothing in Karpenter that provides the flexibility of something like nodeAffinity.
The only workaround I can think of, assuming you don't want multiple NodePools (which, as @mrparkers points out, has downsides), is to have your Pods define a nodeAffinity and keep the NodePool requirements more general. But that is of course not ideal.
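To sketch that workaround (the Pod spec fragment below is illustrative): `nodeSelectorTerms` are OR-ed, while the `matchExpressions` within a single term are AND-ed, so a Pod can express "large c/m nodes OR smaller r nodes" even though a NodePool's flat requirements list cannot:

```yaml
# Illustrative Pod-side workaround: terms are OR-ed,
# expressions within a term are AND-ed.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:  # compute/general-purpose: at least 16 CPUs
            - key: "karpenter.k8s.aws/instance-category"
              operator: In
              values: ["c", "m"]
            - key: "karpenter.k8s.aws/instance-cpu"
              operator: Gt
              values: ["15"]
        - matchExpressions:  # memory-optimized: smaller sizes allowed
            - key: "karpenter.k8s.aws/instance-category"
              operator: In
              values: ["r"]
            - key: "karpenter.k8s.aws/instance-cpu"
              operator: Gt
              values: ["7"]
```

The downside, as noted above, is that this pushes the sizing policy out of the NodePool and into every workload's Pod spec.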
I'm going to close this issue here since the NodePool API is owned by the upstream project, kubernetes-sigs/karpenter, but feel free to reopen there. To go forward with this feature, we would want to see an RFC in that repo outlining the root problem and exploring the potential design space.