zero-to-jupyterhub-k8s
Config `extraNodeAffinity` and `matchNodePurpose` combines incorrectly (OR instead of AND)
Bug description
The configuration below doesn't do what one would expect, and it appears to be a bug.
```yaml
singleuser:
  extraNodeAffinity:
    required:
      - matchExpressions:
          - key: "kubernetes.azure.com/scalesetpriority"
            operator: NotIn
            values: [spot]

scheduling:
  userPods:
    nodeAffinity:
      matchNodePurpose: require
```
This generates two separate nodeSelectorTerms, as you can see below. Kubernetes ORs nodeSelectorTerms together, so the pod is scheduled if either of them matches.
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: NotIn
                values:
                  - spot
          - matchExpressions:
              - key: hub.jupyter.org/node-purpose
                operator: In
                values:
                  - user
```
Naively, I would expect this to generate a single nodeSelectorTerm containing both match expressions (which Kubernetes ANDs together), like so:
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: NotIn
                values:
                  - spot
              - key: hub.jupyter.org/node-purpose
                operator: In
                values:
                  - user
```
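To make the OR-vs-AND difference concrete, here is a minimal Python sketch (not the chart's actual logic) of how the scheduler evaluates `requiredDuringSchedulingIgnoredDuringExecution`: nodeSelectorTerms are ORed, while matchExpressions within a single term are ANDed. A spot node that also carries the `hub.jupyter.org/node-purpose=user` label matches the two-term variant but is correctly rejected by the single-term variant:

```python
# Sketch of kube-scheduler's node-affinity evaluation:
# OR across nodeSelectorTerms, AND across matchExpressions in one term.
def expr_matches(labels, expr):
    value = labels.get(expr["key"])
    if expr["operator"] == "In":
        return value in expr["values"]
    if expr["operator"] == "NotIn":
        return value not in expr["values"]
    raise ValueError(f"unsupported operator: {expr['operator']}")

def node_matches(labels, node_selector_terms):
    return any(
        all(expr_matches(labels, e) for e in term["matchExpressions"])
        for term in node_selector_terms
    )

spot_user_node = {
    "kubernetes.azure.com/scalesetpriority": "spot",
    "hub.jupyter.org/node-purpose": "user",
}

# What the chart currently generates: two separate terms.
generated = [
    {"matchExpressions": [{"key": "kubernetes.azure.com/scalesetpriority",
                           "operator": "NotIn", "values": ["spot"]}]},
    {"matchExpressions": [{"key": "hub.jupyter.org/node-purpose",
                           "operator": "In", "values": ["user"]}]},
]

# What one would expect: both expressions in a single term.
expected = [
    {"matchExpressions": [{"key": "kubernetes.azure.com/scalesetpriority",
                           "operator": "NotIn", "values": ["spot"]},
                          {"key": "hub.jupyter.org/node-purpose",
                           "operator": "In", "values": ["user"]}]},
]

print(node_matches(spot_user_node, generated))  # True: the node-purpose term matches alone
print(node_matches(spot_user_node, expected))   # False: the NotIn-spot expression fails the term
```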
I'm running 3.0.2 on AKS 1.27.3.
The following works around the issue but is ugly IMO:
```yaml
singleuser:
  extraNodeAffinity:
    required:
      - matchExpressions:
          - key: "kubernetes.azure.com/scalesetpriority"
            operator: NotIn
            values: [spot]
          - key: hub.jupyter.org/node-purpose
            operator: In
            values:
              - user

scheduling:
  userPods:
    nodeAffinity:
      matchNodePurpose: ignore
```
I agree this is a bug: the `matchNodePurpose: require` configuration doesn't combine with `extraNodeAffinity.required` as one would assume - it requires either condition rather than both.
There is probably a corresponding bug in how `matchNodePurpose: prefer` combines with `extraNodeAffinity.preferred`, and it probably affects core pods and other pod types as well as user pods.
As an aside, I tried the following (which is what I think the syntax should be) and it renders invalid pod YAML:
```yaml
extraNodeAffinity:
  required:
    - key: "kubernetes.azure.com/scalesetpriority"
      operator: NotIn
      values: [spot]
```
I don't think we should change how `extraNodeAffinity.required` works to fix this - what's put there is a "pure" k8s specification that we inject verbatim.
Instead, we could consider merging the expression from `matchNodePurpose` into the "pure" k8s specification if one is provided, falling back to generating a standalone term otherwise. Is this practically feasible? Does it make sense for ~all scenarios? Is the change breaking?
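That merging idea could look roughly like the following Python sketch (plain Python standing in for the chart's Helm templating; the function name is hypothetical): AND the node-purpose expression into every user-supplied required term, and only emit a standalone term (today's behavior) when the user supplied none.

```python
def merge_node_purpose(user_terms, purpose_expr):
    # Hypothetical merge strategy: append the node-purpose match expression
    # to each user-supplied nodeSelectorTerm so it is ANDed with them.
    # With no user terms, fall back to a standalone term.
    if not user_terms:
        return [{"matchExpressions": [purpose_expr]}]
    return [
        {**term, "matchExpressions": term.get("matchExpressions", []) + [purpose_expr]}
        for term in user_terms
    ]

purpose = {"key": "hub.jupyter.org/node-purpose", "operator": "In", "values": ["user"]}
user_terms = [
    {"matchExpressions": [{"key": "kubernetes.azure.com/scalesetpriority",
                           "operator": "NotIn", "values": ["spot"]}]},
]

merged = merge_node_purpose(user_terms, purpose)
# One term with both expressions, i.e. "not spot AND node-purpose user"
print(len(merged), len(merged[0]["matchExpressions"]))  # 1 2
```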
This is a very tricky config to provide overall, and I'm not sure we can resolve this in a good way at the moment.
It's certainly a breaking change if someone relies on today's behavior, but I don't see how that could be the case. Either way, I have a workaround, so I'm not desperate for a solution. Based on this GitHub search, `extraNodeAffinity` is very rarely used in the open source space and almost never combined with the node-purpose scheduling:

```
extraNodeAffinity language:YAML NOT path:tools/templates NOT path:*/schema.yaml NOT is:fork
```
It's probably fine to document this oddness if it's not really fixable.