kubenurse
exclude neighbourhood pods from Nodes where scheduling is disabled
Problem statement
Currently kubenurse discovers all running neighbour Pods (see kubediscovery.go). If we perform maintenance on a Node, the kubenurse instance on that Node may be unreachable, which is not necessarily a problem. As a result, graphs/metrics might show errors (or even trigger false alarms).
Proposal
Exclude kubenurse instances from Nodes where scheduling is disabled.
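A minimal sketch of what this filtering could look like, assuming the discovery code has both the list of neighbour Pods and the list of Nodes at hand. The `Node` and `Pod` types and the `filterSchedulable` function are hypothetical, simplified stand-ins and not kubenurse's actual code; the `Unschedulable` field mirrors `spec.unschedulable`, which Kubernetes sets when a Node is cordoned.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for a Kubernetes Node.
type Node struct {
	Name          string
	Unschedulable bool // mirrors spec.unschedulable (set by `kubectl cordon`)
}

// Pod is a simplified, hypothetical stand-in for a neighbour kubenurse Pod.
type Pod struct {
	Name     string
	NodeName string
}

// filterSchedulable returns only the pods that run on nodes where
// scheduling is still enabled. Pods on unknown nodes are kept.
func filterSchedulable(pods []Pod, nodes []Node) []Pod {
	cordoned := make(map[string]bool, len(nodes))
	for _, n := range nodes {
		if n.Unschedulable {
			cordoned[n.Name] = true
		}
	}

	kept := make([]Pod, 0, len(pods))
	for _, p := range pods {
		if !cordoned[p.NodeName] {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	nodes := []Node{
		{Name: "node-a", Unschedulable: false},
		{Name: "node-b", Unschedulable: true}, // under maintenance
	}
	pods := []Pod{
		{Name: "kubenurse-1", NodeName: "node-a"},
		{Name: "kubenurse-2", NodeName: "node-b"},
	}

	for _, p := range filterSchedulable(pods, nodes) {
		fmt.Println(p.Name) // prints "kubenurse-1"
	}
}
```

Building the set of cordoned Node names first keeps the filter at one pass over each list, which matters little here but avoids a nested loop on large clusters.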
Further enhancement
Disable checks entirely on a kubenurse instance if the node the instance runs on has scheduling disabled (to avoid possible service check errors for example).
Because kubenurse runs as a DaemonSet, we cannot evict its pods from Nodes that are not schedulable (see https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#taints-and-tolerations), so this needs to be solved in the discovery mechanism of kubenurse.
Neighbour pods are now excluded with #13. Let's now find a way to disable checks if the node is unschedulable.
@ghouscht
as there wasn't much interaction on this issue, I think there is not enough interest in disabling checks when the node is unschedulable.
also, given the default tolerations of the daemonset, when a node is NotReady, with the node.kubernetes.io/unreachable:NoExecute taint, the pod will be deleted.
if someone needs this feature in the future, feel free to reopen 🙃