cilium-cli
Connectivity test pods may be unschedulable in a two-node cluster
Hit on a CI run: https://github.com/cilium/cilium-cli/actions/runs/7259322671/job/19826588396?pr=2194
The test cluster has four nodes, but Cilium is deployed on only two of them; the other two are reserved for extra tests. For scheduling purposes, this is equivalent to a two-node cluster.
Pods:
NAMESPACE       NAME                                 READY  STATUS   RESTARTS  AGE   IP           NODE
test-namespace  client-5f8f776644-pxh7p              0/1    Pending  0         5m1s  <none>       <none>
test-namespace  client2-868c49bf66-48rfv             0/1    Pending  0         5m1s  <none>       <none>
test-namespace  client3-674cf46fd5-cf9d2             1/1    Running  0         5m2s  10.244.0.89  chart-testing-control-plane
test-namespace  echo-external-node-6c447f645f-xwx4l  1/1    Running  0         5m1s  172.18.0.4   chart-testing-worker3
test-namespace  echo-other-node-757cbff7b6-pq9bv     2/2    Running  0         5m2s  10.244.3.46  chart-testing-worker
test-namespace  echo-same-node-6cc6494564-sxngz      0/2    Pending  0         5m1s  <none>       <none>
Specifically:
- The `echo-other-node` pod is scheduled on `chart-testing-worker`, with a required anti-affinity targeting the `client` pod;
- The `client3` pod is scheduled on `chart-testing-control-plane`, with a required anti-affinity targeting the `client` pod.
This makes it impossible to schedule the `client` pod, as both ready nodes are forbidden by the anti-affinity rules:
0/4 nodes are available: 2 node(s) didn't match pod affinity/anti-affinity,
2 node(s) didn't satisfy existing pods anti-affinity rules, 2 node(s) had
taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
In turn, the `client2` and `echo-same-node` pods are also unschedulable, because of their required affinity targeting the `client` pod.
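To make the deadlock easier to see, here is a minimal Python sketch of the constraints described above. It is a simplified model and not the actual kube-scheduler logic: it assumes hostname-level topology, considers only the two ready nodes, and treats a required anti-affinity as binding in both directions (which is what the "didn't satisfy existing pods anti-affinity rules" part of the message reports). The `Pod` class and `feasible_nodes` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# The two ready nodes from the pod listing (the ones running Cilium).
READY_NODES = ["chart-testing-control-plane", "chart-testing-worker"]


@dataclass
class Pod:
    name: str
    node: Optional[str] = None                            # None means Pending
    anti_affinity: Set[str] = field(default_factory=set)  # pods to stay away from
    affinity: Set[str] = field(default_factory=set)       # pods to be co-located with


def feasible_nodes(pod: Pod, pods: List[Pod]) -> List[str]:
    """Return the ready nodes on which `pod` could be placed (simplified model)."""
    placed = [p for p in pods if p.node is not None and p.name != pod.name]
    result = []
    for node in READY_NODES:
        here = [p for p in placed if p.node == node]
        # Anti-affinity declared by a pod already on the node also excludes the newcomer.
        if any(pod.name in p.anti_affinity for p in here):
            continue
        # Anti-affinity declared by the newcomer itself.
        if any(p.name in pod.anti_affinity for p in here):
            continue
        # Required affinity: every target pod must already run on this node.
        if not all(any(p.name == target for p in here) for target in pod.affinity):
            continue
        result.append(node)
    return result


# State taken from the listing: client3 and echo-other-node are already running.
pods = [
    Pod("client"),
    Pod("client2", affinity={"client"}),
    Pod("client3", node="chart-testing-control-plane", anti_affinity={"client"}),
    Pod("echo-other-node", node="chart-testing-worker", anti_affinity={"client"}),
    Pod("echo-same-node", affinity={"client"}),
]
by_name = {p.name: p for p in pods}

for name in ("client", "client2", "echo-same-node"):
    print(name, feasible_nodes(by_name[name], pods))
```

In this model all three pending pods end up with an empty list of feasible nodes. Removing `client3` from `pods` leaves `chart-testing-control-plane` feasible for `client`, which is consistent with the failure only showing up once `client3` exists.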
The `client3` pod was recently introduced in https://github.com/cilium/cilium-cli/pull/2183.