LowNodeUtilization doesn't check nodeSelector/nodeAffinity when choosing pods to evict
What version of descheduler are you using?
descheduler version: 0.24.1
Does this issue reproduce with the latest release?
Yes.
Which descheduler CLI options are you using?
- --policy-config-file
- /policy-dir/policy.yaml
- --v
- "3"
Please provide a copy of your descheduler policy config file
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
evictLocalStoragePods: true
ignorePvcPods: true
maxNoOfPodsToEvictPerNamespace: 1
maxNoOfPodsToEvictPerNode: 1
strategies:
  LowNodeUtilization:
    enabled: true
    params:
      nodeFit: true
      nodeResourceUtilizationThresholds:
        targetThresholds:
          cpu: 50
          memory: 50
          pods: 50
        thresholds:
          cpu: 20
          memory: 20
          pods: 20
What k8s version are you using (kubectl version)?
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.14", GitCommit:"57a3aa3f13699cf3db9c52d228c18db94fa81876", GitTreeState:"clean", BuildDate:"2021-12-15T14:52:33Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.14", GitCommit:"57a3aa3f13699cf3db9c52d228c18db94fa81876", GitTreeState:"clean", BuildDate:"2021-12-15T14:47:10Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
What did you do?
I have 3 nodes in the cluster (node1 and node2 are overutilized, node3 is underutilized):
NAME LABELS
node1 role=worker
node2 role=worker
node3 role=worker2
All pods (except those created by DaemonSets) have a nodeSelector with role=worker or role=worker2.
Then I ran the descheduler with the config above.
What did you expect to see?
The descheduler does nothing (pods from node1 and node2 don't fit on node3 because of their nodeSelector).
What did you see instead?
The descheduler evicts pods from node1 and node2 on every run.
@seleznev thanks for this, could you share the logs from the descheduler showing this eviction? Ideally at v=4 log level. That should give us an idea about why it's doing these evictions.
I tried to clean up the cluster, but there's still a lot of noise in the logs, sorry. :( Also, I removed node4 from the description to match the logs below.
--v=4 --dry-run
I0714 15:14:14.519845 49426 named_certificates.go:53] "Loaded SNI cert" index=0 certName="self-signed loopback" certDetail="\"apiserver-loopback-client@1657800854\" [serving] validServingFor=[apiserver-loopback-client] issuer=\"apiserver-loopback-client-ca@1657800854\" (2022-07-14 11:14:14 +0000 UTC to 2023-07-14 11:14:14 +0000 UTC (now=2022-07-14 12:14:14.519807269 +0000 UTC))" I0714 15:14:14.519913 49426 secure_serving.go:210] Serving securely on [::]:10258 I0714 15:14:14.519953 49426 tlsconfig.go:240] "Starting DynamicServingCertificateController" I0714 15:14:14.972570 49426 reflector.go:219] Starting reflector *v1.Namespace (0s) from k8s.io/client-go/informers/factory.go:134 I0714 15:14:14.972580 49426 reflector.go:219] Starting reflector *v1.PriorityClass (0s) from k8s.io/client-go/informers/factory.go:134 I0714 15:14:14.972585 49426 reflector.go:219] Starting reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134 I0714 15:14:14.972593 49426 reflector.go:255] Listing and watching *v1.PriorityClass from k8s.io/client-go/informers/factory.go:134 I0714 15:14:14.972597 49426 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134 I0714 15:14:14.972589 49426 reflector.go:255] Listing and watching *v1.Namespace from k8s.io/client-go/informers/factory.go:134 I0714 15:14:15.072905 49426 shared_informer.go:285] caches populated I0714 15:14:15.072936 49426 shared_informer.go:285] caches populated I0714 15:14:15.473308 49426 shared_informer.go:285] caches populated I0714 15:14:15.473471 49426 node.go:49] "Node lister returned empty list, now fetch directly" I0714 15:14:15.569834 49426 descheduler.go:253] Building a cached client from the cluster for the dry run I0714 15:14:15.569875 49426 descheduler.go:120] Pulling resources for the cached client from the cluster I0714 15:14:15.586582 49426 reflector.go:219] Starting reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134 I0714 15:14:15.586597 49426 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134 I0714 15:14:15.686657 49426 shared_informer.go:285] caches populated I0714 15:14:15.686697 49426 descheduler.go:278] Building a pod evictor I0714 15:14:15.687101 49426 nodeutilization.go:224] "Node is overutilized" node="node1" usage=map[cpu:1619m memory:6520041673 pods:22] usagePercentage=map[cpu:40.576441102756895 memory:86.49085210728913 pods:36.666666666666664] I0714 15:14:15.687143 49426 nodeutilization.go:224] "Node is overutilized" node="node2" usage=map[cpu:1483m memory:5123585820 pods:28] usagePercentage=map[cpu:37.16791979949875 memory:67.96632991652716 pods:46.666666666666664] I0714 15:14:15.687171 49426 nodeutilization.go:221] "Node is underutilized" node="node3" usage=map[cpu:251m memory:406994944 pods:3] usagePercentage=map[cpu:6.290726817042606 memory:5.053364510004751 pods:2.727272727272727] I0714 15:14:15.687196 49426 lownodeutilization.go:118] "Criteria for a node under utilization" CPU=20 Mem=20 Pods=20 I0714 15:14:15.687212 49426 lownodeutilization.go:119] "Number of underutilized nodes" totalNumber=1 I0714 15:14:15.687231 49426 lownodeutilization.go:132] "Criteria for a node above target utilization" CPU=50 Mem=50 Pods=50 I0714 15:14:15.687246 49426 lownodeutilization.go:133] "Number of overutilized nodes" totalNumber=2 I0714 15:14:15.687280 49426 nodeutilization.go:277] "Total capacity to be moved" CPU=1744 Mem=3619975040 Pods=52 I0714 15:14:15.687300 49426 nodeutilization.go:280] "Evicting pods from node" node="node1" 
usage=map[cpu:1619m memory:6520041673 pods:22] I0714 15:14:15.687482 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-node-nb5mz" node="node2" I0714 15:14:15.687504 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.687542 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-node-nb5mz" node="node3" I0714 15:14:15.687561 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.687598 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/topolvm-node-nb5mz" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.687752 49426 node.go:145] "Pod fits on node" pod="kube-system/vpa-manager-8668966dc5-pz226" node="node2" I0714 15:14:15.687788 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/vpa-manager-8668966dc5-pz226" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.687911 49426 node.go:145] "Pod fits on node" pod="kube-system/calico-typha-5458d7dc9b-d8dk2" node="node2" I0714 15:14:15.687956 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/calico-typha-5458d7dc9b-d8dk2" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.688090 49426 node.go:145] "Pod fits on node" pod="kube-system/cluster-autoscaler-74f6bc9c5-zwm87" node="node2" I0714 15:14:15.688118 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/cluster-autoscaler-74f6bc9c5-zwm87" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.688229 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-thanos-query-7bc94f947b-n28rj" node="node2" I0714 15:14:15.688258 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-thanos-query-7bc94f947b-n28rj" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.688370 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-lvmd-0-dxl4c" node="node2" I0714 15:14:15.688387 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.688418 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-lvmd-0-dxl4c" node="node3" I0714 15:14:15.688432 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.688461 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/topolvm-lvmd-0-dxl4c" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.688583 49426 node.go:148] "Pod does not fit on node" pod="kube-system/unbound-th87x" node="node2" I0714 15:14:15.688601 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.688643 49426 node.go:148] "Pod does not fit on node" pod="kube-system/unbound-th87x" 
node="node3" I0714 15:14:15.688696 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.688728 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/unbound-th87x" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.688840 49426 node.go:145] "Pod fits on node" pod="io/kichay-deploy-test-5565f7b678-2cx4j" node="node2" I0714 15:14:15.688963 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-resource-redis-5998d76c7-fzlpn" node="node2" I0714 15:14:15.688992 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-resource-redis-5998d76c7-fzlpn" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.689131 49426 node.go:145] "Pod fits on node" pod="kube-system/vpa-recommender-8554864b8d-lhkkt" node="node2" I0714 15:14:15.689164 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/vpa-recommender-8554864b8d-lhkkt" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.689285 49426 node.go:145] "Pod fits on node" pod="kube-system/vpa-updater-6b59d8b6df-vvwsp" node="node2" I0714 15:14:15.689312 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/vpa-updater-6b59d8b6df-vvwsp" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.689422 49426 node.go:145] "Pod fits on node" pod="gitlab/makisu-redis-74c597947f-mzb26" node="node2" I0714 15:14:15.689446 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="gitlab/makisu-redis-74c597947f-mzb26" checks="pod has a PVC and descheduler is configured to ignore PVC pods" I0714 15:14:15.689561 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-prometheus-perf-b65c94486-qfmdj" node="node2" I0714 15:14:15.689592 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-prometheus-perf-b65c94486-qfmdj" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.689700 49426 node.go:148] "Pod does not fit on node" pod="kube-system/calico-node-qbcw2" node="node2" I0714 15:14:15.689717 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.689753 49426 node.go:148] "Pod does not fit on node" pod="kube-system/calico-node-qbcw2" node="node3" I0714 15:14:15.689768 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.689805 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/calico-node-qbcw2" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.689961 49426 node.go:145] "Pod fits on node" pod="kube-system/csi-rbdplugin-provisioner-8cb6c6b99-5568b" node="node2" I0714 15:14:15.690072 49426 node.go:145] "Pod fits on node" 
pod="kube-system/kube-blackbox-9f7dcb4f-zhzhr" node="node2" I0714 15:14:15.690101 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-blackbox-9f7dcb4f-zhzhr" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.690220 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-prometheus-cluster-65c76558c4-4jrlp" node="node2" I0714 15:14:15.690247 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-prometheus-cluster-65c76558c4-4jrlp" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.690404 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-prometheus-spilo-5c6899f8d6-qnzq2" node="node2" I0714 15:14:15.690436 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-prometheus-spilo-5c6899f8d6-qnzq2" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.690575 49426 node.go:148] "Pod does not fit on node" pod="kube-system/kube-proxy-rsps7" node="node2" I0714 15:14:15.690595 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.690636 49426 node.go:148] "Pod does not fit on node" pod="kube-system/kube-proxy-rsps7" node="node3" I0714 15:14:15.690654 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.690689 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-proxy-rsps7" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.690821 49426 node.go:145] "Pod fits on node" pod="io/cassandra-0" node="node2" I0714 15:14:15.690851 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="io/cassandra-0" checks="pod has a PVC and descheduler is configured to ignore PVC pods" I0714 15:14:15.690980 49426 node.go:148] "Pod does not fit on node" pod="io-logging/elasticsearch-logs-data-1" node="node2" I0714 15:14:15.691000 49426 node.go:150] "insufficient topolvm.cybozu.com/capacity" I0714 15:14:15.691047 49426 node.go:148] "Pod does not fit on node" pod="io-logging/elasticsearch-logs-data-1" node="node3" I0714 15:14:15.691065 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.691083 49426 node.go:150] "insufficient topolvm.cybozu.com/capacity" I0714 15:14:15.691117 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="io-logging/elasticsearch-logs-data-1" checks="[pod has a PVC and descheduler is configured to ignore PVC pods, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.691243 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-state-metrics-555d9dccc-t5ddm" node="node2" I0714 15:14:15.691279 49426 nodeutilization.go:283] "Pods on node" node="node1" allPods=22 nonRemovablePods=19 removablePods=3 I0714 15:14:15.691304 49426 nodeutilization.go:290] "Evicting pods based on priority, if they have same priority, they'll be evicted based on QoS tiers" I0714 15:14:15.691491 49426 evictions.go:161] "Evicted 
pod in dry run mode" pod="io/kichay-deploy-test-5565f7b678-2cx4j" reason="LowNodeUtilization" strategy="LowNodeUtilization" node="node1" I0714 15:14:15.691515 49426 nodeutilization.go:323] "Evicted pods" pod="io/kichay-deploy-test-5565f7b678-2cx4j" err=I0714 15:14:15.691543 49426 nodeutilization.go:348] "Updated node usage" node="node1" CPU=1609 Mem=6503264457 Pods=21 E0714 15:14:15.691599 49426 nodeutilization.go:318] "Error evicting pod" err="Maximum number 1 of evicted pods per \"node1\" node reached" pod="kube-system/csi-rbdplugin-provisioner-8cb6c6b99-5568b" I0714 15:14:15.691642 49426 nodeutilization.go:294] "Evicted pods from node" node="node1" evictedPods=1 usage=map[cpu:1609m memory:6503264457 pods:21] I0714 15:14:15.691670 49426 nodeutilization.go:280] "Evicting pods from node" node="node2" usage=map[cpu:1483m memory:5123585820 pods:28] I0714 15:14:15.691793 49426 node.go:148] "Pod does not fit on node" pod="kube-system/calico-node-49tnp" node="node1" I0714 15:14:15.691813 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.691861 49426 node.go:148] "Pod does not fit on node" pod="kube-system/calico-node-49tnp" node="node3" I0714 15:14:15.691882 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.691928 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/calico-node-49tnp" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.692045 49426 node.go:145] "Pod fits on node" pod="kube-system/cluster-autoscaler-74f6bc9c5-gzlhp" node="node1" I0714 15:14:15.692064 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/cluster-autoscaler-74f6bc9c5-gzlhp" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.692158 49426 node.go:145] "Pod fits on node" pod="kube-system/csi-rbdplugin-provisioner-8cb6c6b99-pk8rf" node="node1" I0714 15:14:15.692220 49426 node.go:145] "Pod fits on node" pod="io/load-generator-547dd97745-zlmm2" node="node1" I0714 15:14:15.692290 49426 node.go:145] "Pod fits on node" pod="io-vm/alertmanager-0" node="node1" I0714 15:14:15.692318 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="io-vm/alertmanager-0" checks="pod has a PVC and descheduler is configured to ignore PVC pods" I0714 15:14:15.692399 49426 node.go:145] "Pod fits on node" pod="kube-system/eventrouter-57b9b4cd47-mwq5n" node="node1" I0714 15:14:15.692462 49426 node.go:148] "Pod does not fit on node" pod="kube-system/kube-proxy-rpqw7" node="node1" I0714 15:14:15.692472 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.692493 49426 node.go:148] "Pod does not fit on node" pod="kube-system/kube-proxy-rpqw7" node="node3" I0714 15:14:15.692502 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.692528 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-proxy-rpqw7" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 
15:14:15.692582 49426 node.go:145] "Pod fits on node" pod="io/seleznev-test-job-1657800600-w5dz8" node="node1" I0714 15:14:15.692645 49426 node.go:145] "Pod fits on node" pod="io/status-prometheus-6876b6f97b-vbss2" node="node1" I0714 15:14:15.692808 49426 node.go:145] "Pod fits on node" pod="kube-system/prometheus-adapter-657d784c89-snvr9" node="node1" I0714 15:14:15.692843 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/prometheus-adapter-657d784c89-snvr9" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.692906 49426 node.go:148] "Pod does not fit on node" pod="kube-system/unbound-cdvzn" node="node1" I0714 15:14:15.692927 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.692963 49426 node.go:148] "Pod does not fit on node" pod="kube-system/unbound-cdvzn" node="node3" I0714 15:14:15.692977 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.693002 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/unbound-cdvzn" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.693117 49426 node.go:145] "Pod fits on node" pod="kube-system/vpa-exporter-59bfd6d49c-rvjx6" node="node1" I0714 15:14:15.693148 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/vpa-exporter-59bfd6d49c-rvjx6" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.693243 49426 node.go:145] "Pod fits on node" pod="io/test-postgres-db12-pooler-78c564589d-lrww8" node="node1" I0714 15:14:15.693324 49426 node.go:145] "Pod fits on node" pod="io-vm/vmagent-kafka-1" node="node1" I0714 15:14:15.693398 49426 node.go:145] "Pod fits on node" pod="kube-system/cert-manager-legacy-controller-666fbf7899-zp8dr" node="node1" I0714 15:14:15.693420 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/cert-manager-legacy-controller-666fbf7899-zp8dr" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.693477 49426 node.go:145] "Pod fits on node" pod="sonobuoy-2gis/sonobuoy" node="node1" I0714 15:14:15.693497 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="sonobuoy-2gis/sonobuoy" checks="pod does not have any ownerRefs" I0714 15:14:15.693555 49426 node.go:145] "Pod fits on node" pod="io/test-postgres-db12-0" node="node1" I0714 15:14:15.693574 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="io/test-postgres-db12-0" checks="pod has a PVC and descheduler is configured to ignore PVC pods" I0714 15:14:15.693631 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-lvmd-0-mw97j" node="node1" I0714 15:14:15.693644 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.693667 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-lvmd-0-mw97j" node="node3" I0714 15:14:15.693679 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.693700 49426 evictions.go:348] "Pod lacks an 
eviction annotation and fails the following checks" pod="kube-system/topolvm-lvmd-0-mw97j" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.693758 49426 node.go:145] "Pod fits on node" pod="io/dekhtyarev-nginx-7945ff7886-mls68" node="node1" I0714 15:14:15.693817 49426 node.go:145] "Pod fits on node" pod="io-logging/redis-logs-1" node="node1" I0714 15:14:15.693881 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-prometheus-cadvisor-6b9c9dbc54-plwcb" node="node1" I0714 15:14:15.693905 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-prometheus-cadvisor-6b9c9dbc54-plwcb" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.693963 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-prometheus-apps-5f768b48f4-xxbbl" node="node1" I0714 15:14:15.693984 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-prometheus-apps-5f768b48f4-xxbbl" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.694043 49426 node.go:145] "Pod fits on node" pod="kube-system/kube-prometheus-nodes-868cf59454-brsvd" node="node1" I0714 15:14:15.694064 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/kube-prometheus-nodes-868cf59454-brsvd" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.694119 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-node-ckg5g" node="node1" I0714 15:14:15.694132 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.694155 49426 node.go:148] "Pod does not fit on node" pod="kube-system/topolvm-node-ckg5g" node="node3" I0714 15:14:15.694168 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.694188 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/topolvm-node-ckg5g" checks="[pod is a DaemonSet pod, pod has system critical priority, pod has higher priority than specified priority class threshold, pod does not fit on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.694268 49426 node.go:145] "Pod fits on node" pod="io/io-grafana-staging-0" node="node1" I0714 15:14:15.694295 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="io/io-grafana-staging-0" checks="pod has a PVC and descheduler is configured to ignore PVC pods" I0714 15:14:15.694348 49426 node.go:148] "Pod does not fit on node" pod="io-logging/elasticsearch-logs-master-0" node="node1" I0714 15:14:15.694357 49426 node.go:150] "insufficient memory" I0714 15:14:15.694374 49426 node.go:148] "Pod does not fit on node" pod="io-logging/elasticsearch-logs-master-0" node="node3" I0714 15:14:15.694383 49426 node.go:150] "pod node selector does not match the node label" I0714 15:14:15.694398 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="io-logging/elasticsearch-logs-master-0" checks="[pod has a PVC and descheduler is configured to ignore PVC pods, pod does not fit 
on any other node because of nodeSelector(s), Taint(s), or nodes marked as unschedulable]" I0714 15:14:15.694447 49426 node.go:145] "Pod fits on node" pod="kube-system/calico-kube-controllers-7f74bfffbf-p8s67" node="node1" I0714 15:14:15.694462 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/calico-kube-controllers-7f74bfffbf-p8s67" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.694513 49426 node.go:145] "Pod fits on node" pod="kube-system/metrics-server-74bd9c78f5-dbmln" node="node1" I0714 15:14:15.694528 49426 evictions.go:348] "Pod lacks an eviction annotation and fails the following checks" pod="kube-system/metrics-server-74bd9c78f5-dbmln" checks="[pod has system critical priority, pod has higher priority than specified priority class threshold]" I0714 15:14:15.694540 49426 nodeutilization.go:283] "Pods on node" node="node2" allPods=28 nonRemovablePods=19 removablePods=9 I0714 15:14:15.694551 49426 nodeutilization.go:290] "Evicting pods based on priority, if they have same priority, they'll be evicted based on QoS tiers" I0714 15:14:15.694680 49426 evictions.go:161] "Evicted pod in dry run mode" pod="kube-system/csi-rbdplugin-provisioner-8cb6c6b99-pk8rf" reason="LowNodeUtilization" strategy="LowNodeUtilization" node="node2" I0714 15:14:15.694691 49426 nodeutilization.go:323] "Evicted pods" pod="kube-system/csi-rbdplugin-provisioner-8cb6c6b99-pk8rf" err= I0714 15:14:15.694705 49426 nodeutilization.go:348] "Updated node usage" node="node2" CPU=1423 Mem=4922259228 Pods=27 E0714 15:14:15.694723 49426 nodeutilization.go:318] "Error evicting pod" err="Maximum number 1 of evicted pods per \"node2\" node reached" pod="io/load-generator-547dd97745-zlmm2" I0714 15:14:15.694741 49426 nodeutilization.go:294] "Evicted pods from node" node="node2" evictedPods=1 usage=map[cpu:1423m memory:4922259228 pods:27] I0714 15:14:15.694754 49426 lownodeutilization.go:184] "Total number of pods evicted" evictedPods=2 I0714 15:14:15.694766 49426 descheduler.go:304] "Number of evicted pods" totalEvicted=2 I0714 15:14:15.694850 49426 tlsconfig.go:255] "Shutting down DynamicServingCertificateController" I0714 15:14:15.694864 49426 watch.go:183] Stopping fake watcher. I0714 15:14:15.694889 49426 reflector.go:225] Stopping reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134 I0714 15:14:15.694897 49426 reflector.go:225] Stopping reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134 I0714 15:14:15.694936 49426 reflector.go:225] Stopping reflector *v1.Namespace (0s) from k8s.io/client-go/informers/factory.go:134
For me it looks like nodeFit checks whether a pod can be scheduled on any other node (not only on the underutilized nodes). So the descheduler evicts pods from one overutilized node to another overutilized node over and over again.
When checking if the pod will fit on any of the given nodes at https://github.com/kubernetes-sigs/descheduler/blob/master/pkg/descheduler/evictions/evictions.go#L316, the nodes var should contain only the destination nodes, don't you think?
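To make the observation and the suggestion above concrete, here is a minimal, self-contained Go sketch. It is not the descheduler's actual code; node, pod, selectorMatches, and fitsAnyNode are made-up names for illustration only. It shows that a nodeFit-style check run against every other node reports the pod as evictable (it fits on the other overutilized worker node), while the same check restricted to the underutilized destination nodes does not.

package main

import "fmt"

// Simplified stand-ins for v1.Node and v1.Pod; only labels and nodeSelector matter here.
type node struct {
    name   string
    labels map[string]string
}

type pod struct {
    name         string
    nodeSelector map[string]string
}

// selectorMatches reports whether every key/value of the pod's nodeSelector
// is present in the node's labels (a rough approximation of the real rule).
func selectorMatches(p pod, n node) bool {
    for k, v := range p.nodeSelector {
        if n.labels[k] != v {
            return false
        }
    }
    return true
}

// fitsAnyNode mimics a nodeFit-style check run against an arbitrary candidate list.
func fitsAnyNode(p pod, candidates []node) bool {
    for _, n := range candidates {
        if selectorMatches(p, n) {
            return true
        }
    }
    return false
}

func main() {
    node2 := node{name: "node2", labels: map[string]string{"role": "worker"}}  // overutilized
    node3 := node{name: "node3", labels: map[string]string{"role": "worker2"}} // underutilized
    p := pod{name: "app-on-node1", nodeSelector: map[string]string{"role": "worker"}}

    // Checked against all other nodes, the pod "fits" (on node2), so it is treated
    // as evictable even though node3 is the only underutilized node.
    fmt.Println(fitsAnyNode(p, []node{node2, node3})) // true

    // Restricted to the underutilized destination nodes only, it does not fit,
    // which is the behavior suggested above.
    fmt.Println(fitsAnyNode(p, []node{node3})) // false
}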
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
We are still running into this issue from time to time (idle nodes with a different nodeSelector result in continuous pod eviction across the whole cluster).
/remove-lifecycle rotten
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.