
How to have "RemoveFailedPods" evict pods whose containers are in "Error" state while the pod's status phase is "Running"?

Open sudmis opened this issue 3 years ago • 10 comments

What version of descheduler are you using?

descheduler version: v0.23.1

What k8s version are you using (kubectl version)?

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

What did you do? We have a mongo pod whose pod.Status.Phase is "Running"; however, one of its containers has containerStatus.State "Terminated" with reason "Error".

NAMESPACE      NAME                       READY   STATUS   RESTARTS   AGE   IP        NODE                                     NOMINATED NODE   READINESS GATES
db-mongo-ram   db-mongo-ram-configsvr-2   0/2     Error    4          24h   X.X.X.X   cni-k8-node-13-io69sc-9gu39tp7t1m52511

sudmis avatar Jul 08 '22 05:07 sudmis

hi @sudmis, in your case you have a pod with more than one container, and you would like to configure the descheduler to evict the pod if one of its containers is not in the Running state, is that correct?

JaneLiuL avatar Jul 08 '22 06:07 JaneLiuL

Hi @JaneLiuL, here is my scenario:

  • I have one pod with 1 init container and 2 containers; the pod has Status: Running
  • The init container is in State: Terminated, Reason: Completed, Exit Code: 0
  • The first container (mongodb) is in State: Terminated, Reason: Error, Exit Code: 255
  • The second container (xyz) is in State: Terminated, Reason: Error, Exit Code: 255

So I want to evict this pod; once it is rescheduled/restarted, the issue resolves and the containers come back into the Running state.

Currently I see that the descheduler logic checks pod.Status.Phase == v1.PodFailed. However, in my case that is not the situation. Is there a way I can initiate eviction of this pod using the descheduler?

sudmis avatar Jul 08 '22 08:07 sudmis

thanks @sudmis, could you get the pod and show me the yaml (kubectl get pod xxx -o yaml)? the pod READY is 2/3, right? Right now the descheduler only checks pod.Status.Phase == v1.PodFailed and evicts the pod in the RemoveFailedPods strategy; so far we don't check the sidecar or initContainer conditions to evict the pod. what do you suggest for this improvement?
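(For reference, the strategy block being discussed looks roughly like this in the policy file; field names follow the upstream policy examples, and which of them are available depends on the descheduler version.)

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemoveFailedPods":
    enabled: true
    params:
      failedPods:
        # only pods whose status phase is Failed are matched; the reasons
        # below then filter on the recorded failure reason
        reasons:
        - "Error"
        includingInitContainers: true
        minPodLifetimeSeconds: 3600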

JaneLiuL avatar Jul 08 '22 11:07 JaneLiuL

thanks. I checked your log; the pod status is Running, so it will not be evicted by the RemoveFailedPods strategy. But it seems your pod keeps restarting, so if it meets the requirement it will be evicted by the RemovePodsHavingTooManyRestarts strategy.

But I will keep this issue open; maybe what we can discuss here is how the descheduler could evict a pod whose containers fail while the pod itself stays Running.
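
A minimal sketch of the kind of check that would be needed, assuming the k8s.io/api/core/v1 types (illustrative only, not the actual descheduler code; hasFailedContainer and isEvictionCandidate are hypothetical names):

package failedpods

import (
    v1 "k8s.io/api/core/v1"
)

// hasFailedContainer reports whether any container (including init containers)
// of the pod has terminated with a non-zero exit code, e.g. Reason: "Error".
func hasFailedContainer(pod *v1.Pod) bool {
    statuses := append([]v1.ContainerStatus{}, pod.Status.InitContainerStatuses...)
    statuses = append(statuses, pod.Status.ContainerStatuses...)
    for _, cs := range statuses {
        if t := cs.State.Terminated; t != nil && t.ExitCode != 0 {
            return true
        }
    }
    return false
}

// isEvictionCandidate keeps the existing phase check and, in addition,
// accepts Running pods that have failed containers.
func isEvictionCandidate(pod *v1.Pod) bool {
    return pod.Status.Phase == v1.PodFailed || hasFailedContainer(pod)
}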

JaneLiuL avatar Jul 08 '22 11:07 JaneLiuL

@JaneLiuL I do not have the kubectl get pod -o yaml output at this moment, but here are the kubectl describe pod details from when the mentioned behavior occurred. Hope it will help.

RemovePodsHavingTooManyRestarts may not help here, because in my case, once the pod is restarted all the containers are back in the Running state.

So what do you suggest as the next step for the discussion?

NAMESPACE             NAME                                                               READY   STATUS      RESTARTS   AGE     IP                NODE                                       NOMINATED NODE   READINESS GATES
fed-db-mongo-ram      fed-db-mongo-ram-configsvr-2                                       0/2     Error       4          24h     192.168.39.219    cni-k8-node-13-io69sc-9gu39tp7t1m52511     <none>           <none>
[root@cni-k8-master-11-io69sc-9gu39tp7t1m52511 node13]# cat fed-db-mongo-ram-configsvr-2_describe.log
Name:                 fed-db-mongo-ram-configsvr-2
Namespace:            fed-db-mongo-ram
Priority:             700000
Priority Class Name:  uc-priority-2
Node:                 cni-k8-node-13-io69sc-9gu39tp7t1m52511/10.72.10.13
Start Time:           Thu, 07 Apr 2022 10:31:45 -0400
Labels:               app.kubernetes.io/component=database
                      app.kubernetes.io/instance=fed-db-mongo-ram
                      app.kubernetes.io/managed-by=kubedb.com
                      app.kubernetes.io/name=mongodbs.kubedb.com
                      controller-revision-hash=fed-db-mongo-ram-configsvr-5b6d758d6b
                      mongodb.kubedb.com/node.config=fed-db-mongo-ram-configsvr
                      statefulset.kubernetes.io/pod-name=fed-db-mongo-ram-configsvr-2
Annotations:          cni.projectcalico.org/podIP: 192.168.39.219/32
                      cni.projectcalico.org/podIPs: 192.168.39.219/32,fdf7:74dc:298e:27e1:36e5:50ef:fe83:6dd4/128
                      k8s.v1.cni.cncf.io/network-status:
                        [{
                            "name": "",
                            "ips": [
                                "192.168.39.219",
                                "fdf7:74dc:298e:27e1:36e5:50ef:fe83:6dd4"
                            ],
                            "default": true,
                            "dns": {}
                        }]
                      k8s.v1.cni.cncf.io/networks-status:
                        [{
                            "name": "",
                            "ips": [
                                "192.168.39.219",
                                "fdf7:74dc:298e:27e1:36e5:50ef:fe83:6dd4"
                            ],
                            "default": true,
                            "dns": {}
                        }]
                      kubernetes.io/limit-ranger: LimitRanger plugin set: cpu, memory request for container exporter; cpu, memory limit for container exporter
                      kubernetes.io/psp: mongodb-db
Status:               Running
IP:                   192.168.39.219
IPs:
  IP:           192.168.39.219
  IP:           fdf7:74dc:298e:27e1:36e5:50ef:fe83:6dd4
Controlled By:  StatefulSet/fed-db-mongo-ram-configsvr
Init Containers:
  copy-config:
    Container ID:  docker://b0d09dca2b9eff5bc388e06bafa496d98c1b35a2cc50cffedb0bdb3f71c1bb6c
    Image:         cnreg-rel:5000/cna-db_percona_mongo-mongodb_init_4_1_5-rel_3_1_1:3.2.0-6
    Image ID:      docker-pullable://cnreg-rel:5000/cna-db_percona_mongo-mongodb_init_4_1_5-rel_3_1_1@sha256:7f2902a7dace37a84068331339c39053b58aabc440517f6eb8ebdb545c87c9c1
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c

                        echo "running install.sh"
                        /scripts/install.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 08 Apr 2022 10:08:34 -0400
      Finished:     Fri, 08 Apr 2022 10:08:34 -0400
    Ready:          True
    Restart Count:  3
    Limits:
      cpu:     16
      memory:  8Gi
    Requests:
      cpu:     500m
      memory:  10Mi
    Environment:
      SSL_MODE:  disabled
    Mounts:
      /configdb-readonly from configdir (rw)
      /data/configdb from config (rw)
      /init-scripts from init-scripts (rw)
      /keydir-readonly from keydir (rw)
      /var/run/mongodb/tls from certdir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9sv4d (ro)
Containers:
  mongodb:
    Container ID:  docker://efddbc195277ba8f0c7d8ffe4deb2ee0e57371117a0e8adabd7c89beaf4a8509
    Image:         cnreg-rel:5000/cna-db_percona_mongo-percona_server_mongodb:latest
    Image ID:      docker-pullable://cnreg-rel:5000/cna-db_percona_mongo-percona_server_mongodb@sha256:49a79b20cb9ae2707324ff88878f0d08747821c9448144baf889f1e80b6ccc72
    Port:          27017/TCP
    Host Port:     0/TCP
    Command:
      mongod
    Args:
      --dbpath=/data/db
      --auth
      --bind_ip_all
      --port=27017
      --configsvr
      --replSet=cnfRepSet
      --clusterAuthMode=keyFile
      --keyFile=/data/configdb/key.txt
      --ipv6
      --sslMode=disabled
      --transitionToAuth
      --listenBacklog=4096
      --config=/data/configdb/mongod.conf
    State:          Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Thu, 07 Apr 2022 19:15:15 -0400
      Finished:     Fri, 08 Apr 2022 10:07:55 -0400
    Ready:          False
    Restart Count:  2
    Limits:
      cpu:     16
      memory:  8Gi
    Requests:
      cpu:     500m
      memory:  10Mi
    Liveness:  exec [bash -c set -x; if [[ $(mongo admin --host=localhost  --username=$MONGO_INITDB_ROOT_USERNAME --password=$MONGO_INITDB_ROOT_PASSWORD --authenticationDatabase=admin --quiet --eval "db.adminCommand('ping').ok" ) -eq "1" ]]; then
          exit 0
        fi
        exit 1] delay=0s timeout=5s period=10s #success=1 #failure=3
    Readiness:  exec [bash -c set -x; if [[ $(mongo admin --host=localhost  --username=$MONGO_INITDB_ROOT_USERNAME --password=$MONGO_INITDB_ROOT_PASSWORD --authenticationDatabase=admin --quiet --eval "db.adminCommand('ping').ok" ) -eq "1" ]]; then
          exit 0
        fi
        exit 1] delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:               fed-db-mongo-ram (v1:metadata.namespace)
      REPLICA_SET:                 cnfRepSet
      AUTH:                        true
      SSL_MODE:                    disabled
      CLUSTER_AUTH_MODE:           keyFile
      MONGO_INITDB_ROOT_USERNAME:  <set to the key XXX in secret XXX >  Optional: false
      MONGO_INITDB_ROOT_PASSWORD:  <set to the key XXX in secret XXX>  Optional: false
      POD_NAME:                    fed-db-mongo-ram-configsvr-2 (v1:metadata.name)
    Mounts:
      /data/configdb from config (rw)
      /data/db from datadir (rw)
      /init-scripts from init-scripts (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9sv4d (ro)
      /work-dir from workdir (rw)
  exporter:
    Container ID:  docker://489118f9e48f362235ffded4e6149d304b98f4a72227516b623b9430e2f32a0a
    Image:         cnreg-rel:5000/cna-db_percona_mongo-mongodb_exporter-rel_3_1_1:3.2.0-6
    Image ID:      docker-pullable://cnreg-rel:5000/cna-db_percona_mongo-mongodb_exporter-rel_3_1_1@sha256:37ab6bbcbe5025518f94a8545e8fff198f7d65ffd95a059e028afccf0bf68535
    Port:          56790/TCP
    Host Port:     0/TCP
    Args:
      --mongodb.uri=mongodb://$(MONGO_INITDB_ROOT_USERNAME):$(MONGO_INITDB_ROOT_PASSWORD)@localhost:27017/admin
      --web.listen-address=:56790
      --web.telemetry-path=/metrics
    State:          Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Thu, 07 Apr 2022 19:16:00 -0400
      Finished:     Fri, 08 Apr 2022 10:07:55 -0400
    Ready:          False
    Restart Count:  2
    Limits:
      cpu:     2
      memory:  2Gi
    Requests:
      cpu:     10m
      memory:  10Mi
    Environment:
      MONGO_INITDB_ROOT_USERNAME:  <set to the XXX key in secret XXX >  Optional: false
      MONGO_INITDB_ROOT_PASSWORD:  <set to the key XXX in secret XXX >  Optional: false
      POD_NAME:                    fed-db-mongo-ram-configsvr-2 (v1:metadata.name)
    Mounts:
      /var/run/mongodb/tls from certdir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9sv4d (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  workdir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  init-scripts:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  certdir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  keydir:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fed-db-mongo-ram-key
    Optional:    false
  config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  datadir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  1Gi
  configdir:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  configsvr-config
    Optional:    false
  kube-api-access-9sv4d:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 170s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 170s
Events:
  Type     Reason          Age                From             Message
  ----     ------          ----               ----             -------
  Warning  NodeNotReady    39m (x3 over 16h)  node-controller  Node is not ready
  Normal   SandboxChanged  38m                kubelet          Pod sandbox changed, it will be killed and re-created.
  Normal   AddedInterface  38m                multus           Add eth0 [192.168.39.219/32 fdf7:74dc:298e:27e1:36e5:50ef:fe83:6dd4/128]
  Normal   Pulled          38m                kubelet          Container image "cnreg-rel:5000/cna-db_percona_mongo-mongodb_init_4_1_5-rel_3_1_1:3.2.0-6" already present on machine
  Normal   Created         38m                kubelet          Created container copy-config
  Normal   Started         38m                kubelet          Started container copy-config
  Normal   Pulled          38m                kubelet          Container image "cnreg-rel:5000/cna-db_percona_mongo-percona_server_mongodb:latest" already present on machine
  Normal   Created         38m                kubelet          Created container mongodb
  Normal   Started         38m                kubelet          Started container mongodb
[root@cni-k8-master-11-io69sc-9gu39tp7t1m52511 node13]#

sudmis avatar Jul 08 '22 11:07 sudmis

@sudmis could you show me your descheduler policy config file? it should have been evicted by the RemovePodsHavingTooManyRestarts strategy since your Restart Count is so high.

JaneLiuL avatar Jul 08 '22 11:07 JaneLiuL

@JaneLiuL we have the RemovePodsHavingTooManyRestarts strategy with podRestartThreshold: 20. In the logs you can see the restart count is 4. Do you still think that count is so high?

sudmis avatar Jul 08 '22 11:07 sudmis

Yes. If podRestartThreshold is 20, then your pod has to wait until its restart count is equal to or greater than 20.
If the containers fail, k8s keeps restarting the pod until everything is ready, so eventually it will trigger RemovePodsHavingTooManyRestarts.
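
For reference, a policy snippet of roughly this shape (field names as in the v1alpha1 policy examples; the threshold is just the value from your setup):

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "RemovePodsHavingTooManyRestarts":
    enabled: true
    params:
      podsHavingTooManyRestarts:
        # the pod is evicted once the summed container restart count reaches this value
        podRestartThreshold: 20
        # also count init container restarts
        includingInitContainers: true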

JaneLiuL avatar Jul 08 '22 11:07 JaneLiuL

@JaneLiuL If we set podRestartThreshold as low as 4, then since we have a few pods with more than 4 containers, even a handful of ordinary restarts would lead to undesired evictions. Do you have any other suggestion?

sudmis avatar Jul 11 '22 07:07 sudmis

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 09 '22 08:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Nov 08 '22 09:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Dec 08 '22 09:12 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Dec 08 '22 09:12 k8s-ci-robot