
[bitnami/etcd] etcd pods are unable to join existing cluster on node drain

Open abhayycs opened this issue 1 year ago • 39 comments

Name and Version

bitnami/etcd-3.5.8

What architecture are you using?

None

What steps will reproduce the bug?

I'm using a 3-node Kubernetes cluster with 3 etcd instances.

When I delete a pod, it is able to restart. When I only drain a node, the rescheduled pod is unable to rejoin the cluster and fails to start.

Observations:

  1. ETCD_INITIAL_CLUSTER_STATE is 'new' when the cluster starts from scratch (first boot).
  2. CASE-1: When a pod is deleted, ETCD_INITIAL_CLUSTER_STATE changes from 'new' to 'existing', and the pod is able to start.
  3. CASE-2: When a node is drained, ETCD_INITIAL_CLUSTER_STATE stays 'new', and the newly created pod is unable to join the cluster and fails to start.
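For context, the behavior above matches how the Bitnami startup logic generally decides the cluster state: it rejoins as 'existing' only if a member id was persisted from a previous run. The sketch below is a hypothetical simplification of that decision (the function name and the `member_id` file path are assumptions for illustration; the real logic lives in the chart's setup scripts):

```shell
#!/bin/sh
# Hypothetical sketch: decide ETCD_INITIAL_CLUSTER_STATE based on whether
# a member id file survived from a previous run of this member.
DATA_DIR="$(mktemp -d)"

decide_cluster_state() {
  # If a member id was persisted (the pod joined the cluster before),
  # rejoin as 'existing'; otherwise bootstrap as 'new'.
  if [ -f "$1/member_id" ]; then
    echo "existing"
  else
    echo "new"
  fi
}

echo "first start: ETCD_INITIAL_CLUSTER_STATE=$(decide_cluster_state "$DATA_DIR")"
touch "$DATA_DIR/member_id"   # simulate a pod that already joined once
echo "after join:  ETCD_INITIAL_CLUSTER_STATE=$(decide_cluster_state "$DATA_DIR")"
rm -rf "$DATA_DIR"
```

If the drained pod lands on a node where that persisted state is missing (or the volume is empty), this kind of check would yield 'new' again, which is consistent with CASE-2 above.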

Are you using any custom parameters or values?

I tried with and without persistence.

What is the expected behavior?

The pod should start after a node drain. As per my understanding, ETCD_INITIAL_CLUSTER_STATE should change to 'existing' on node drain as well.

What do you see instead?

The etcd pod does not start after a node drain.

Additional information

Please let me know whether this behavior is expected, and how I can prevent the pod restart failure on node drain.

I'm not sure if this will help:

  1. I have tried this on different clusters (RHEL- and Ubuntu-based).
  2. Other services (Kafka, ZooKeeper) are working fine; the network is fine.

abhayycs avatar Apr 14 '23 17:04 abhayycs