[bitnami/etcd] Etcd upgrade issue
Name and Version
bitnami/etcd
What architecture are you using?
amd64
What steps will reproduce the bug?
I have a 3-node K8s cluster where I'm installing an etcd cluster with persistence enabled. I have found that sometimes, when upgrading the cluster with new etcd changes, one of the instances goes into a CrashLoopBackOff state.
NAMESPACE   NAME                           READY   STATUS             RESTARTS          AGE
voltha      voltha-etcd-cluster-client-0   1/1     Running            0                 14h
voltha      voltha-etcd-cluster-client-1   1/1     Running            0                 14h
voltha      voltha-etcd-cluster-client-2   0/1     CrashLoopBackOff   173 (4m31s ago)   14h
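For context, the install/upgrade flow is roughly the following (a sketch only; the release name, chart values, and flags are approximations of my setup, not the exact commands):
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install voltha-etcd-cluster bitnami/etcd -n voltha \
    --set replicaCount=3 \
    --set persistence.enabled=true
$ # later, rolling out new etcd changes
$ helm upgrade voltha-etcd-cluster bitnami/etcd -n voltha \
    --set replicaCount=3 \
    --set persistence.enabled=true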
Are you using any custom parameters or values?
persistence is enabled
What is the expected behavior?
No response
What do you see instead?
Each etcd instance is associated with a PersistentVolumeClaim and a PersistentVolume. So, to recover from this state, I have to delete the PV associated with the voltha-etcd-cluster-client-2 instance and restart the voltha-etcd-cluster-client-2 pod.
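Roughly, that recovery corresponds to something like this (a sketch only; the PVC name assumes the usual data-<pod-name> convention and should be verified first, and in practice the bound PVC has to be removed before the PV can go away):
$ kubectl get pvc -n voltha                                         # find the claim backing client-2 and its bound PV
$ kubectl delete pvc data-voltha-etcd-cluster-client-2 -n voltha    # PVC name assumed; check the listing above
$ kubectl delete pv <bound-pv-name>                                 # only needed if the reclaim policy is Retain
$ kubectl delete pod voltha-etcd-cluster-client-2 -n voltha         # the StatefulSet recreates the pod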
My question is: is it okay to delete the PersistentVolume, with the surety that no data is lost and the data stays up to date? I don't want to end up in a situation where I lose data or the data is not up to date.
Any help would be greatly appreciated.
Additional information
No response
Hi!
What error appears when it runs into that CrashLoopBackOff state? Do the logs show anything meaningful?
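For example, something along these lines (pod and namespace names taken from your listing):
$ kubectl logs voltha-etcd-cluster-client-2 -n voltha
$ kubectl logs voltha-etcd-cluster-client-2 -n voltha --previous    # output of the last crashed container
$ kubectl describe pod voltha-etcd-cluster-client-2 -n voltha       # events, exit codes, restart reasons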
Hi Javier,
Please find the logs:
$ kubectl logs -f voltha-etcd-cluster-client-2 -n voltha
etcd 05:28:31.88
etcd 05:28:31.88 Welcome to the Bitnami etcd container
etcd 05:28:31.89 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 05:28:31.89 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 05:28:31.89
etcd 05:28:31.89 INFO ==> ** Starting etcd setup **
etcd 05:28:31.91 INFO ==> Validating settings in ETCD_* env vars..
etcd 05:28:31.92 WARN ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 05:28:31.92 INFO ==> Initializing etcd
etcd 05:28:31.93 INFO ==> Generating etcd config file using env variables
etcd 05:28:31.94 INFO ==> Detected data from previous deployments
etcd 05:28:32.08 INFO ==> Updating member in existing cluster
***@***.***/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001f0000/voltha-etcd-cluster-client-0.voltha-etcd-cluster-client-headless.voltha.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: member not found"}
Error: etcdserver: member not found
Thanks,
Abhay
Hi @abhayycs
There were a couple of previous issues similar to your scenario, #6251 and #10009 (although those also involve scaling). Could you please take a look and check whether your situation is the same and whether the suggestions from those cases help?
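As a first check, it usually helps to confirm whether the failing member is still registered in the cluster, e.g. from one of the healthy pods (a sketch; it assumes etcdctl inside the container can reach the cluster with your current auth/TLS settings):
$ kubectl exec -it voltha-etcd-cluster-client-0 -n voltha -- etcdctl member list
$ kubectl exec -it voltha-etcd-cluster-client-0 -n voltha -- etcdctl endpoint status --cluster -w table
$ kubectl exec -it voltha-etcd-cluster-client-0 -n voltha -- etcdctl endpoint health --cluster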
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.