etcd-operator
Init container hangs indefinitely
After applying the manifest for the example cluster (a 3-node etcd cluster), the init container hangs indefinitely. The last message, `skip reconciliation: running ([]), pending ([example-etcd-cluster-vnjpsbdfmn])`, keeps repeating. When I look at the logs for the `example-etcd-cluster-vnjpsbdfmn` pod, it says `Error from server (BadRequest): container "etcd" in pod "example-etcd-cluster-vnjpsbdfmn" is waiting to start: PodInitializing`. I see no other logs that indicate what the issue might be.
time="2019-04-14T23:21:54Z" level=info msg="creating cluster with Spec:" cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg="{" cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg=" \"size\": 3," cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg=" \"repository\": \"quay.io/coreos/etcd\"," cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg=" \"version\": \"3.2.13\"" cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg="}" cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg="cluster created with seed member (example-etcd-cluster-vnjpsbdfmn)" cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:21:54Z" level=info msg="start running..." cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
time="2019-04-14T23:22:02Z" level=info msg="skip reconciliation: running ([]), pending ([example-etcd-cluster-vnjpsbdfmn])" cluster-name=example-etcd-cluster cluster-namespace=default pkg=cluster
@jicowan same here. Did you find a solution? Thanks.
@brunowego Not yet.
I've got the same issue
After switching the network plugin from flannel to Calico, this no longer happens. Try switching your network plugin.
Please investigate the events in your cluster, especially those for the etcd pods; there should be information about why the pod is still in the initializing state. Usually it's related to insufficient resources (CPU/memory requests per pod set too high) or incorrectly configured storage (for example, the pod is scheduled in zone A while the PV was created in zone B; in that case you should create a new StorageClass with `volumeBindingMode: WaitForFirstConsumer`).
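To make that concrete, here is a sketch of the checks and the fix described above. The pod name is taken from the logs earlier in this thread; the StorageClass name and provisioner are examples, not from this thread — adjust them to your cluster:

```shell
# Describe the stuck pod; the Events section at the bottom usually names the cause
kubectl -n default describe pod example-etcd-cluster-vnjpsbdfmn

# Or list all recent events in the namespace, oldest first
kubectl -n default get events --sort-by=.metadata.creationTimestamp

# If it is a zone mismatch, a StorageClass with late binding delays PV
# provisioning until the pod is scheduled, so the volume lands in the same zone
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: etcd-wait-for-consumer      # example name, choose your own
provisioner: kubernetes.io/aws-ebs  # assumption: replace with your provisioner
volumeBindingMode: WaitForFirstConsumer
EOF
```

With `WaitForFirstConsumer`, the PV is only created after the scheduler has placed the pod, which avoids the zone A / zone B mismatch described above.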