cilium-etcd-operator

etcd-operator fails to start

Open githubcdr opened this issue 5 years ago • 11 comments

Hi,

I ran into an issue where the etcd-operator fails to bring up the etcd cluster. This started after all of my Kubernetes nodes crashed.

The etcd-operator tries to bootstrap from scratch and keeps doing so, but the etcd cluster never comes up:

time="2019-02-05T20:26:07Z" level=info msg="Deploying etcd-operator deployment..."
time="2019-02-05T20:26:07Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:08Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:09Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:10Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:11Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:12Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:13Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:14Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:15Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:16Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:17Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:18Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:19Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:20Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:21Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:22Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:23Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:24Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:25Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:26Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:27Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:28Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:29Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:30Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:31Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:32Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:33Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:34Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:35Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:26:35Z" level=info msg="Done! Re-creating etcd-operator deployment..."
time="2019-02-05T20:26:35Z" level=info msg="Done!"
time="2019-02-05T20:26:35Z" level=info msg="Deploying Cilium etcd cluster CR..."
time="2019-02-05T20:26:35Z" level=info msg=Done
time="2019-02-05T20:26:35Z" level=info msg="Sleeping for 5m0s to allow cluster to come up..."
time="2019-02-05T20:31:35Z" level=info msg="Starting to monitor cluster health..."
time="2019-02-05T20:31:37Z" level=info msg="Deploying etcd-operator CRD..."
time="2019-02-05T20:31:37Z" level=info msg="Done!"
time="2019-02-05T20:31:37Z" level=info msg="Deploying etcd-operator deployment..."
time="2019-02-05T20:31:37Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:31:39Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:31:40Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:31:41Z" level=info msg="Waiting for previous etcd-operator deployment to be removed..."
time="2019-02-05T20:31:41Z" level=info msg="Done! Re-creating etcd-operator deployment..."
time="2019-02-05T20:31:41Z" level=info msg="Done!"
time="2019-02-05T20:31:41Z" level=info msg="No running etcd pod found. Bootstrapping from scratch..."
time="2019-02-05T20:31:41Z" level=info msg="Deploying Cilium etcd cluster CR..."
time="2019-02-05T20:31:41Z" level=info msg=Done
time="2019-02-05T20:31:41Z" level=info msg="Sleeping for 5m0s to allow cluster to come up..."
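
To make the loop easier to see, here is a minimal sketch of a helper that summarizes a log like the one above. It is my own throwaway script, not part of cilium-etcd-operator: the names `parse_line` and `summarize` are hypothetical, and it assumes the log keeps the `time="..." level=... msg=...` layout shown here, with no escaped quotes inside messages.

```python
import re
from datetime import datetime

# Matches lines in the layout shown in the log above, e.g.
#   time="2019-02-05T20:26:35Z" level=info msg="Done!"
#   time="2019-02-05T20:26:35Z" level=info msg=Done
# Assumption: messages never contain escaped double quotes.
LINE_RE = re.compile(r'time="([^"]+)" level=(\w+) msg=(?:"([^"]*)"|(\S+))')


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ")
    # The message is either quoted (group 3) or a bare word (group 4).
    message = match.group(3) if match.group(3) is not None else match.group(4)
    return timestamp, message


def summarize(lines):
    """Count operator deployments and from-scratch bootstraps in the log."""
    deploys = []
    bootstraps = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip blank or unrelated lines
        timestamp, message = parsed
        if message.startswith("Deploying etcd-operator deployment"):
            deploys.append(timestamp)
        elif message.startswith("No running etcd pod found"):
            bootstraps += 1
    # Seconds between consecutive operator deployments.
    gaps = [(b - a).total_seconds() for a, b in zip(deploys, deploys[1:])]
    return {"deploys": len(deploys), "bootstraps": bootstraps, "gap_seconds": gaps}
```

Run over the excerpt above (for example by passing an open file of the saved log to `summarize`), it reports two etcd-operator deployments 330 seconds apart and one "Bootstrapping from scratch" message, which matches the 5m0s sleep followed by a fresh redeploy.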

I ran cleanup.sh and redeployed, but the issue persists.

Is this a bug or am I missing something here?

githubcdr, Feb 05 '19 20:02