Kubeone continues despite failed healthz check

Open judge-red opened this issue 1 year ago • 3 comments

What happened?

I'm trying to set up a K1 cluster in a new environment. I'm still fine-tuning the infra, particularly the firewalls (OpenStack security groups). Access to the CP nodes through the LB was broken when I ran kubeone apply to install a Kubernetes cluster.

Luckily, K1 seems to run a healthz check before trying to do anything on the cluster. Unluckily, after the healthz check fails, it just continues anyway. And then it fails to create a resource but still keeps on going.

Also, fixing the firewall issue and running kubeone apply again wasn't successful; I had to replace the VMs and start fresh.

Expected behavior

K1 notices that the healthz check fails and doesn't continue. K1 notices that creating a resource failed and doesn't continue.

Also, K1 should probably be able to recover from this in a subsequent run.

How to reproduce the issue?

Yeah, that's not going to be easy, I guess. As I described above, I had a custom TF-based OpenStack setup. Everything worked as expected, except accessing port 6443 through the LB. I think the LB accepted the connection, but the connection between LB and VM was blocked. Access to port 6443 on the VM directly, without going through the LB, worked.

What KubeOne version are you using?

1.7.2

Provide your KubeOneCluster manifest here (if applicable)

Don't think it matters, otherwise let me know. (I need to manually do some of the steps our pipeline does to get this manifest.)

What cloud provider are you running on?

OpenStack

What operating system are you running in your cluster?

Ubuntu 22.04

Additional information

I'll add the logs of the initial "install run" and the "subsequent run". I eventually cancelled both job runs, equivalent to ctrl+c.

k1-install-run.log k1-subsequent-run.log

judge-red, Jan 19 '24 12:01