hetzner-k3s

Deleting master1 destroys cluster

Open cwilhelm opened this issue 5 months ago • 4 comments

I have a setup with 3 master nodes and 1 worker, and a working cluster named franz:

masters_pool:
  location: fsn1
  instance_type: cx21
  instance_count: 3
worker_node_pools:
- name: test
  location: hel1
  instance_type: cx31
  instance_count: 1

If I delete master1 in the Hetzner console and also execute kubectl delete node franz-cx21-master1, I would expect to end up with the same cluster, with master1 replaced, after running hetzner-k3s create --config ./hetzner-k3s_cluster_config.yaml.

Instead, the script hangs at the "Deploying k3s to worker franz-cx31-pool-test-worker1..." step. If I execute kubectl get nodes I get:

NAME                 STATUS   ROLES                       AGE   VERSION
franz-cx21-master1   Ready    control-plane,etcd,master   11m   v1.26.12+k3s1

The original cluster is gone :(

Validating configuration......configuration seems valid.

=== Creating infrastructure resources ===
Network already exists, skipping.
Updating firewall...done.
SSH key already exists, skipping.
Placement group franz-masters already exists, skipping.
Placement group franz-test-1 already exists, skipping.
Server franz-cx21-master2 already exists, skipping.
Server franz-cx31-pool-test-worker1 already exists, skipping.
Server franz-cx21-master3 already exists, skipping.
Creating server franz-cx21-master1...
...server franz-cx21-master1 created.
Server franz-cx21-master1 already exists, skipping.
Waiting for successful ssh connectivity with server franz-cx21-master1...
Server franz-cx21-master2 already exists, skipping.
Waiting for successful ssh connectivity with server franz-cx21-master2...
Server franz-cx21-master3 already exists, skipping.
Waiting for successful ssh connectivity with server franz-cx21-master3...
Server franz-cx31-pool-test-worker1 already exists, skipping.
Waiting for successful ssh connectivity with server franz-cx31-pool-test-worker1...
...server franz-cx21-master2 is now up.
...server franz-cx21-master3 is now up.
...server franz-cx31-pool-test-worker1 is now up.
...server franz-cx21-master1 is now up.
Load balancer for API server already exists, skipping.

=== Setting up Kubernetes ===
Deploying k3s to first master franz-cx21-master1...
[INFO]  Using v1.26.12+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.26.12+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.26.12+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
[INFO]  systemd: Starting k3s
Waiting for the control plane to be ready...
Saving the kubeconfig file to /Volumes/Development/IFX/deployments/franz/kubeconfig...
...k3s has been deployed to first master franz-cx21-master1 and the control plane is up.
Deploying k3s to master franz-cx21-master2...
Deploying k3s to master franz-cx21-master3...
[INFO]  Using v1.26.12+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.26.12+k3s1/sha256sum-amd64.txt
[INFO]  Using v1.26.12+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.26.12+k3s1/sha256sum-amd64.txt
[INFO]  Skipping binary downloaded, installed k3s matches hash
[INFO]  Skipping installation of SELinux RPM
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
[INFO]  Skipping binary downloaded, installed k3s matches hash
[INFO]  Skipping installation of SELinux RPM
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
[INFO]  systemd: Starting k3s
[INFO]  systemd: Starting k3s
...k3s has been deployed to master franz-cx21-master3.
...k3s has been deployed to master franz-cx21-master2.
Deploying k3s to worker franz-cx31-pool-test-worker1...
[INFO]  Using v1.26.12+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.26.12+k3s1/sha256sum-amd64.txt
[INFO]  Skipping binary downloaded, installed k3s matches hash
[INFO]  Skipping installation of SELinux RPM
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
[INFO]  systemd: Starting k3s-agent

cwilhelm · Jan 31 '24 15:01

My guess is that, when master1 does not exist, first_master should be set to master2, and therefore the k3s_token should be read from master2.
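
A rough sketch of the idea in Crystal, with placeholder types (the Server record and its exists flag below are hypothetical, not the real Hetzner::Server API): pick the first master that still exists, and read the token from that one.

record Server, name : String, exists : Bool

# Pick the first master that still exists in the Hetzner project, so the k3s
# token is read from a surviving member of the cluster; fall back to
# masters[0] only when no master exists yet (fresh cluster).
def pick_first_master(masters : Array(Server)) : Server
  masters.find(&.exists) || masters.first
end

masters = [
  Server.new("franz-cx21-master1", false), # deleted in the Hetzner console
  Server.new("franz-cx21-master2", true),
  Server.new("franz-cx21-master3", true),
]

puts pick_first_master(masters).name # => franz-cx21-master2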

cwilhelm · Jan 31 '24 15:01

Or the getter first_master : Hetzner::Server { masters[0] } should be evaluated before the === Creating infrastructure resources === step.
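
For illustration only (a minimal example of Crystal's lazy getter, not the actual hetzner-k3s class, and it assumes the masters list holds only the servers that already exist at that point): the block runs on first access and the result is memoized, so the moment first_master is first read decides which server it captures.

class Cluster
  getter masters : Array(String) = [] of String
  getter first_master : String { masters[0] }
end

cluster = Cluster.new
cluster.masters << "franz-cx21-master2"       # only the surviving masters are known here
cluster.masters << "franz-cx21-master3"
puts cluster.first_master                     # memoizes "franz-cx21-master2"
cluster.masters.unshift("franz-cx21-master1") # master1 is recreated afterwards
puts cluster.first_master                     # still "franz-cx21-master2"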

cwilhelm · Jan 31 '24 16:01

Sorry for the delay. This is indeed a scenario I hadn't come across. What you suggest might work the first time, when master1 is being recreated and we use, say, master2 as the first master, but I'm not sure what this would cause when running the create action again after master1 has become operational again. It requires some thinking and proper testing.

vitobotta · Mar 06 '24 15:03

My solution is https://github.com/vitobotta/hetzner-k3s/pull/312. I have successfully deleted and recreated master1.

cwilhelm · Mar 06 '24 16:03

@cwilhelm let's continue this discussion in the PR.

vitobotta · Apr 12 '24 12:04