Jan Christoph Ebersbach

https://www.identinet.io/

identinet GmbH, Bremen

I believe cooperation is the key to success. At identinet, I work to make digital cooperation via secure data exchange available to all organizations.

Results: 173 comments of Jan Christoph Ebersbach

Cluster nodes unreachable after several days

@milosmns thank you for raising the issue. I'm also experiencing these issues and haven't found a solution yet. I'll get back to you in the next few days. A workaround I'm...

Cluster nodes unreachable after several days

I noticed this line in the k3s logs:

```
Mar 25 10:00:35 muses-dev-system-2 k3s[100135]: {"level":"warn","ts":"2025-03-25T10:00:35.917959+0100","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"23ff04bc7d082ddd","rtt":"13.233934ms","error":"dial tcp 10.0.1.4:2380: i/o timeout"}
```

From this error onward,...

Cluster nodes unreachable after several days

@mysticaltech @vitobotta have you experienced anything like this before with k3s on Hetzner servers? One peculiarity of this k3s setup is that it only uses Hetzner's internal network.

Cluster nodes unreachable after several days

@vitobotta thank you for your quick reply!

Cluster nodes unreachable after several days

@milosmns no, SSH won't be affected by this. If SSH isn't reachable, I recommend you reach out to support before you make any additional changes to the systems.

Cluster nodes unreachable after several days

I just ran into the issue again and found that the default route disappeared from the affected server. Weirdly, the IP address remains in place: ![Image](https://github.com/user-attachments/assets/ef7245fc-fb73-4a1a-8b77-2e404494d630)
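
A quick way to check a node for this symptom might look like the following. This is only a minimal sketch: the function name `has_default_route` is hypothetical, and it assumes the `ip` tool from iproute2 is available on the server.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the given text, as printed by
# `ip -4 route show default`, contains a default route.
has_default_route() {
    # `ip route show default` prints nothing when no default route exists;
    # otherwise each entry starts with the word "default".
    case "$1" in
        default*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (assumes iproute2 is installed):
#   routes="$(ip -4 route show default)"
#   has_default_route "$routes" || echo "default route missing on $(hostname)" >&2
```

Keeping the check separate from the `ip` call makes it possible to try the logic against saved output from an affected server.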

Cluster nodes unreachable after several days

The issue might be related to https://github.com/systemd/systemd/issues/28358 since we're also using systemd-networkd and I guess Hetzner migrates virtual machines at will from one system to the next. See also the...

Cluster nodes unreachable after several days

As a workaround, I'll install this script (https://github.com/systemd/systemd/issues/28358#issuecomment-1909985912) on my servers. It runs `networkctl reconfigure enp7s0` whenever the system resumes from suspend. Let's see if it...
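
For illustration, a hook of this kind could be sketched as below. This is not the script from the linked comment, only an assumption of how such a hook might look: systemd-sleep runs executables placed in `/usr/lib/systemd/system-sleep/` with `pre` or `post` as the first argument, and the `IFACE` and `NETWORKCTL` variables as well as the function name `on_sleep_event` are hypothetical.

```shell
#!/bin/sh
# Sketch of a systemd-sleep hook, NOT the script from the linked comment.
# systemd-sleep calls each hook with $1 = "pre" or "post" and
# $2 = the sleep action (for example "suspend").

IFACE="${IFACE:-enp7s0}"                 # interface name from the comment above
NETWORKCTL="${NETWORKCTL:-networkctl}"   # overridable for dry runs (assumption)

on_sleep_event() {
    # Reconfigure the interface only after the system has resumed.
    if [ "${1:-}" = "post" ]; then
        "$NETWORKCTL" reconfigure "$IFACE"
    fi
}

# Forward systemd-sleep's arguments; without arguments this does nothing.
on_sleep_event "$@"
```

Setting `NETWORKCTL=echo` before calling `on_sleep_event post suspend` prints the command that would be run instead of executing it, which allows a dry run on a machine without systemd-networkd.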

Cluster nodes unreachable after several days

Brief status update: apparently, Hetzner runs dhcpd automatically on all nodes. This might be another source of errors. To disable dhcpd, these steps need to be performed. It can be...

Cluster nodes unreachable after several days

Status update: so far, the cluster nodes have been stable. The suspend script hasn't been triggered yet. It looks like the deactivation of dhcpd and the sole use of systemd-networkd...
