flannel.1 is deleted by `service network restart` and never recreated.
Expected Behavior
If the flannel.1 interface is deleted, flanneld should recreate it.
Current Behavior
We recently hit a problem: someone ran `service network restart` on a node of a Kubernetes cluster, which deleted the flannel.1 interface, so containers on that node could no longer reach the other nodes.
Possible Solution
Maybe flanneld could periodically check whether the flannel.1 interface exists, and recreate it if it is missing.
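A minimal sketch of that idea (hypothetical; flanneld has no such option today, and the unit name is an assumption). On Linux, every network interface appears under `/sys/class/net`, so its absence means the device was deleted:

```shell
# Hypothetical watchdog sketch; "flannel.1" is the default vxlan device name,
# and restarting flanneld here stands in for recreating the device.
iface_missing() {
  # True when the interface has no entry under /sys/class/net.
  [ ! -e "/sys/class/net/$1" ]
}

if iface_missing flannel.1; then
  echo "flannel.1 is missing and should be recreated"
fi
```

A real implementation inside flanneld would run this check on a timer and rebuild the VXLAN device itself instead of just logging.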
Steps to Reproduce (for bugs)
- Deploy a pod on the flannel network.
- Ping another node from the pod; it should succeed.
- Run `service network restart` on the node where the pod is scheduled.
- Check the flannel.1 interface; it should no longer exist.
- Ping another node from the pod again; it should fail.
Context
If someone restarts the network service, flannel stops working properly, and we have to restart flanneld and all pods on that node.
Your Environment
- Flannel version: 0.7.0
- Backend used (e.g. vxlan or udp): vxlan
- Etcd version: 2.2.5
- Kubernetes version (if used): 1.7.5
- Operating System and version: centos 7.2
- Link to your project (optional): n/a
Is the complete loss of connectivity on a node, caused by a network restart, really just an enhancement?
If I'm understanding it correctly, someone executed a command, outside the control of flannel, that broke networking on the server. You're asking that flannel monitor the state it creates and detect when a third-party process breaks that state. That's an enhancement.
You are right, sorry. I opened a PR with a backend health-checker proposal.
In my case the flannel iface disappears, but not because of a "network restart". It's more related to #877: my master device is a VPN. On a short VPN outage the master interface disappears and flannel stops working; the only solution for me is restarting the flannel pod on that node manually.
Sorry to come back to this quite old issue; it is too bad that @silenceshell closed it, as I'm actually facing it right now.
I'm using the Tinc VPN to secure the communications between the Kubernetes nodes (VPS nodes), and when the tinc service is restarted, the flannel network interface disappears and never comes back.
I discovered this while using sonobuoy, as suggested in a cert-manager issue, with the command `sonobuoy run -p systemd-logs && watch -n 1 sonobuoy status` suggested in a sonobuoy issue.
I tried deleting the flannel pod to see if it would re-create the network interface, but it doesn't. Then I tried restarting Docker (brrrr), which re-created the flannel network interface, and the sonobuoy tests pass again.
@silenceshell or @tomdee can you please re-open this issue?
@silenceshell would you mind reopening this issue please?
Thank you so much!
@zedtux :)
@tomdee are there any near-future plans regarding this issue?
I have found another way to fix this problem. We can run the flannel container under Kubernetes and add an exec liveness probe with a command like `ifconfig flannel.1`. When the kubelet detects that the interface no longer exists, it recreates the flannel pod, and the flannel.1 interface is recreated along with it.
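For reference, that probe could look something like the following fragment on the flannel container in its DaemonSet (a sketch; the timings are illustrative, and `ifconfig` must be present in the image, otherwise `ip link show flannel.1` works the same way):

```yaml
# Hypothetical exec liveness probe: the command fails once flannel.1 is
# gone, so after failureThreshold failures the kubelet restarts the container.
livenessProbe:
  exec:
    command: ["sh", "-c", "ifconfig flannel.1"]
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
```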
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.