flannel.1 is deleted by `service network restart`, and never recreated again.

Open silenceshell opened this issue 7 years ago • 9 comments

Expected Behavior

If the flannel.1 interface is deleted, flanneld should recreate it.

Current Behavior

We ran into a problem recently: someone executed `service network restart` on a node of a Kubernetes cluster, and afterwards the flannel.1 interface was gone, so containers on this node could no longer connect to the other nodes.

Possible Solution

Maybe flanneld could periodically check whether the flannel.1 interface exists and, if it does not, create it again.

Steps to Reproduce (for bugs)

  1. Deploy a pod on the flannel network.
  2. Ping another node from the pod; it should succeed.
  3. Run `service network restart` on the node where the pod is scheduled.
  4. Check the flannel.1 interface; it should no longer exist.
  5. Ping another node from the pod again; it should fail.

Context

If someone restarts the network service, flannel stops working properly, and we have to restart flanneld and the pods on this node.

Your Environment

  • Flannel version: 0.7.0
  • Backend used (e.g. vxlan or udp): vxlan
  • Etcd version: 2.2.5
  • Kubernetes version (if used): 1.7.5
  • Operating System and version: centos 7.2
  • Link to your project (optional): na

silenceshell avatar Nov 09 '17 07:11 silenceshell

The complete loss of connectivity of one node caused by a net restart is an enhancement?

indiketa avatar Jan 12 '18 22:01 indiketa

If I'm understanding it correctly, someone executed a command that broke networking on the server which was outside the control of flannel. You're asking that flannel monitors the state that it creates to see if a third party process breaks that state. That's an enhancement.

tomdee avatar Jan 13 '18 21:01 tomdee

You are right: Sorry. I opened a PR with a backend health checker proposal.

In my case, the flannel interface does not disappear because of a "network restart". It is more closely related to #877: my master device is a VPN. During a short VPN outage the master interface disappears and flannel stops working; the only solution for me is to manually restart the flannel pod on that node.

indiketa avatar Jan 15 '18 10:01 indiketa

Sorry to come back on this quite old issue, and it is too bad that @silenceshell closed that issue as I'm actually facing it right now.

I'm using the Tinc VPN to secure communication between the Kubernetes nodes (VPS nodes), and when the tinc service is restarted, the flannel network interface disappears and never comes back. I discovered this while using sonobuoy, as suggested in this cert-manager issue, with the command `sonobuoy run -p systemd-logs && watch -n 1 sonobuoy status`, as suggested in that sonobuoy issue.

I tried deleting the flannel pod to see whether it would re-create the network interface, but it didn't. Then I restarted Docker (brrrr), which re-created the flannel network interface, and the sonobuoy tests are passing again.

@silenceshell or @tomdee can you please re-open this issue?

zedtux avatar Aug 05 '20 11:08 zedtux

@silenceshell would you mind reopening this issue, please?

zedtux avatar Sep 30 '20 09:09 zedtux

Thank you so much!

zedtux avatar Sep 30 '20 09:09 zedtux

@zedtux :)

silenceshell avatar Sep 30 '20 09:09 silenceshell

@tomdee are there any near-term plans regarding this issue, please?

zedtux avatar Dec 02 '20 14:12 zedtux

I have found another way to fix this problem. We can use Kubernetes to run the flannel container and add a liveness probe with an exec command such as "ifconfig flannel.1". When the kubelet detects that the interface no longer exists, it recreates the flannel Pod, and the flannel.1 interface is recreated.

kkfinkkfin avatar Jun 16 '21 06:06 kkfinkkfin

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jan 25 '23 22:01 stale[bot]