flannel.1 is deleted by `service network restart` and never recreated.
Expected Behavior
If the flannel.1 interface is deleted, flanneld should recreate it.
Current Behavior
We recently hit a problem: someone ran `service network restart` on a node of a Kubernetes cluster, which deleted the flannel.1 interface, so containers on that node could no longer reach the other nodes.
Possible Solution
Maybe flanneld could periodically check whether the flannel.1 interface exists, and recreate it if it is missing.
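A minimal sketch of that idea (hypothetical; flanneld has no such option today, and the unit name is an assumption). On Linux, every network interface appears under `/sys/class/net`, so its absence means the device was deleted:

```shell
# Hypothetical watchdog sketch; "flannel.1" is the default vxlan device name,
# and restarting flanneld here stands in for recreating the device.
iface_missing() {
  # True when the interface has no entry under /sys/class/net.
  [ ! -e "/sys/class/net/$1" ]
}

if iface_missing flannel.1; then
  echo "flannel.1 is missing and should be recreated"
fi
```

A real implementation inside flanneld would run this check on a timer and rebuild the VXLAN device itself instead of just logging.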
Steps to Reproduce (for bugs)
- Deploy a pod on the flannel network.
- Ping another node from the pod; it should succeed.
- Run `service network restart` on the node where the pod is scheduled.
- Check the flannel.1 interface; it should no longer exist.
- Ping another node from the pod again; it should fail.
Context
If someone restarts the network service, flannel stops working properly, and we have to restart flanneld and all pods on that node.
Your Environment
- Flannel version: 0.7.0
- Backend used (e.g. vxlan or udp): vxlan
- Etcd version: 2.2.5
- Kubernetes version (if used): 1.7.5
- Operating System and version: centos 7.2
- Link to your project (optional): n/a
Is the complete loss of connectivity on a node, caused by a network restart, really just an enhancement?
If I'm understanding it correctly, someone executed a command, outside the control of flannel, that broke networking on the server. You're asking that flannel monitor the state it creates and detect when a third-party process breaks that state. That's an enhancement.
You are right, sorry. I opened a PR with a backend health-checker proposal.
In my case the flannel iface disappears, but not because of a "network restart". It's more related to #877: my master device is a VPN. On a short VPN outage the master interface disappears and flannel stops working; the only solution for me is restarting the flannel pod on that node manually.
Sorry to come back to this quite old issue; it is too bad that @silenceshell closed it, as I'm actually facing it right now.
I'm using the Tinc VPN to secure the communications between the Kubernetes nodes (VPS nodes), and when the tinc service is restarted, the flannel network interface disappears and never comes back.
I discovered this while using sonobuoy, as suggested in a cert-manager issue, with the command `sonobuoy run -p systemd-logs && watch -n 1 sonobuoy status` suggested in a sonobuoy issue.
I tried deleting the flannel pod to see if it would re-create the network interface, but it doesn't. Then I tried restarting Docker (brrrr), which re-created the flannel network interface, and the sonobuoy tests pass again.
@silenceshell or @tomdee can you please re-open this issue?
@silenceshell would you mind reopening this issue please?
Thank you so much!
@zedtux :)
@tomdee are there any near-future plans regarding this issue?
I have found another way to fix this problem. We can run the flannel container under Kubernetes and add an exec liveness probe with a command like `ifconfig flannel.1`. When the kubelet detects that the interface no longer exists, it recreates the flannel pod, and the flannel.1 interface is recreated along with it.
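For reference, that probe could look something like the following fragment on the flannel container in its DaemonSet (a sketch; the timings are illustrative, and `ifconfig` must be present in the image, otherwise `ip link show flannel.1` works the same way):

```yaml
# Hypothetical exec liveness probe: the command fails once flannel.1 is
# gone, so after failureThreshold failures the kubelet restarts the container.
livenessProbe:
  exec:
    command: ["sh", "-c", "ifconfig flannel.1"]
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
```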
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.