Missing routes with many nodes on vxlan
Expected Behavior
When adding a new instance, it should always add all routes to all existing instances.
Current Behavior
When there are many nodes (> 30), routes between some instances are occasionally missing.
For each missing route we get this error:
vxlan_network.go:145] AddFDB failed: no buffer space available
A missing route means that an entry like the following is absent from the output of ip route:
10.10.27.0/24 via 10.10.27.0 dev flannel.1 onlink
Possible Solution
Besides fixing the underlying problem in the OS or network settings, it might be a good idea to retry such operations, or even to let flannel fail completely (see Context).
Steps to Reproduce
- Set up an autoscaling group with instances that use flannel.
- Scale up to 50 nodes without ramping up gradually. I am not sure whether the parallel booting is a problem here.
- Run journalctl -u flanneld | grep AddFDB on each instance and see some errors. There are around 4 missing routes at that scale.
systemd unit
$ systemctl cat flanneld
# /usr/lib/systemd/system/flanneld.service
[Unit]
Description=flannel - Network fabric for containers (System Application Container)
Documentation=https://github.com/coreos/flannel
After=etcd.service etcd2.service etcd-member.service
Requires=flannel-docker-opts.service
[Service]
Type=notify
Restart=always
RestartSec=10s
TimeoutStartSec=300
LimitNOFILE=40000
LimitNPROC=1048576
Environment="FLANNEL_IMAGE_TAG=v0.9.0"
Environment="FLANNEL_OPTS=--ip-masq=true"
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid"
EnvironmentFile=-/run/flannel/options.env
ExecStartPre=/sbin/modprobe ip_tables
ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos /run/flannel
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/lib/coreos/flannel-wrapper.uuid
ExecStart=/usr/lib/coreos/flannel-wrapper $FLANNEL_OPTS
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/lib/coreos/flannel-wrapper.uuid
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/flanneld.service.d/40-pod-network.conf
[Service]
Environment="FLANNELD_ETCD_ENDPOINTS=http://0.etcd.k8s.rebuy.loc:2379,http://1.etcd.k8s.rebuy.loc:2379,http://2.etcd.k8s.rebuy.loc:2379"
ExecStartPre=/usr/bin/etcdctl --endpoints=http://0.etcd.k8s.rebuy.loc:2379,http://1.etcd.k8s.rebuy.loc:2379,http://2.etcd.k8s.rebuy.loc:2379 set /coreos.com/network/config \
'{"Network":"10.10.0.0/16", "Backend": {"Type": "vxlan"}}'
logs
$ journalctl -u flanneld | cat
-- Logs begin at Tue 2018-03-06 09:18:56 CET, end at Tue 2018-03-06 10:46:55 CET. --
Mar 06 09:19:26 localhost systemd[1]: Starting flannel - Network fabric for containers (System Application Container)...
Mar 06 09:19:28 ip-172-20-202-32.eu-west-1.compute.internal rkt[859]: rm: unable to resolve UUID from file: open /var/lib/coreos/flannel-wrapper.uuid: no such file or directory
Mar 06 09:19:28 ip-172-20-202-32.eu-west-1.compute.internal rkt[859]: rm: failed to remove one or more pods
Mar 06 09:19:29 ip-172-20-202-32.eu-west-1.compute.internal etcdctl[932]: {"Network":"10.10.0.0/16", "Backend": {"Type": "vxlan"}}
Mar 06 09:19:29 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: + exec /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid --trust-keys-from-https --mount volume=coreos-notify,target=/run/systemd/notify --volume coreos-notify,kind=host,source=/run/systemd/notify --set-env=NOTIFY_SOCKET=/run/systemd/notify --net=host --volume coreos-run-flannel,kind=host,source=/run/flannel,readOnly=false --volume coreos-etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true --volume coreos-usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume coreos-etc-hosts,kind=host,source=/etc/hosts,readOnly=true --volume coreos-etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=coreos-run-flannel,target=/run/flannel --mount volume=coreos-etc-ssl-certs,target=/etc/ssl/certs --mount volume=coreos-usr-share-certs,target=/usr/share/ca-certificates --mount volume=coreos-etc-hosts,target=/etc/hosts --mount volume=coreos-etc-resolv,target=/etc/resolv.conf --inherit-env --stage1-from-dir=stage1-fly.aci quay.io/coreos/flannel:v0.9.0 -- --ip-masq=true
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: pubkey: prefix: "quay.io/coreos/flannel"
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: key: "https://quay.io/aci-signing-key"
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: gpg key fingerprint is: BFF3 13CD AA56 0B16 A898 7B8F 72AB F5F6 799D 33BC
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Trusting "https://quay.io/aci-signing-key" for prefix "quay.io/coreos/flannel" without fingerprint review.
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Added key for prefix "quay.io/coreos/flannel" at "/etc/rkt/trustedkeys/prefix.d/quay.io/coreos/flannel/bff313cdaa560b16a8987b8f72abf5f6799d33bc"
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading signature: 0 B/473 B
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading signature: 473 B/473 B
Mar 06 09:19:33 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading signature: 473 B/473 B
Mar 06 09:19:34 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading ACI: 0 B/18.4 MB
Mar 06 09:19:34 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading ACI: 8.19 KB/18.4 MB
Mar 06 09:19:34 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Downloading ACI: 18.4 MB/18.4 MB
Mar 06 09:19:35 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: image: signature verified:
Mar 06 09:19:35 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559691 950 main.go:470] Determining IP address of default interface
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559900 950 main.go:483] Using interface with name eth0 and address 172.20.202.32
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559912 950 main.go:500] Defaulting external address to interface address (172.20.202.32)
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559977 950 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: 0.0.0.0/0
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.559984 950 main.go:238] Installing signal handlers
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.567211 950 main.go:348] Found network config - Backend type: vxlan
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.567251 950 vxlan.go:119] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.919205 950 local_manager.go:234] Picking subnet in range 10.10.1.0 ... 10.10.255.0
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.923001 950 local_manager.go:220] Allocated lease (10.10.122.0/24) to current node (172.20.202.32)
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.939560 950 main.go:295] Wrote subnet file to /run/flannel/subnet.env
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.939578 950 main.go:299] Running backend.
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.939878 950 vxlan_network.go:56] watching for new subnet leases
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal systemd[1]: Started flannel - Network fabric for containers (System Application Container).
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:37.943885 950 main.go:391] Waiting for 23h0m0.085901025s to renew lease
Mar 06 09:19:37 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: E0306 08:19:37.952090 950 vxlan_network.go:145] AddFDB failed: no buffer space available
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.665323 950 ipmasq.go:75] Some iptables rules are missing; deleting and recreating rules
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.665356 950 ipmasq.go:97] Deleting iptables rule: -s 10.10.0.0/16 -d 10.10.0.0/16 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.666565 950 ipmasq.go:97] Deleting iptables rule: -s 10.10.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.667897 950 ipmasq.go:97] Deleting iptables rule: ! -s 10.10.0.0/16 -d 10.10.122.0/24 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.668949 950 ipmasq.go:97] Deleting iptables rule: ! -s 10.10.0.0/16 -d 10.10.0.0/16 -j MASQUERADE
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.670394 950 ipmasq.go:85] Adding iptables rule: -s 10.10.0.0/16 -d 10.10.0.0/16 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.672491 950 ipmasq.go:85] Adding iptables rule: -s 10.10.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.758989 950 ipmasq.go:85] Adding iptables rule: ! -s 10.10.0.0/16 -d 10.10.122.0/24 -j RETURN
Mar 06 09:19:38 ip-172-20-202-32.eu-west-1.compute.internal flannel-wrapper[950]: I0306 08:19:38.761328 950 ipmasq.go:85] Adding iptables rule: ! -s 10.10.0.0/16 -d 10.10.0.0/16 -j MASQUERADE
We tried adjusting some sysctl settings, but none of them helped:
net.ipv4.tcp_rmem = 10240 87380 12582912
net.ipv4.tcp_wmem = 10240 87380 12582912
net.ipv4.tcp_rmem = 102400 873800 125829120
net.ipv4.tcp_wmem = 102400 873800 125829120
net.core.wmem_max = 125829120
net.core.rmem_max = 125829120
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.core.netdev_max_backlog = 5000
net.ipv4.udp_mem = 102400 873800 125829120
net.ipv4.udp_rmem_min = 10240
net.ipv4.udp_wmem_min = 10240
Context
We are scaling our Kubernetes cluster inside an AWS ASG. When adding new nodes, we rely on a working network. Even a completely broken network would be better than a few missing routes, because the cluster can behave flaky in rare cases and it is not evident where this comes from. For example, we had DNS problems: a very small subset of our applications had a high error rate when resolving domain names, and for a long time we did not know why. We now know it was caused by a missing route between the instance running the faulty application and the instance running the DNS server.
Currently we have to manually grep the journal logs and replace broken instances, because it is hard to automatically figure out whether a route is missing.
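The manual grepping could in principle be replaced by a check that compares the subnet leases flannel knows about with the local route table. A minimal sketch under stated assumptions: missingRoutes is a hypothetical helper, and the lease list and route output in main are illustrative; in practice the leases would come from etcd (under /coreos.com/network/subnets) and the table from ip route.

```go
package main

import (
	"fmt"
	"strings"
)

// missingRoutes returns the subnet leases that have no matching entry in
// the given route-table text. Hypothetical helper for illustration: a
// plain substring match against the output of `ip route` is assumed to
// be good enough to flag a lease with no route at all.
func missingRoutes(leases []string, ipRouteOutput string) []string {
	var missing []string
	for _, subnet := range leases {
		if !strings.Contains(ipRouteOutput, subnet) {
			missing = append(missing, subnet)
		}
	}
	return missing
}

func main() {
	// Illustrative data: two leases, but only one route is present.
	leases := []string{"10.10.27.0/24", "10.10.122.0/24"}
	routes := "10.10.122.0/24 via 10.10.122.0 dev flannel.1 onlink\n"
	fmt.Println(missingRoutes(leases, routes)) // 10.10.27.0/24 has no route
}
```

Run periodically on each node, a check like this could feed a health probe so broken instances get replaced automatically instead of by hand.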
Your Environment
- Flannel version: v0.9.0 and v0.10.0
- Backend used (e.g. vxlan or udp): vxlan (with and without DirectRouting)
- Etcd version: 3.2.11
- Kubernetes version (if used): v1.8.5+coreos.0
- Operating System and version: Container Linux by CoreOS 1576.5.0 (Ladybug)
This looks like it might be related to https://github.com/coreos/flannel/issues/779
We already tried the solution proposed there, but it didn't work.
IIUC we have to change net.core.rmem_max and net.core.wmem_max. These are the values on a failed node:
# sysctl -a | grep [wr]mem_max
net.core.rmem_max = 125829120
net.core.wmem_max = 125829120
This is related to the "get/set receive buffer size" support in netlink: https://github.com/vishvananda/netlink/commit/ef84ebb87b6680303340b51684666c363c79763e
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This bug is still happening, so I'll leave this open. There is a workaround to avoid it; we'll update the docs with it until it's fixed.
We ran into a similar issue: some routes were missing after a network outage of a few hours. I would expect flannel to reconcile these routes. Am I right to expect this?
Which version of flannel are you using? Maybe your issue is not directly related to this one. This issue was about routes that were missing when flannel started with many nodes; in your case it seems the routes were somehow removed and are not being recreated.
I'm using v0.17.0, shipped with RKE1. I understand this is a fairly old version and my issue may have been fixed in the meantime, but it looks like Rancher is still shipping this version with the latest releases of RKE1.