Network packet loss with a default route containing two equal-cost paths
I believe I've run into a Docker bug related to how NAT is performed on machines whose default route has multiple equal-cost paths.
When running inside of Docker I see 30-50% packet loss, depending on activity, and all TCP sessions stall out. With ping 8.8.8.8 I get a consistent 30% packet loss. Outside of Docker, or with --net=host, that drops to 0%.
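For anyone trying to reproduce, the comparison boils down to something like the following (busybox is just a convenient image; anything with ping works):

```
# Default bridge network: traffic is SNAT'd through docker0 and out the ECMP default route
docker run --rm busybox ping -c 100 -q 8.8.8.8

# Host network namespace: no docker0, no NAT -- this path shows 0% loss
docker run --rm --net=host busybox ping -c 100 -q 8.8.8.8
```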
If I shut down one of the NICs (collapsing the default route to a single path), then I get 0% packet loss within Docker as well.
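Concretely, the workaround looks like this (ens1d1 being one of the two uplinks from the route table below):

```
# Collapse the default route to a single path, retest, then restore
ip link set ens1d1 down
docker run --rm busybox ping -c 100 -q 8.8.8.8   # loss goes back to 0% with one path
ip link set ens1d1 up
```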
I believe this must have to do with how Linux handles iptables/NAT out to the host, but I'm not sure how to troubleshoot further. I can provide plenty of tcpdump output, but there's very little of use in it: I see the traffic being NAT'd to my external IP and the packets leaving the machine.
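If it helps, I can capture on both uplinks at once; a sketch along these lines (assuming conntrack-tools is installed) would show whether packets of a single SNAT'd flow are being sprayed across both NICs:

```
# Watch the flow on each uplink simultaneously
tcpdump -ni ens1   'icmp and host 8.8.8.8' &
tcpdump -ni ens1d1 'icmp and host 8.8.8.8' &

# Inspect the conntrack entry for the NAT'd flow
conntrack -L -p icmp --dst 8.8.8.8
```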
Let me know if you have any suggestions on how I can troubleshoot further or what kind of details you'd want to see.
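One theory worth checking (an assumption on my part, not confirmed): as far as I can tell, kernels before 4.4 pick an IPv4 ECMP nexthop pseudo-randomly per route lookup rather than hashing per flow, so forwarded, SNAT'd container packets can alternate between uplinks while host-originated sockets keep a cached route. Two quick checks along those lines (172.17.0.2 is a hypothetical container address on docker0):

```
# Which nexthop does the kernel pick for a forwarded packet right now?
ip route get 8.8.8.8 from 172.17.0.2 iif docker0

# Strict reverse-path filtering can drop replies arriving on the "other"
# uplink; loose mode (2) is commonly suggested for multipath hosts
sysctl net.ipv4.conf.all.rp_filter
sysctl -w net.ipv4.conf.all.rp_filter=2
```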
[root@node01 conf.d]# ip route
default proto zebra src 66.150.120.2 metric 20
    nexthop via 169.254.0.1 dev ens1 weight 1 onlink
    nexthop via 169.254.0.1 dev ens1d1 weight 1 onlink
10.0.0.0/8 src 10.0.0.2
    nexthop via 169.254.0.1 dev ens1 weight 1 onlink
    nexthop via 169.254.0.1 dev ens1d1 weight 1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.254.0/24 dev enp2s0f0.254 proto kernel scope link src 192.168.254.1 metric 400
root@40b3d82896ab:/# ping -f 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
.......................................................................................................^C
--- 8.8.8.8 ping statistics ---
489 packets transmitted, 252 packets received, 48% packet loss
round-trip min/avg/max/stddev = 8.395/8.445/8.601/0.000 ms
root@40b3d82896ab:/#
root@40b3d82896ab:/# exit
[root@node01 ~]# ping -f 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
.^C
--- 8.8.8.8 ping statistics ---
700 packets transmitted, 699 received, 0% packet loss, time 5872ms
rtt min/avg/max/mdev = 8.294/8.340/8.404/0.111 ms, ipg/ewma 8.401/8.342 ms
docker version: 1.9 (and 1.10)

docker info:
[root@node01 ~]# docker info
Containers: 17
Images: 368
Server Version: 1.9.1
Storage Driver: devicemapper
Pool Name: sysvg-docker--pool
Pool Blocksize: 524.3 kB
Base Device Size: 107.4 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 12.9 GB
Data Space Total: 49.66 GB
Data Space Available: 36.76 GB
Metadata Space Used: 6.627 MB
Metadata Space Total: 478.2 MB
Metadata Space Available: 471.5 MB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.109 (2015-09-22)
Execution Driver: native-0.2
Logging Driver: journald
Kernel Version: 4.3.4-300.fc23.x86_64
Operating System: Fedora 23 (Twenty Three)
CPUs: 24
Total Memory: 62.81 GiB
Name: node01.us-east.mgmt.ntoggle.com
ID: XR2B:YJGE:4BX4:K3RF:YGEG:DKNT:3JPY:OKGU:4P6E:NJO6:IWFG:VDAJ
Username: apenney
Registry: https://index.docker.io/v1/
Additional environment details (AWS, VirtualBox, physical, etc.): bare metal, 4 NICs per box.
@apenney as discussed in IRC, we will work together to get to the bottom of it. Let's keep this issue open until we find a solution.
ping @mavenugo @aboch @mrjana @sanimej
This is an ancient bug. Could this possibly still be an issue that needs looking into, or should we close it?