slirp4netns
Severe MTU-related performance issues
This may be the same problem as https://github.com/rootless-containers/slirp4netns/issues/128 , since that issue also shows a 5-second delay at the start, but I can't tell for sure, and I have a lot of detail here, so I didn't want to clutter that issue up.
Short version: in a rootless podman container with all defaults (i.e. slirp4netns with its default MTU of 65520), curling a file larger than the MTU takes 5 seconds when it should take well under a second. Reducing the MTU fixes it.
My environment:
$ sudo yum list installed '*podman*' '*slirp*'
[sudo] password for rlpowell:
Installed Packages
libslirp.x86_64 4.6.1-2.fc35 @fedora
podman.x86_64 3:3.4.4-1.fc35 @updates
podman-gvproxy.x86_64 3:3.4.4-1.fc35 @updates
podman-plugins.x86_64 3:3.4.4-1.fc35 @updates
slirp4netns.x86_64 1.1.12-2.fc35 @fedora
$ cat /etc/redhat-release
Fedora release 35 (Thirty Five)
Repro:
Dockerfile:
FROM fedora:35
RUN yum -y install netcat time
Run podman build -t slirptest .
In another window on the same host (maybe in a temp dir):
$ dd bs=1024 count=64 if=/dev/zero of=64k_file.bin
64+0 records in
64+0 records out
65536 bytes (66 kB, 64 KiB) copied, 0.000972638 s, 67.4 MB/s
$ dd bs=1024 count=63 if=/dev/zero of=63k_file.bin
63+0 records in
63+0 records out
64512 bytes (65 kB, 63 KiB) copied, 0.000511444 s, 126 MB/s
$ python -m http.server 8081
Serving HTTP on 0.0.0.0 port 8081 (http://0.0.0.0:8081/) ...
In another window:
$ podman run --rm -it slirptest bash
# time curl [IP of host]:8081/64k_file.bin | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 65536 100 65536 0 0 13108 0 0:00:04 0:00:04 --:--:-- 8627
0 0 65536
real 0m5.012s
user 0m0.013s
sys 0m0.008s
Note that it pauses for about 5 seconds after the first chunk of data.
Then try:
$ podman run --rm -it --net slirp4netns:mtu=1500 slirptest bash
# time curl 192.168.123.134:8081/64k_file.bin | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 65536 100 65536 0 0 19.3M 0 --:--:-- --:--:-- --:--:-- 31.2M
0 0 65536
real 0m0.016s
user 0m0.008s
sys 0m0.010s
500x performance difference. :D
Also:
$ podman run --rm -it slirptest bash
# time curl 192.168.123.134:8081/63k_file.bin | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 64512 100 64512 0 0 22.0M 0 --:--:-- --:--:-- --:--:-- 30.7M
0 0 64512
real 0m0.016s
user 0m0.008s
sys 0m0.010s
So the 64k file causes the problem but the 63k file does not.
In case it's relevant, here are the host's MTU configs:
$ ip addr | grep -i mtu
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
2: enp5s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
The communication in question appears, to tcpdump, to come over lo.
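For reference, that loopback observation can be reproduced with a capture along these lines (my guess at the command; port 8081 matches the test server above):
$ sudo tcpdump -i lo 'tcp port 8081'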
My binary search shows that the issue doesn't occur at mtu=48000 and lower, but does occur at mtu=48500 and higher. I have no idea what the significance of that is.
$ podman run --rm -it --net slirp4netns:mtu=48000 slirptest bash
[root@4b1d4880bb30 /]# time curl 192.168.123.134:8081/64k_file.bin | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 65536 100 65536 0 0 21.2M 0 --:--:-- --:--:-- --:--:-- 31.2M
0 0 65536
real 0m0.016s
user 0m0.010s
sys 0m0.008s
[root@4b1d4880bb30 /]#
exit
$ podman run --rm -it --net slirp4netns:mtu=48500 slirptest bash
[root@99a0585cffb0 /]# time curl 192.168.123.134:8081/64k_file.bin | wc
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 65536 100 65536 0 0 11401 0 0:00:05 0:00:05 --:--:-- 7208
0 0 65536
real 0m5.762s
user 0m0.008s
sys 0m0.013s
[root@99a0585cffb0 /]#
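For anyone repeating the binary search, a rough sweep like the following should work (a sketch only; HOST_IP is a placeholder for the host's address and the MTU list is arbitrary):
for mtu in 40000 44000 48000 48500 52000 65520; do
  echo "== mtu=$mtu =="
  podman run --rm --net slirp4netns:mtu=$mtu slirptest \
    bash -c "time curl -s -o /dev/null http://HOST_IP:8081/64k_file.bin"
done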
I just wanted to add that I have severe performance issues with the default MTU of 65520, too. I'm running on a Windows host with a Linux guest VM (Ubuntu 20.04), running iperf3 from a container on the VM and connecting to the host:
Outside of container: 5 Gbit/s
Rootful container: 5 Gbit/s
Rootless container, MTU 1500: 1.5 Gbit/s
Rootless container, MTU 65520: 60 Mbit/s (yes, megabits)
EDIT: For completeness, here are the stats when connecting from the host to the container with port driver slirp4netns. No MTU-dependent slowdown here:
Outside of container: 3 Gbit/s
Rootful container: 3 Gbit/s
Rootless container, MTU 1500: 1.6 Gbit/s
Rootless container, MTU 65520: 1.8 Gbit/s
EDIT again: For the tests I was using Docker Rootless v20.10.12.
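For context, an iperf3 comparison like the one above boils down to a plain client/server pair (a rough sketch, not the exact invocation; HOST_IP is a placeholder):
On the host (server side):
$ iperf3 -s
Inside the container, once per MTU setting:
# iperf3 -c HOST_IP -t 10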
Can verify the above. In a rootless Docker environment using nginx 1.20 to proxy internal containers, we were seeing many of the requests to nginx take 10+ seconds while requests directly to the proxied services took less than a second. MTU on the Docker daemon was set to 65520. Reducing MTU to 48000 fixed this issue.
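For anyone who wants to apply the same workaround: one common way to set a daemon-wide MTU for rootless Docker (a general sketch, not necessarily how it was configured in this setup) is the "mtu" key in daemon.json, which rootless Docker reads from ~/.config/docker:
$ mkdir -p ~/.config/docker
$ echo '{ "mtu": 48000 }' > ~/.config/docker/daemon.json
$ systemctl --user restart docker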
Using docker run --sysctl net.ipv4.tcp_rmem="4096 87380 6291456" as suggested in the slirp4netns README did not fix the issue and seemed to have no effect, although it's possible this was user error ¯\_(ツ)_/¯
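For reference, that flag is passed on the command line like this (the image name is just a placeholder):
$ docker run --rm --sysctl net.ipv4.tcp_rmem="4096 87380 6291456" some-image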
Ubuntu 20.04.2
Docker (rootless) 20.10.6
slirp4netns 0.4.3
Same issue, using podman 4.0.2 and slirp4netns 1.1.12 on AlmaLinux 9.0, though in my case MTU ~20000 is the sweet spot:
MTU=1500 gives ~1 Gbit/s
MTU=20000 gives ~9 Gbit/s (about the same as --network=host)
MTU=48000 goes back down to ~1 Gbit/s
MTU=65520 goes below 1 Gbit/s
These are iperf results between the container and another server in the same subnet. --sysctl net.ipv4.tcp_rmem="4096 87380 6291456" didn't help either.
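A sweep like that can be scripted roughly as follows (a sketch; SERVER_IP is a placeholder and the image is assumed to already contain iperf3):
for mtu in 1500 20000 48000 65520; do
  echo "== mtu=$mtu =="
  podman run --rm --net slirp4netns:mtu=$mtu image-with-iperf3 iperf3 -c SERVER_IP -t 10
done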