High pasta and rootlessport CPU load
Issue Description
I'm encountering unexpectedly high CPU load from pasta and rootlessport when running certain network operations.
Scenario 1 – Downloading large files:
Downloading Hetzner test files (although any large file download should do) from within a rootless container.
Scenario 2 – Wireguard VPN server:
Hosting a Wireguard server in a rootless container using the docker.io/linuxserver/wireguard image.
Steps to reproduce the issue
Scenario 1:
- Run a rootless container and download the test file to tmpfs:
  podman container run \
    --interactive \
    --tty \
    --rm \
    --tmpfs /mnt \
    --workdir /mnt \
    debian /bin/bash -c 'apt update && apt install -y wget && wget https://ash-speed.hetzner.com/10GB.bin'
- Monitor the download speed reported by wget and the CPU usage reported by htop (a filtered htop invocation is sketched after this list). If Hetzner's Ashburn datacenter is far from you, check their other locations and modify the URL's subdomain as needed.
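A minimal sketch for narrowing htop down to just the pasta process(es), if that is easier than scanning the full process list (assumes pgrep and htop are installed):

# show only pasta in htop; pgrep -d, prints a comma-separated PID list
htop -p "$(pgrep -d, pasta)"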
Scenario 2:
- Create a custom network:
  podman network create testnet
- Spawn a Wireguard server in a rootless container:
  podman container run \
    --detach \
    --rm \
    --cap-add NET_RAW \
    --cap-add NET_ADMIN \
    --sysctl net.ipv4.conf.all.forwarding=1 \
    --env PEERS=1 \
    --network=testnet \
    --publish 51820:51820/udp \
    docker.io/linuxserver/wireguard
- Install the client profile on a separate device using either the QR code provided in the server container's log or the peer1.conf file stored within the container itself (see the sketch after this list).
- Run a speedtest first with multiple connections and then with only a single connection enabled.
- Observe CPU load via htop.
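For example, getting at the client profile or the QR code again; the in-container path is an assumption based on the linuxserver image's usual layout and may differ between image versions:

# re-display the startup log, which includes the QR code
podman logs <container>
# or read the generated profile out of the container (path assumed)
podman exec <container> cat /config/peer1/peer1.conf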
Describe the results you received
Scenario 1:
When downloading Hetzner test files with wget in a rootless container, I reach consistent download speeds of ~43 MiB/s, but htop reports the pasta process using ~40-43% CPU load.
Scenario 2:
Running a network speedtest on the client results in extremely slow network speeds reported on the client and very high CPU load on the server. htop shows 90-100% CPU load from the pasta process and 10-17% CPU load from several rootlessport processes.
Wireguard's poor performance renders it essentially unusable. I was told the CPU load is unexpectedly high and that I should raise a bug report here. Please let me know if what I am encountering is actually to be expected from rootless containers.
Describe the results you expected
I expected less severe CPU load from the rootless networking backend and better performance/network speeds via the Wireguard VPN tunnel.
podman info output
host:
arch: amd64
buildahVersion: 1.37.1
cgroupControllers:
- cpu
- memory
- pids
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-1:2.1.12-1
path: /usr/bin/conmon
version: 'conmon version 2.1.12, commit: e8896631295ccb0bfdda4284f1751be19b483264'
cpuUtilization:
idlePercent: 96.76
systemPercent: 2.25
userPercent: 0.99
cpus: 4
databaseBackend: sqlite
distribution:
distribution: arch
version: unknown
eventLogger: journald
freeLocks: 2046
hostname: fern
idMappings:
gidmap:
- container_id: 0
host_id: 1003
size: 1
- container_id: 1
host_id: 100000
size: 65536
- container_id: 65537
host_id: 165536
size: 65536
uidmap:
- container_id: 0
host_id: 1001
size: 1
- container_id: 1
host_id: 100000
size: 65536
- container_id: 65537
host_id: 165536
size: 65536
kernel: 6.10.5-arch1-1
linkmode: dynamic
logDriver: journald
memFree: 15708397568
memTotal: 16651931648
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: aardvark-dns-1.12.1-1
path: /usr/lib/podman/aardvark-dns
version: aardvark-dns 1.12.1
package: netavark-1.12.2-1
path: /usr/lib/podman/netavark
version: netavark 1.12.2
ociRuntime:
name: crun
package: crun-1.16.1-1
path: /usr/bin/crun
version: |-
crun version 1.16.1
commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
rundir: /run/user/1001/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
pasta:
executable: /usr/bin/pasta
package: passt-2024_08_14.61c0b0d-1
version: |
pasta 2024_08_14.61c0b0d
Copyright Red Hat
GNU General Public License, version 2 or later
<https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
remoteSocket:
exists: false
path: /run/user/1001/podman/podman.sock
rootlessNetworkCmd: pasta
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: true
seccompEnabled: true
seccompProfilePath: /etc/containers/seccomp.json
selinuxEnabled: false
serviceIsRemote: false
slirp4netns:
executable: ""
package: ""
version: ""
swapFree: 6442446848
swapTotal: 6442446848
uptime: 0h 31m 3.00s
variant: ""
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries: {}
store:
configFile: /home/containers/.config/containers/storage.conf
containerStore:
number: 1
paused: 0
running: 1
stopped: 0
graphDriverName: overlay
graphOptions: {}
graphRoot: /storage/containers/storage
graphRootAllocated: 6001156685824
graphRootUsed: 94199808
graphStatus:
Backing Filesystem: btrfs
Native Overlay Diff: "true"
Supports d_type: "true"
Supports shifting: "false"
Supports volatile: "true"
Using metacopy: "false"
imageCopyTmpDir: /var/tmp
imageStore:
number: 3
runRoot: /run/user/1001/containers
transientStore: false
volumePath: /storage/containers/storage/volumes
version:
APIVersion: 5.2.1
Built: 1723672228
BuiltTime: Wed Aug 14 17:50:28 2024
GitCommit: d0582c9e1e6c80cc08c3f042b91993c853ddcbc6
GoVersion: go1.23.0
Os: linux
OsArch: linux/amd64
Version: 5.2.1
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
Running a fresh Arch installation on an Intel NUC5i7RYH, which is connected to my home's router via ethernet. No other services or containers running during testing.
Additional information
The experience I described above is consistent.
Thanks for reporting this!
Scenario 2: Running a network speedtest on the client results in extremely slow network speeds reported on the client and very high CPU load on the server. htop shows 90-100% CPU load from the pasta process and 10-17% CPU load from several rootlessport processes.
I wouldn't expect rootlessport processes to be around in this case. How many are running? Can you have a look at their command line? I'm wondering if there's some unintended loop between port mappings or suchlike.
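One possible way to list them with their full command lines (a sketch; any process listing works):

# full command lines of all pasta and rootlessport processes
pgrep -af 'pasta|rootlessport'
# or, with parent PIDs, to spot who spawned what:
ps -eo pid,ppid,args | grep -E '[p]asta|[r]ootlessport'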
About Scenario 1: I can reproduce something similar, in my case downloading from the fsn1 Hetzner location. I download at approximately 1 Gbps (110 MB/s), and pasta uses approximately 70% of a CPU thread (wget is at ~50%).
The good news is that, with higher transfer rates on the same system, I get approximately 14 Gbps at 100% CPU thread load, and if I CPU-starve pasta during the test file download, I get approximately the same transfer rates. That is, if we have bigger chunks of data to transfer (because we have less CPU time), the CPU load doesn't increase linearly, so I don't see a practical issue with it.
Anyway, this is what perf:
$ perf record -g ./pasta --config-net
# wget https://fsn1-speed.hetzner.com/1GB.bin -O /dev/null
--2024-08-21 10:07:11-- https://fsn1-speed.hetzner.com/1GB.bin
Resolving fsn1-speed.hetzner.com (fsn1-speed.hetzner.com)... 2a01:4f8:0:a232::2, 78.46.170.2
Connecting to fsn1-speed.hetzner.com (fsn1-speed.hetzner.com)|2a01:4f8:0:a232::2|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1073741824 (1.0G) [application/octet-stream]
Saving to: ‘/dev/null’
/dev/null 100%[===================>] 1.00G 109MB/s in 9.4s
2024-08-21 10:07:21 (109 MB/s) - ‘/dev/null’ saved [1073741824/1073741824]
#
logout
[ perf record: Woken up 18 times to write data ]
[ perf record: Captured and wrote 5.080 MB perf.data (37680 samples) ]
$ perf report
says about it:
Samples: 37K of event 'cycles', Event count (approx.): 30751209003
Children Self Command Shared Object Symbol
- 80.23%  0.56%  passt.avx2  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
    79.67% entry_SYSCALL_64_after_hwframe
      - do_syscall_64
         + 20.90% do_writev
         + 19.63% __sys_recvmsg
         + 13.53% __x64_sys_epoll_wait
         +  5.35% ksys_read
         +  4.93% __x64_sys_epoll_ctl
         +  4.01% __x64_sys_timerfd_settime
         +  3.86% __x64_sys_recvfrom
         +  3.29% syscall_trace_enter.constprop.0
         +  2.75% syscall_exit_to_user_mode
do_writev() is the system call writing to the tap interface of the container (we have to write one frame at a time, because tap gives us a file descriptor, not a socket), and __sys_recvmsg() are the reads from the socket.
There isn't much we can do about that, except for a planned VDUSE ¹ ² back-end. At that point, we'll get rid of those system calls and just "move" data between a shared memory ring and sockets.
The rest is something we could probably try to improve on in the shorter term, for example by trying to get bigger chunks of data at a time and reducing the wakeup frequency (__x64_sys_epoll_wait() and friends). I tried this:
diff --git a/tcp_buf.c b/tcp_buf.c
index c31e9f3..c15de64 100644
--- a/tcp_buf.c
+++ b/tcp_buf.c
@@ -477,6 +477,11 @@ int tcp_buf_data_from_sock(struct ctx *c, struct tcp_tap_conn *conn)
 		len = recvmsg(s, &mh_sock, MSG_PEEK);
 	while (len < 0 && errno == EINTR);
+	if (len > (1 << 16)) {
+		int lowat = 1 << 17;
+		setsockopt(s, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat));
+	}
+
 	if (len < 0)
 		goto err;
CPU load is 15-18% at the same transfer rate. The transfer hangs at the end (this would need the same adaptive logic we have in tcp_splice.c), and we substantially decrease overhead from wakeups and bookkeeping:
Children Self Command Shared Object Symbol
- 91.64%  0.12%  passt.avx2  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
    91.53% entry_SYSCALL_64_after_hwframe
      - do_syscall_64
         + 36.24% __sys_recvmsg
         + 24.94% ksys_read
         + 20.90% do_writev
         +  3.35% __x64_sys_epoll_wait
         +  1.93% __x64_sys_recvfrom
         +  1.08% __x64_sys_timerfd_settime
            0.73% syscall_exit_to_user_mode
            0.62% syscall_trace_enter.constprop.0
            0.51% __x64_sys_setsockopt
So this is something we could consider, even though, again, that CPU load shouldn't look that scary, because it's quite adaptive.
I didn't look into Scenario 2 yet.
I wouldn't expect rootlessport processes to be around in this case. How many are running? Can you have a look at their command line? I'm wondering if there's some unintended loop between port mappings or suchlike.
After rebooting my system to start with a clean slate, I started my Wireguard container and observed eight rootlessport and six rootlessport-child processes spawn. I spent some time on the client using the network: Opening webpages, downloading Hetzner test files, etc. After several minutes I noticed that a couple more rootlessport processes had appeared, bringing the total number to ten. All of the rootlessport and rootlessport-child processes were reaped once I stopped the container.
As an additional piece of info, I am using Quadlet (systemctl --user start wireguard) to manage this container.
# wireguard.container
[Unit]
Description=Wireguard server quadlet
[Container]
Image=docker.io/linuxserver/wireguard:latest
AddCapability=NET_ADMIN
AddCapability=NET_RAW
Sysctl=net.ipv4.conf.all.forwarding=1
Environment=PEERS=1
Environment=INTERNAL_SUBNET=10.30.0.0/24
# Volume=wireguard-config.volume:/config
Network=wireguard.network
PublishPort=51820:51820/udp
[Service]
Restart=always
[Install]
WantedBy=multi-user.target default.target
No options aside from the heading ([Network]) are provided in the wireguard.network file.
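For reference, recreating that file amounts to nothing more than the following (the directory is assumed to be the standard rootless quadlet location):

cat > ~/.config/containers/systemd/wireguard.network <<'EOF'
[Network]
EOF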
Jumping in to say that I'm experiencing the exact same issue, running an almost identical configuration to the OP's (I'm also using Quadlet to run the VPN container). Feels good not to be alone; this thing was driving me crazy. As a piece of additional information that could help zero in on this, I'll say that I've been running this configuration for more than 4 months now and the issue started only recently (the last month or so). Here's my podman info:
host:
arch: amd64
buildahVersion: 1.37.1
cgroupControllers:
- cpu
- memory
- pids
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.12-1.fc40.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.1.12, commit: '
cpuUtilization:
idlePercent: 98.31
systemPercent: 1.14
userPercent: 0.55
cpus: 4
databaseBackend: sqlite
distribution:
distribution: fedora
variant: cloud
version: "40"
eventLogger: journald
freeLocks: 2040
hostname: hekate
idMappings:
gidmap:
- container_id: 0
host_id: 1001
size: 1
- container_id: 1
host_id: 589824
size: 65536
uidmap:
- container_id: 0
host_id: 1001
size: 1
- container_id: 1
host_id: 589824
size: 65536
kernel: 6.10.6-200.fc40.x86_64
linkmode: dynamic
logDriver: journald
memFree: 933859328
memTotal: 6203207680
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: aardvark-dns-1.12.1-1.fc40.x86_64
path: /usr/libexec/podman/aardvark-dns
version: aardvark-dns 1.12.1
package: netavark-1.12.2-1.fc40.x86_64
path: /usr/libexec/podman/netavark
version: netavark 1.12.2
ociRuntime:
name: crun
package: crun-1.15-1.fc40.x86_64
path: /usr/bin/crun
version: |-
crun version 1.15
commit: e6eacaf4034e84185fd8780ac9262bbf57082278
rundir: /run/user/1001/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
os: linux
pasta:
executable: /usr/bin/pasta
package: passt-0^20240814.g61c0b0d-1.fc40.x86_64
version: |
pasta 0^20240814.g61c0b0d-1.fc40.x86_64
Copyright Red Hat
GNU General Public License, version 2 or later
<https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
remoteSocket:
exists: true
path: /run/user/1001/podman/podman.sock
rootlessNetworkCmd: pasta
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: true
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: false
slirp4netns:
executable: ""
package: ""
version: ""
swapFree: 6202322944
swapTotal: 6202322944
uptime: 24h 59m 31.00s (Approximately 1.00 days)
variant: ""
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- docker.io
store:
configFile: /home/podman/.config/containers/storage.conf
containerStore:
number: 4
paused: 0
running: 4
stopped: 0
graphDriverName: overlay
graphOptions: {}
graphRoot: /home/podman/.local/share/containers/storage
graphRootAllocated: 428340129792
graphRootUsed: 24915517440
graphStatus:
Backing Filesystem: btrfs
Native Overlay Diff: "true"
Supports d_type: "true"
Supports shifting: "false"
Supports volatile: "true"
Using metacopy: "false"
imageCopyTmpDir: /var/tmp
imageStore:
number: 6
runRoot: /run/user/1001/containers
transientStore: false
volumePath: /home/podman/.local/share/containers/storage/volumes
version:
APIVersion: 5.2.1
Built: 1723593600
BuiltTime: Wed Aug 14 02:00:00 2024
GitCommit: ""
GoVersion: go1.22.5
Os: linux
OsArch: linux/amd64
Version: 5.2.1
Please let me know if there's anything else I can provide to help with this.
I'll say that I've been running this configuration for more than 4 months now and the issue started only recently (the last month or so)
Do you happen to know starting from which version of Podman and pasta (the package is named passt) you are seeing this?
I'll say that I've been running this configuration for more than 4 months now and the issue started only recently (the last month or so)
Do you happen to know starting from which version of Podman and pasta (the package is named passt) you are seeing this?
I started noticing the issue around the beginning of August. Looking at my DNF history, I can make some educated guesses. It could be either this update (applied on the 25th of July):
Packages Altered:
Upgrade podman-5:5.2.0~rc2-1.fc40.x86_64 @updates-testing
Upgraded podman-5:5.1.2-1.fc40.x86_64 @@System
or this one (August 4th):
Packages Altered:
Upgrade aardvark-dns-2:1.12.1-1.fc40.x86_64 @updates-testing
Upgraded aardvark-dns-2:1.11.0-3.fc40.x86_64 @@System
Upgrade glibc-2.39-22.fc40.x86_64 @updates-testing
Upgraded glibc-2.39-17.fc40.x86_64 @@System
Upgrade glibc-common-2.39-22.fc40.x86_64 @updates-testing
Upgraded glibc-common-2.39-17.fc40.x86_64 @@System
Upgrade glibc-gconv-extra-2.39-22.fc40.x86_64 @updates-testing
Upgraded glibc-gconv-extra-2.39-17.fc40.x86_64 @@System
Upgrade glibc-langpack-en-2.39-22.fc40.x86_64 @updates-testing
Upgraded glibc-langpack-en-2.39-17.fc40.x86_64 @@System
Upgrade glibc-locale-source-2.39-22.fc40.x86_64 @updates-testing
Upgraded glibc-locale-source-2.39-17.fc40.x86_64 @@System
Upgrade glibc-minimal-langpack-2.39-22.fc40.x86_64 @updates-testing
Upgraded glibc-minimal-langpack-2.39-17.fc40.x86_64 @@System
Upgrade libgcc-14.2.1-1.fc40.x86_64 @updates-testing
Upgraded libgcc-14.1.1-7.fc40.x86_64 @@System
Upgrade libgomp-14.2.1-1.fc40.x86_64 @updates-testing
Upgraded libgomp-14.1.1-7.fc40.x86_64 @@System
Upgrade libstdc++-14.2.1-1.fc40.x86_64 @updates-testing
Upgraded libstdc++-14.1.1-7.fc40.x86_64 @@System
Upgrade netavark-2:1.12.1-1.fc40.x86_64 @updates-testing
Upgraded netavark-2:1.11.0-3.fc40.x86_64 @@System
Upgrade podman-5:5.2.0-1.fc40.x86_64 @updates-testing
Upgraded podman-5:5.2.0~rc2-1.fc40.x86_64 @@System
The only update I see for passt seems too recent (16th of August)
My bad, there was another update on the 28th of July. Sorry I missed it.
Packages Altered:
Upgrade passt-0^20240726.g57a21d2-1.fc40.x86_64 @updates-testing
Upgraded passt-0^20240624.g1ee2eca-1.fc40.x86_64 @@System
Upgrade passt-selinux-0^20240726.g57a21d2-1.fc40.noarch @updates-testing
Upgraded passt-selinux-0^20240624.g1ee2eca-1.fc40.noarch @@System
The one before was on June 25th, but that feels too early, and the one after that was on August 9th (too late). Hope this helps.
Any chance you could try downgrading to passt-0^20240624.g1ee2eca-1.fc40.x86_64 just for a quick test and see if that helps with CPU load?
Any chance you could try downgrading to passt-0^20240624.g1ee2eca-1.fc40.x86_64 just for a quick test and see if that helps with CPU load?
I can't seem to find the old package, not easily at least. Do you know where I can get it?
I managed to downgrade to passt 0^20240326.g4988e2b-1.fc40.x86_64 and the issue seems in fact mostly resolved. I still see a lot of rootlessport processes (see below), but CPU usage has improved greatly and the server is back to being usable.
podman 1138 0.5 0.4 72424 27420 ? Ss 14:16 0:01 /usr/bin/pasta --config-net --address 10.0.2.0 --netmask 24 --gateway 10.0.2.2 --dns-forward 10.0.2.3 --pid /run/user/1001/containers/networks/rootless-netns/rootless-netns-conn.pid -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1001/containers/networks/rootless-netns/rootless-netns
podman 1263 0.0 0.0 1745796 5268 ? Sl 14:16 0:00 rootlessport
podman 1273 0.0 0.0 1598332 4836 ? Sl 14:16 0:00 rootlessport-child
podman 1377 1.0 0.0 1819784 5696 ? Sl 14:16 0:02 rootlessport
podman 1382 0.0 0.0 1524344 4988 ? Sl 14:16 0:00 rootlessport-child
It is notably better, but I can't say for certain that it is back to "normal". I will have to invest some more time checking and testing.
I can't seem to find the old package, not easily at least. Do you know where I can get it?
I'm not sure how you solved this, and for how long Fedora packages are retained, but we keep a permanent mirror of Copr builds (including Fedora packages) at https://passt.top/builds/copr/.
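If it helps, a possible way to use that mirror; the exact RPM filename under the mirror is an assumption, so check the directory listing first:

# after downloading the matching RPM from https://passt.top/builds/copr/:
sudo dnf install ./passt-0^20240624.g1ee2eca-1.fc40.x86_64.rpm
# or, if the older build is still available in the enabled repositories:
sudo dnf downgrade passt-0^20240624.g1ee2eca-1.fc40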
I managed to downgrade to passt 0^20240326.g4988e2b-1.fc40.x86_64 and the issue seems in fact mostly resolved.
It is notably better, but I can't say for certain that it is back to "normal". I will have to invest some more time checking and testing.
Thanks, this is extremely helpful.
Thank you! I used the mirror and I was able to downgrade to passt-0^20240624.g1ee2eca-1.fc40.x86_64. I can confirm that this version is working fine (as far as I can tell).
I also did a quick test and updated back to the latest version available on Fedora 40 (passt-0^20240814.g61c0b0d-1.fc40.x86_64), and CPU usage is back to being abnormally high (60%-70% with only wireguard running). pasta (see below) is also back to being the process with the highest CPU usage.
/usr/bin/pasta --config-net --address 10.0.2.0 --netmask 24 --gateway 10.0.2.2 --dns-forward 10.0.2.3 --pid /run/user/1001/containers/networks/rootless-netns/rootless-netns-conn.pid -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1001/containers/networks/rootless-netns/rootless-netns
I tested passt-0^20240726.g57a21d2-1.fc40.x86_64 and it looks like this is the first problematic version. I hope this helps narrow down the problem.
I wouldn't expect rootlessport processes to be around in this case. How many are running? Can you have a look at their command line? I'm wondering if there's some unintended loop between port mappings or suchlike.
After rebooting my system to start with a clean slate, I started my Wireguard container and observed eight rootlessport and six rootlessport-child processes spawn. I spent some time on the client using the network: Opening webpages, downloading Hetzner test files, etc. After several minutes I noticed that a couple more rootlessport processes had appeared, bringing the total number to ten. All of the rootlessport and rootlessport-child processes were reaped once I stopped the container.
Your original reproducer didn't show you using a custom network; custom networks use a different architecture from the default pasta network mode, see https://github.com/containers/podman/discussions/22943#discussioncomment-9795883
So using rootlessport is normal when using custom networks. And 10-17% CPU doesn't sound particularly high to me; the process must proxy all the data, which is not exactly cheap.
But I do agree that the pasta numbers look way too high for the amount of throughput.
@pwige Hi. I'm looking specifically at Scenario 2. As @sbrivio-rh notes, that looks like a regression caused by the flow table, which I implemented.
Unfortunately, I haven't yet been able to reproduce the problem. I ran the wireguard container as described and ran a speedtest through it. While the transfer speeds are significantly slower than without the tunnel, they were still respectable (6-18 Mbps depending on which exact variant). pasta CPU usage topped out at 10-20%. I'm not seeing any rootlessport processes. There are a lot of possible differences between my setup and yours, so I don't have a quick guess as to which one is making the difference.
So, here's a grab bag of questions, hoping that something will give a clue as to what the triggering circumstances are:
1. Can you roughly quantify the "extremely slow" speeds you're seeing?
   1.1. Roughly what throughput does speedtest report through wireguard+pasta?
   1.2. Roughly what throughput does it report from the same client without the wireguard tunnel?
   1.3. Roughly what throughput is reported using wireguard+pasta, but with the older pasta version (passt-0^20240624.g1ee2eca-1.fc40.x86_64 or earlier)?
2. What's the connection between the client machine and the wireguard server? Are they different containers (or VMs) on the same host? Is the client on a physically different machine on the same physical network? On an entirely different network?
3. Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?
4. What sort of host is the wireguard container running on? In particular, how many cores and threads does it have?
@pwige Oh, sorry, one more (5):
Run a speedtest first with multiple connections and then with only a single connection enabled.
I'm not entirely sure what you mean here. I couldn't see any obvious options for multiple vs. single connections.
@pwige Oh, sorry, one more (5):
Run a speedtest first with multiple connections and then with only a single connection enabled.
I'm not entirely sure what you mean here. I couldn't see any obvious options for multiple vs. single connections.
Sorry, somehow missed that option on the speedtest page. Found it now.
Did some further testing, and I sometimes see pasta CPU load up to 45-50%, but throughput still seems respectable.
Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?
Rootlessport is used in bridge mode (i.e. custom networks) and is started before the container. It handles all the port-forwarding traffic in this case (so no pasta port forward option is involved). The traffic that goes through pasta here consists only of the connections initiated from the container (i.e. the ones that are not a direct reply to the ip/port from rootlessport). So in Scenario 2 it very much depends on how wireguard binds/connects to its UDP sockets to know how the connection is flowing.
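A quick way to see this split on the host, assuming the published UDP port from the reproducer (51820); the owning process should show up as rootlessport, though the exact name in the output may vary:

# run as the rootless user on the host
ss -ulpn 'sport = :51820'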
I'm using version 2024_08_21.1d6142f and also experiencing very high load; my use case is a bit different.
I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default.
passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark.
Pausing the seeding torrents does not have an effect on the load of the process, so despite receiving no substantial traffic, the process continues to saturate its thread.
EDIT: I should've mentioned that this problem was happening using the default podman bridge network. I say this because @kalelkenobi reportedly (see later comments) only had problems using a custom network.
Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?
Rootlessport is used in bridge mode (i.e. custom networks) and is started before the container. It handles all the port-forwarding traffic in this case (so no pasta port forward option is involved). The traffic that goes through pasta here consists only of the connections initiated from the container (i.e. the ones that are not a direct reply to the ip/port from rootlessport). So in Scenario 2 it very much depends on how wireguard binds/connects to its UDP sockets to know how the connection is flowing.
Ok, so I'm a bit confused at this point. The original instructions for reproducing don't seem to be setting up a custom network - just starting a regular container. However the presence of rootlessport seems to indicate that there is a custom network in play. @pwige can you shed some light on what network configuration you're using?
I'm using version 2024_08_21.1d6142f and also experiencing very high load; my use case is a bit different. I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default. passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark. Pausing the seeding torrents does not have an effect on the load of the process, so despite receiving no substantial traffic, the process continues to saturate its thread.
Huh, interesting. The good news is this strongly suggests we're doing something Just Plain Wrong (as opposed to a more subtle inefficiency), and once we find it, it should be a relatively easy fix. The bad news is that I don't yet have any good theories as to what's going on.
Did some further testing, thanks to you guys pointing out that rootlessport is only needed in cases where a custom network is used, and I can confirm that the problem is NOT present when using the default network (or the host network). Sorry if I haven't mentioned this before, but this was out of sheer ignorance on my part. Please let me know if there's anything else I can test or information I can provide about this.
P.S. This whole thing has been hugely educational for me and has made me realize that my setup (where I have multiple containers using a custom network) is not very efficient CPU-wise. Could you guys give me a pointer as to what would be a better way to have multiple containers communicate with each other? If that needs to be a longer conversation, I'd be happy to post a question in Discussions or do some further research myself. Thanks anyway :)
@dgibson Hi!
1. Can you roughly quantify the "extremely slow" speeds you're seeing?
1.1. Roughly what throughput does speedtest report through wireguard+pasta?
1.2. Roughly what throughput does it report from the same client without the wireguard tunnel?
The extremely slow speeds I had previously seen were around 0.5MB/s down and 0.1MB/s up. A "clean" speedtest where I'm not connected to the vpn gives me ~450MB/s down and ~165MB/s up.
Here are two videos showing the speedtest on the client and the CPU usage on the server as they happened in real-time. The client was my laptop, which exists on the same physical network as the server.
https://github.com/user-attachments/assets/9160b14a-54cf-4ecd-8aa9-cbf06eba2531
https://github.com/user-attachments/assets/20268a45-ebc0-4b72-9b70-57aa509819a4
Despite the speeds shown here not being nearly as bad as they were previously (no complaints here), the CPU usage isn't noticeably different. I'm not sure what could account for the change in throughput other than general instability from my ISP, as I haven't changed anything on my end. It makes me a bit embarrassed, to be honest. I haven't had the opportunity today to perform this test when connected to a different physical network, but may have the chance to tomorrow.
1.3. Roughly what throughput is reported using wireguard+pasta, but with the older pasta version (`passt-0^20240624.g1ee2eca-1.fc40.x86_64` or earlier)?
I'm on Arch, so I can't use any of the Copr builds. I can build from source from tags 2024_06_24.1ee2eca or 2024_06_07.8a83b53. Please let me know if simply copying the resulting passt binary to /usr/local/bin is enough for testing. Thanks.
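For reference, what I would run is roughly the following (the upstream git URL and the podman info template field are assumptions on my part):

git clone https://passt.top/passt
cd passt
git checkout 2024_06_24.1ee2eca
make
# then confirm which pasta binary podman actually picks up, e.g.:
podman info --format '{{.Host.Pasta.Executable}}'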
2. What's the connection between the client machine and the wireguard server? Are they different containers (or VMs) on the same host? Is the client on a physically different machine on the same physical network? On an entirely different network?
The server is running in a rootless container directly on the host. The client(s) are other devices connected to the same physical network. As mentioned above, I might have the chance to test from different physical networks tomorrow.
3. Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?
The first eight appear immediately after starting the container. I can get an additional rootlessport process to spawn when I first connect a client to the wireguard vpn. The maximum I've seen is ten.
4. What sort of host is the wireguard container running on? In particular how many cores & threads does it have?
I am using an Intel NUC5i7RYH, with the containers running directly on the host. I haven't imposed any resource limits on the container, so it has unimpeded access to both cores (4 threads) and all 16GB of RAM. The device itself is connected to my home LAN via ethernet.
Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?
Rootlessport is used in bridge mode (i.e. custom networks) and is started before the container. It handles all the port-forwarding traffic in this case (so no pasta port forward option is involved). The traffic that goes through pasta here consists only of the connections initiated from the container (i.e. the ones that are not a direct reply to the ip/port from rootlessport). So in Scenario 2 it very much depends on how wireguard binds/connects to its UDP sockets to know how the connection is flowing.
Ok, so I'm a bit confused at this point. The original instructions for reproducing don't seem to be setting up a custom network - just starting a regular container. However the presence of rootlessport seems to indicate that there is a custom network in play. @pwige can you shed some light on what network configuration you're using?
The steps I provided to reproduce the issue don't use a custom network, something which I am doing in "production." That's an oversight on my part. Sorry! I've updated the issue's original text to reflect this.
My network configuration is defined in a Quadlet .network file containing nothing but an empty [Network] section. Here is the output of podman network inspect:
[
{
"name": "systemd-wireguard",
"id": "41f1cabb2b1a194ec63a730798a6d972cba65b22699be76714cf6259558c207c",
"driver": "bridge",
"network_interface": "podman1",
"created": "2024-08-27T17:36:04.729533184-04:00",
"subnets": [
{
"subnet": "10.89.0.0/24",
"gateway": "10.89.0.1"
}
],
"ipv6_enabled": false,
"internal": false,
"dns_enabled": true,
"ipam_options": {
"driver": "host-local"
},
"containers": {
"f9bacc5a3db909b3362249fa39088241d3a842cb3d3868d04d410c6cf3fbe53d": {
"name": "systemd-wireguard",
"interfaces": {
"eth0": {
"subnets": [
{
"ipnet": "10.89.0.7/24",
"gateway": "10.89.0.1"
}
],
"mac_address": "3a:4a:98:4d:6e:7f"
}
}
}
}
}
]
By the way:
I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default. passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark.
I tried to reproduce this case, but I couldn't, at least not yet. Perhaps it's because of the torrent content I chose (Linux distribution images): I download very fast (50-100 MB/s, and pasta reaches 40-50% load on a CPU thread) but seeding is not constantly using so much bandwidth. I see spikes in upload rates from time to time, but otherwise I'm constantly seeding at some dozen KB/s to a handful of peers.
It would be interesting if you could share the output of one second or less of strace -f -p <pasta's PID> when the CPU load is unexpectedly high (that is, when you don't expect any substantial traffic). You'll need to run strace as root, because pasta makes itself un-ptrace()able by regular users.
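If it helps, something like this should capture it (a sketch; adjust the pgrep pattern if the process shows up as passt.avx2 or pasta.avx2):

# strace must run as root; the capture stops by itself after one second
sudo timeout 1 strace -f -p "$(pgrep -n pasta)" 2> pasta-strace.txt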
@pwige, thanks for the additional information. Between your comment and the one from @kalelkenobi it certainly seems like this is triggered by the custom network setup. I'm suspecting maybe some sort of forwarding loop between pasta and rootlesskit.
I'm not really familiar with quadlet and custom network configuration though, @Luap99 any chance you could interpret @pwige's configuration into specific commands I'd need to run to create something similar?
@sbrivio-rh and I discussed this and attempted a number of things. Alas, we still don't have any clear reproducer.
I've been focusing on the wireguard container case. Using a podman custom network, I now see rootlessport processes, but the bandwidth and CPU load didn't change notably. I do see ~40% load, which is high-ish, but it didn't change appreciably between the 20240624 and current versions of pasta. I have a pretty fast CPU but a pretty modest uplink, so it's possible I'm just not able to generate enough throughput to trigger the problem using an external server like speedtest.
One thing we realised is that we were thinking the wireguard case would be exercising pasta's UDP path, but that's not the case. With the custom network it's rootlesskit forwarding the UDP packets for the tunnel to the wireguard server; pasta is just handling the onward traffic from there to the speedtest server. I'm guessing speedtest uses TCP primarily, so all the instances here might be purely TCP related.
@kalelkenobi, I realise your case is similar to @pwige's, but I think it would help if you can describe your reproducer as precisely as you can. What are all the steps to set up your configuration, and what exactly is the difference you're seeing between the old and new pasta versions? That might help us to focus in on what's triggering the poor behaviour.
I'm not really familiar with quadlet and custom network configuration though, @Luap99 any chance you could interpret @pwige's configuration into specific commands I'd need to run to create something similar?
Running with the custom network code path is as simple as podman run --network bridge ... as rootless
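Or, approximating @pwige's quadlet setup with plain commands (a sketch; the network name is arbitrary and the options are taken from the reproducer earlier in this issue):

podman network create testnet
podman run --detach --rm \
    --cap-add NET_ADMIN --cap-add NET_RAW \
    --sysctl net.ipv4.conf.all.forwarding=1 \
    --env PEERS=1 \
    --network testnet \
    --publish 51820:51820/udp \
    docker.io/linuxserver/wireguard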
One thing we realised is that we were thinking the wireguard case would be exercising pasta's UDP path, but that's not the case. With the custom network it's rootlesskit forwarding the UDP packets for the tunnel to the wireguard server; pasta is just handling the onward traffic from there to the speedtest server. I'm guessing speedtest uses TCP primarily, so all the instances here might be purely TCP related.
Yes, exactly, this is what I meant above: all the port-forwarding traffic goes through rootlessport, and that didn't change, I think.
The one thing that is missing here is how wireguard is configured. There really isn't a strict server/client model, so if the server config has a Peer with an Endpoint, it would use that to connect rather than the incoming UDP connection from rootlessport, so I could see a case where we have some form of asymmetric routing: incoming via rootlessport, outgoing via pasta. Given that UDP is stateless, that really shouldn't matter, but it may play a role here if that is the case.
I think running tcpdump in the container may help to see what traffic is being generated, i.e. podman unshare nsenter -n$(podman inspect --format '{{.NetworkSettings.SandboxKey}}' <container name>) tcpdump -nn -i any
But yes, for the simple download from within the container it should only be the TCP path in pasta, though in the case of custom networks there is the extra NAT, bridge + veth setup as shown here https://github.com/containers/podman/discussions/22943#discussioncomment-9795883, but I don't think it would cause issues.
By the way:
I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default. passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark. Pausing the seeding torrents does not have an effect on the load of the process, so despite receiving no substantial traffic, the process continues to saturate its thread.
I tried to reproduce this case, but I couldn't, at least not yet. Perhaps it's because of the torrent content I chose (Linux distribution images): I download very fast (50-100 MB/s, and pasta reaches 40-50% load on a CPU thread) but seeding is not constantly using so much bandwidth. I see spikes in upload rates from time to time, but otherwise I'm constantly seeding at some dozen KB/s to a handful of peers.
It would be interesting if you could share the output of one second or less of strace -f -p <pasta's PID> when the CPU load is unexpectedly high (that is, when you don't expect any substantial traffic). You'll need to run strace as root, because pasta makes itself un-ptrace()able by regular users.
I'm having the same-ish issue with linuxserver/qbittorrent running rootless. Here is the output of strace -f -p (for about 5 seconds). log.txt