
Bad scaling due to flooding and overhead of copying packets in limactl

nirs opened this issue 1 year ago • 3 comments

Throughput decreases and CPU usage increases significantly as more vms are connected to the same socket_vmnet daemon.

Tested using:

  • host: running iperf3 -c ...
  • server vm: running iperf3 -s
  • 1-4 additional idle vms
vms  bitrate (Gbits/sec)  cpu (%)
1    3.52                 51.23
2    2.42                 58.17
3    1.22                 81.28
4    0.81                 93.07

Expected behavior

  • Performance and CPU usage should remain the same when more idle vms are added
  • Packets sent to one vm should not be forwarded to other vms
  • Packets should be copied directly to the vz datagram socket in socket_vmnet, bypassing limactl
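
One way to avoid the flooding would be a MAC learning table, similar to what a network switch does. This is only a hypothetical sketch (socket_vmnet has no such table today, and all names here are illustrative):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical MAC learning table, not something socket_vmnet has
 * today: learn which client fd owns which MAC from source addresses,
 * then send unicast packets to a single client instead of flooding. */
enum { TABLE_SIZE = 16 };

struct mac_entry {
  uint8_t mac[6];
  int fd; /* client socket that owns this MAC */
};

static struct mac_entry mac_table[TABLE_SIZE];
static int mac_table_len = 0;

/* Learn the source MAC of a packet received from a client fd. */
static void learn(const uint8_t *src_mac, int fd) {
  for (int i = 0; i < mac_table_len; i++)
    if (memcmp(mac_table[i].mac, src_mac, 6) == 0) {
      mac_table[i].fd = fd;
      return;
    }
  if (mac_table_len < TABLE_SIZE) {
    memcpy(mac_table[mac_table_len].mac, src_mac, 6);
    mac_table[mac_table_len].fd = fd;
    mac_table_len++;
  }
}

/* Return the owning fd for a destination MAC, or -1 to flood
 * (broadcast/multicast or unknown destination). */
static int lookup(const uint8_t *dst_mac) {
  if (dst_mac[0] & 1)
    return -1; /* group bit set: broadcast/multicast must still flood */
  for (int i = 0; i < mac_table_len; i++)
    if (memcmp(mac_table[i].mac, dst_mac, 6) == 0)
      return mac_table[i].fd;
  return -1;
}
```

With such a table, only broadcast/multicast and unknown destinations would still be flooded; unicast traffic would touch exactly one client socket.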

Why it happens

When we have multiple vms connected to socket_vmnet:

  • every packet sent from the vmnet interface is forwarded to every vm, instead of only to the vm with the matching mac address.
  • every packet sent from any vm is forwarded to all other vms and to the vmnet interface, instead of only to the destination vm or only to the vmnet interface.
  • when a packet is forwarded to a vm, it is copied to the vz datagram socket via a socket pair in limactl.
  • packets forwarded from limactl to vz are copied and processed in the guest, where they are dropped (since the packets are not related to the guest).
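
A rough cost model for these bullets, assuming each delivery to a vm costs about three packet copies (an illustrative assumption, not a measurement):

```c
/* Illustrative cost model: each delivery of a vmnet packet to a vm
 * costs roughly three copies, the stream-socket write in socket_vmnet,
 * the relay copy inside limactl, and the write into the vz datagram
 * socket. The figure of 3 is assumed for illustration. */
static int copies_per_vmnet_packet(int n_vms) {
  return 3 * n_vms; /* grows linearly with the number of connected vms */
}

/* Of the n_vms deliveries, only one reaches the right guest; the rest
 * are dropped after being fully copied and processed. */
static int wasted_deliveries(int n_vms) {
  return n_vms - 1;
}
```

Under this model, going from 1 to 4 vms quadruples the copy work per packet while keeping exactly one useful delivery, which matches the direction of the measured scaling above.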

Flow when receiving a packet from vmnet with 4 vms

host iperf3 ->
  host kernel ->
    vmnet -> 
      socket_vmnet ->
        host kernel ->
          limactl ->
            host kernel ->
              vz -> 
                guest kernel ->
                  guest iperf3
        host kernel ->
          limactl ->
            host kernel ->
              vz -> 
                guest kernel (drop)
        host kernel ->
          limactl ->
            host kernel ->
              vz -> 
                guest kernel (drop)
        host kernel ->
          limactl ->
            host kernel ->
              vz -> 
                guest kernel (drop)

Flow when receiving a packet from a vm

guest iperf3 ->
  guest kernel ->
    vz ->
      host kernel ->
        limactl ->
          host kernel ->
            socket_vmnet ->
              vmnet ->
                host kernel ->
                  host iperf3
                host kernel ->
                  limactl ->
                    host kernel ->
                      vz -> 
                        guest kernel (drop)
                host kernel ->
                  limactl ->
                    host kernel ->
                      vz -> 
                        guest kernel (drop)
                host kernel ->
                  limactl ->
                    host kernel ->
                      vz -> 
                        guest kernel (drop)

CPU usage for all vms processes

Looking at the CPU usage of the socket_vmnet, vm service, and limactl processes, we see extreme CPU usage related to processing partly or completely unrelated packets:

command           %cpu   related
com.apple.Virtua  136.9  yes
limactl           121.4  yes
iperf3-darwin     13.7   yes
socket_vmnet      106.6  partly
kernel_task       39.1   partly
com.apple.Virtua  83.5   no
com.apple.Virtua  81.0   no
com.apple.Virtua  77.4   no
limactl           67.1   no
limactl           65.6   no
limactl           62.9   no

Total cpu usage:

work       %cpu
related    272.0
partly     145.7
unrelated  437.5

Tested on M1 Pro (8 performance cores, 2 efficiency cores)

Full results

1 vm

% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[  5] local 192.168.105.1 port 60990 connected to 192.168.105.58 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd          RTT
[  5]   0.00-1.00   sec   460 MBytes  3.86 Gbits/sec    0   8.00 MBytes   9ms     
[  5]   1.00-2.00   sec   421 MBytes  3.53 Gbits/sec    0   8.00 MBytes   9ms     
[  5]   2.00-3.00   sec   435 MBytes  3.65 Gbits/sec    0   8.00 MBytes   10ms     
[  5]   3.00-4.00   sec   411 MBytes  3.45 Gbits/sec    0   8.00 MBytes   14ms     
[  5]   4.00-5.00   sec   317 MBytes  2.66 Gbits/sec    0   8.00 MBytes   9ms     
[  5]   5.00-6.00   sec   430 MBytes  3.61 Gbits/sec    0   8.00 MBytes   9ms     
[  5]   6.00-7.00   sec   423 MBytes  3.55 Gbits/sec    0   8.00 MBytes   9ms     
[  5]   7.00-8.00   sec   433 MBytes  3.63 Gbits/sec    0   8.00 MBytes   10ms     
[  5]   8.00-9.00   sec   437 MBytes  3.67 Gbits/sec    0   8.00 MBytes   9ms     
[  5]   9.00-10.00  sec   430 MBytes  3.61 Gbits/sec    0   8.00 MBytes   9ms     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.10 GBytes  3.52 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  4.10 GBytes  3.52 Gbits/sec                  receiver

cpu usage

CPU usage: 20.3% user, 31.19% sys, 48.77% idle 

PID    COMMAND          %CPU  #TH   
49183  com.apple.Virtua 166.3 19/3  
49173  limactl          100.0 16/2  
48954  socket_vmnet     64.4  5/1   
0      kernel_task      57.8  561/10
54694  iperf3-darwin    18.6  1/1   

2 vms

% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[  5] local 192.168.105.1 port 60997 connected to 192.168.105.58 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd          RTT
[  5]   0.00-1.00   sec   269 MBytes  2.26 Gbits/sec    0   8.00 MBytes   13ms     
[  5]   1.00-2.00   sec   299 MBytes  2.51 Gbits/sec    0   8.00 MBytes   14ms     
[  5]   2.00-3.00   sec   263 MBytes  2.21 Gbits/sec    0   8.00 MBytes   15ms     
[  5]   3.00-4.00   sec   296 MBytes  2.48 Gbits/sec    0   8.00 MBytes   13ms     
[  5]   4.00-5.00   sec   298 MBytes  2.50 Gbits/sec    0   8.00 MBytes   12ms     
[  5]   5.00-6.00   sec   284 MBytes  2.38 Gbits/sec    0   8.00 MBytes   13ms     
[  5]   6.00-7.00   sec   299 MBytes  2.51 Gbits/sec    0   8.00 MBytes   14ms     
[  5]   7.00-8.00   sec   298 MBytes  2.50 Gbits/sec    0   8.00 MBytes   14ms     
[  5]   8.00-9.00   sec   285 MBytes  2.39 Gbits/sec    0   8.00 MBytes   13ms     
[  5]   9.00-10.00  sec   298 MBytes  2.50 Gbits/sec    0   8.00 MBytes   12ms     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.82 GBytes  2.42 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  2.82 GBytes  2.42 Gbits/sec                  receiver

cpu usage

CPU usage: 20.84% user, 37.32% sys, 41.83% idle 

PID    COMMAND          %CPU  #TH   
49183  com.apple.Virtua 132.9 18/2  
49173  limactl          92.2  16/3  
48954  socket_vmnet     77.0  6/1   
49905  com.apple.Virtua 74.2  18/1  
49900  limactl          57.3  16/1  
0      kernel_task      41.4  561/12
54259  iperf3-darwin    22.1  1/1   

3 vms

% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[  5] local 192.168.105.1 port 61004 connected to 192.168.105.58 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd          RTT
[  5]   0.00-1.00   sec   161 MBytes  1.35 Gbits/sec    0   2.91 MBytes   21ms     
[  5]   1.00-2.00   sec   138 MBytes  1.16 Gbits/sec    0   3.05 MBytes   17ms     
[  5]   2.00-3.00   sec   143 MBytes  1.20 Gbits/sec    0   3.15 MBytes   44ms     
[  5]   3.00-4.00   sec   139 MBytes  1.17 Gbits/sec    0   3.24 MBytes   19ms     
[  5]   4.00-5.00   sec   138 MBytes  1.16 Gbits/sec    0   3.30 MBytes   25ms     
[  5]   5.00-6.00   sec   144 MBytes  1.21 Gbits/sec    0   3.34 MBytes   22ms     
[  5]   6.00-7.00   sec   154 MBytes  1.29 Gbits/sec    0   3.37 MBytes   23ms     
[  5]   7.00-8.00   sec   145 MBytes  1.21 Gbits/sec    0   3.38 MBytes   15ms     
[  5]   8.00-9.00   sec   142 MBytes  1.19 Gbits/sec    0   3.39 MBytes   17ms     
[  5]   9.00-10.00  sec   154 MBytes  1.29 Gbits/sec    0   3.39 MBytes   23ms     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.42 GBytes  1.22 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  1.42 GBytes  1.22 Gbits/sec                  receiver

cpu usage

CPU usage: 24.13% user, 57.13% sys, 18.72% idle 

PID    COMMAND          %CPU  #TH   
49183  com.apple.Virtua 145.8 18/2  
49173  limactl          120.5 15/2  
48954  socket_vmnet     99.8  7/2   
49905  com.apple.Virtua 82.9  18/1  
50380  com.apple.Virtua 82.1  18/1  
50375  limactl          63.4  16/1  
49900  limactl          61.7  16/1  
0      kernel_task      43.4  561/11
53677  iperf3-darwin    15.2  1/1   

4 vms

% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[  5] local 192.168.105.1 port 61014 connected to 192.168.105.58 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd          RTT
[  5]   0.00-1.00   sec  99.8 MBytes   837 Mbits/sec    0   2.90 MBytes   26ms     
[  5]   1.00-2.00   sec  98.3 MBytes   824 Mbits/sec    0   2.53 MBytes   25ms     
[  5]   2.00-3.00   sec  98.2 MBytes   823 Mbits/sec    0   3.03 MBytes   69ms     
[  5]   3.00-4.00   sec  99.7 MBytes   836 Mbits/sec    0   3.04 MBytes   30ms     
[  5]   4.00-5.00   sec   103 MBytes   860 Mbits/sec    0   3.03 MBytes   22ms     
[  5]   5.00-6.00   sec  91.2 MBytes   765 Mbits/sec    0   3.03 MBytes   27ms     
[  5]   6.00-7.00   sec   100 MBytes   842 Mbits/sec    0   3.03 MBytes   61ms     
[  5]   7.00-8.00   sec   102 MBytes   858 Mbits/sec    0   3.04 MBytes   33ms     
[  5]   8.00-9.00   sec  98.2 MBytes   823 Mbits/sec    0   3.04 MBytes   31ms     
[  5]   9.00-10.00  sec   103 MBytes   862 Mbits/sec    0   3.04 MBytes   28ms     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   993 MBytes   833 Mbits/sec    0             sender
[  5]   0.00-10.02  sec   991 MBytes   830 Mbits/sec                  receiver

cpu usage

CPU usage: 25.28% user, 67.77% sys, 6.93% idle 

PID    COMMAND          %CPU  #TH   
49183  com.apple.Virtua 136.9 18/2  
49173  limactl          121.4 15/2  
48954  socket_vmnet     106.6 8/1   
50380  com.apple.Virtua 83.5  18/2  
50731  com.apple.Virtua 81.0  18/1  
49905  com.apple.Virtua 77.4  18/2  
50375  limactl          67.1  16/1  
50726  limactl          65.6  16/1  
49900  limactl          62.9  16/1  
0      kernel_task      39.1  561/10
53126  iperf3-darwin    13.7  1     

nirs avatar Oct 05 '24 17:10 nirs

Yes, this is a long-standing TODO https://github.com/lima-vm/socket_vmnet/blob/0b6aed916e194309bfc3f1245003a5fdc3438848/main.c#L531-L562

AkihiroSuda avatar Oct 15 '24 13:10 AkihiroSuda

@nirs can you point to the code where the copy in limactl occurs? I don't understand why there are so many copies.

tamird avatar Nov 18 '24 16:11 tamird

The pipeline

Lima:

kernel <-vmnet-> socket_vmnet <-unixstream-> lima <-unixgram-> vz service <-virtio-> guest

QEMU:

kernel <-vmnet-> socket_vmnet <-unixstream-> qemu <-virtio-> guest

Receiving a packet from a vm

This happens in the thread forwarding packets from the client socket fd: https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L467

For each packet we read: https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L492

We send the packet to vmnet interface (copy 1): https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L518

and all other sockets (N-1 copies): https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L548
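
The forwarding described in these steps can be sketched like this (illustrative code, not the actual socket_vmnet implementation; `forward_from_vm` is a made-up name):

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Sketch of the flooding behavior described above (illustrative, not
 * the actual socket_vmnet code): each packet read from one client is
 * written to the vmnet interface and to every other connected client,
 * regardless of the destination MAC address. */
static void forward_from_vm(int src_fd, const uint8_t *pkt, size_t len,
                            const int *client_fds, int n_clients) {
  /* copy 1 goes to the vmnet interface (vmnet write omitted here) */
  for (int i = 0; i < n_clients; i++) {
    if (client_fds[i] == src_fd)
      continue;                      /* never echo back to the sender */
    write(client_fds[i], pkt, len);  /* N-1 more copies, mostly wasted */
  }
}
```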

Receiving packet from vmnet

This happens in the vmnet handler block, called when some packets are ready on the vmnet interface: https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L283

We read multiple packets (up to 32 packets per call): https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L152

For each packet we iterate over all connections and write the packet to each connection (N copies): https://github.com/lima-vm/socket_vmnet/blob/f486d475d4842bbddfe8f66ba09f7d1cb10cfbed/main.c#L191
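
This path can be sketched as follows (illustrative, not the real code; `forward_from_vmnet` is a made-up name):

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Sketch of the vmnet handler path (illustrative): packets arrive in
 * batches (socket_vmnet reads up to 32 per callback) and each packet
 * is written to every connected client, i.e. N copies per packet. */
enum { MAX_BATCH = 32 };

static void forward_from_vmnet(const uint8_t *pkts[], const size_t lens[],
                               int n_pkts, const int *client_fds,
                               int n_clients) {
  if (n_pkts > MAX_BATCH)
    n_pkts = MAX_BATCH;
  for (int p = 0; p < n_pkts; p++)
    for (int i = 0; i < n_clients; i++)
      write(client_fds[i], pkts[p], lens[p]); /* flooded to all clients */
}
```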

Additional copies in lima

Each packet read from VZ is copied to the socket_vmnet socket via a socketpair: https://github.com/lima-vm/lima/blob/1f0113c2b0ecd5b21a5c84f60cb83a09ffab0dee/pkg/vz/network_darwin.go#L68

Each packet read from socket_vmnet is copied to VZ via a socketpair: https://github.com/lima-vm/lima/blob/1f0113c2b0ecd5b21a5c84f60cb83a09ffab0dee/pkg/vz/network_darwin.go#L75

This is done for every VM using lima:shared, lima:bridged, or socket networking, regardless of the actual packet destination.
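
The relay amounts to the following pattern (a sketch only: `relay_one` is not lima's API, and lima is written in Go; this just shows the copy pattern in C):

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Minimal sketch of the per-packet relay lima performs between the
 * socket_vmnet unix stream socket and the vz datagram socket: every
 * packet crosses the user/kernel boundary twice more, once on read
 * and once on write. */
static ssize_t relay_one(int from_fd, int to_fd, uint8_t *buf, size_t len) {
  ssize_t n = read(from_fd, buf, len); /* kernel -> userspace copy */
  if (n <= 0)
    return n;
  return write(to_fd, buf, (size_t)n); /* userspace -> kernel copy */
}
```

Because this runs per packet and per VM, it multiplies the cost of the flooding described above rather than just adding a constant overhead.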

nirs avatar Nov 18 '24 17:11 nirs