
advanced03-AF_XDP: How to explain this packet loss?

Open simonhf opened this issue 4 years ago • 16 comments

After experiencing AF_XDP packet loss myself, and reading about others experiencing packet loss too [1], I tried to reproduce the packet loss with barely modified XDP tutorial code.

Below I reproduce the packet loss issue (on my box) using a PCAP file I have lying around. I modify the advanced03-AF_XDP tutorial code to redirect all packets to the AF_XDP userland code, which is modified to only display the RX packet count every second. Then I use tcpreplay to inject packets. Mysteriously, the packet loss only seems to occur if I use two or more concurrent tcpreplay processes.

This snippet shows the following:

  • A fresh veth is created.
  • af_xdp_user is started for 7 seconds with perf monitoring XDP tracepoints.
  • Two tcpreplay processes are started.
  • After the tcpreplay, ifconfig shows how many packets were sent.

Each tcpreplay process sends 2,000,000 packets, and ifconfig confirms that 4,000,000 packets were sent in total.

af_xdp_user shows that only 3,999,279 packets were received.

perf shows that xdp:xdp_redirect_map_err experiences small but continual packet loss, i.e. no UMEM frames available.

Is this simply the case that the tutorial default of 4,096 frames is not enough?

I tried increasing the number of frames from 4,096 to 131,072... which times 4 KB is 512 MB of UMEM in total. But it seems to make no difference to the packet loss. Any ideas?

[1] https://stackoverflow.com/questions/60694304/af-xdp-socket-vs-linux-sockets-why-does-my-af-xdp-socket-lose-packets-whereas-a

$ git clone --recurse-submodules https://github.com/xdp-project/xdp-tutorial
$ cd xdp-tutorial/advanced03-AF_XDP/

$ cp af_xdp_kern.c af_xdp_kern.c.orig
$ vim af_xdp_kern.c # hack to pass all packets via AF_XDP instead of alternate packets
$ diff -u af_xdp_kern.c.orig af_xdp_kern.c
--- af_xdp_kern.c.orig  2020-03-26 17:00:57.675564970 -0700
+++ af_xdp_kern.c       2020-03-26 17:06:43.738565255 -0700
@@ -22,6 +22,7 @@
 int xdp_sock_prog(struct xdp_md *ctx)
 {
     int index = ctx->rx_queue_index;
+#if 0
     __u32 *pkt_count;

     pkt_count = bpf_map_lookup_elem(&xdp_stats_map, &index);
@@ -32,6 +33,7 @@
             return XDP_PASS;
     }

+#endif
     /* A set entry here means that the correspnding queue_id
      * has an active AF_XDP socket bound to it. */
     if (bpf_map_lookup_elem(&xsks_map, &index))

$ cp af_xdp_user.c af_xdp_user.c.orig
$ vim af_xdp_user.c # hack to show only received packets every second
$ diff -u af_xdp_user.c.orig af_xdp_user.c
--- af_xdp_user.c.orig  2020-03-26 17:16:54.556724110 -0700
+++ af_xdp_user.c       2020-03-26 17:18:28.431878556 -0700
@@ -457,6 +457,7 @@
               stats_rec->rx_bytes / 1000 , bps,
               period);

+#if 0
        packets = stats_rec->tx_packets - stats_prev->tx_packets;
        pps     = packets / period;

@@ -468,11 +469,12 @@
               period);

        printf("\n");
+#endif
 }

 static void *stats_poll(void *arg)
 {
-       unsigned int interval = 2;
+       unsigned int interval = 1;
        struct xsk_socket_info *xsk = arg;
        static struct stats_record previous_stats = { 0 };

$ make && time sudo sh -s << 'EOF' 2>&1
sudo ../testenv/testenv.sh teardown --name veth-advanced03 ; sudo ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
timeout --signal=SIGINT 7 taskset -c 1 perf stat --interval-print 1000 --all-cpus --event='xdp:xdp_redirect_map*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c "sleep 1 ; \
taskset -c 2 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 2000000 --preload-pcap /home/simon/web-browser-toolbars.trace & \
taskset -c 3 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 2000000 --preload-pcap /home/simon/web-browser-toolbars.trace & \
"
wait $PIDS ; echo "- af_xdp_user process ended"
ifconfig veth-advanced03
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
Setup environment 'veth-advanced03' with peer ip fc00:dead:cafe:1::2 and 10.11.1.2.
...
#           time             counts unit events
     1.002084389                  1      xdp:xdp_redirect_map
     1.002084389                  0      xdp:xdp_redirect_map_err
AF_XDP RX:             1 pkts (         1 pps)           0 Kbytes (     0 Mbits/s) period:1.001182
File Cache is enabled
File Cache is enabled
Test start: 2020-03-27 15:09:48.820607 ...
Test start: 2020-03-27 15:09:48.844891 ...
     2.005563513          1,183,305      xdp:xdp_redirect_map
     2.005563513                  0      xdp:xdp_redirect_map_err
AF_XDP RX:       1219628 pkts (   1218970 pps)     1139298 Kbytes (  9109 Mbits/s) period:1.000539
     3.008055425          1,317,431      xdp:xdp_redirect_map
     3.008055425                219      xdp:xdp_redirect_map_err
AF_XDP RX:       2536784 pkts (   1316584 pps)     2368894 Kbytes (  9832 Mbits/s) period:1.000434
     4.009678941          1,319,637      xdp:xdp_redirect_map
     4.009678941                139      xdp:xdp_redirect_map_err
AF_XDP RX:       3849988 pkts (   1312237 pps)     3595255 Kbytes (  9804 Mbits/s) period:1.000737
Test complete: 2020-03-27 15:09:51.867778
Actual: 2000000 packets (1867492361 bytes) sent in 3.02 seconds
Rated: 617965705.1 Bps, 4943.72 Mbps, 661813.36 pps
Statistics for network device: veth0
        Successful packets:        2000000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
Test complete: 2020-03-27 15:09:51.903016
Actual: 2000000 packets (1867492361 bytes) sent in 3.08 seconds
Rated: 605935224.2 Bps, 4847.48 Mbps, 648929.26 pps
Statistics for network device: veth0
        Successful packets:        2000000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
     5.012111289            178,905      xdp:xdp_redirect_map
     5.012111289                  0      xdp:xdp_redirect_map_err
AF_XDP RX:       3999279 pkts (    149234 pps)     3734454 Kbytes (  1113 Mbits/s) period:1.000384
     6.014457491                  0      xdp:xdp_redirect_map
     6.014457491                  0      xdp:xdp_redirect_map_err
AF_XDP RX:       3999279 pkts (         0 pps)     3734454 Kbytes (     0 Mbits/s) period:1.000360
     6.954011130                  0      xdp:xdp_redirect_map
     6.954011130                  0      xdp:xdp_redirect_map_err
INFO: xdp_link_detach() removed XDP prog ID:451 on ifindex:221
     6.979134520                  0      xdp:xdp_redirect_map
     6.979134520                  0      xdp:xdp_redirect_map_err
- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::9cde:63ff:fec7:5856  prefixlen 64  scopeid 0x20<link>
        ether 9e:de:63:c7:58:56  txqueuelen 1000  (Ethernet)
        RX packets 4000006  bytes 3734985366 (3.7 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 31  bytes 4700 (4.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
real    0m7.379s

simonhf avatar Mar 27 '20 22:03 simonhf

@magnus-karlsson @bjoto any thoughts? :)

tohojo avatar Mar 28 '20 19:03 tohojo

When you get packet loss, it is usually because the driver has not gotten enough buffers to work with from user space. Try increasing the fill ring size and be more aggressive in putting buffers in this ring. Fill it up to its maximum. No reason to be shy :-).

magnus-karlsson avatar Mar 30 '20 06:03 magnus-karlsson

Is this simply the case that the tutorial default of 4,096 frames is not enough?

I tried increasing the number of frames from 4,096 to 131,072... which times 4KB is 512 MB UMEM in total. But seems to make no difference to the packet loss. Any ideas?

Increasing the memory for packet buffers does not change the number of descriptors available for the driver to fill. You might also want to increase the *__DEFAULT_NUM_DESCS values and see if it gets better.

chaudron avatar Mar 30 '20 10:03 chaudron

@magnus-karlsson Thanks very much for commenting!

"When you get packet loss, it is usually because the driver has not gotten enough buffers to work with from user space."

In this case, tcpreplay is sending the packets to 'veth-advanced03', and at the end of the reproduction steps above, ifconfig reports that all packets were received by veth-advanced03 and that there were zero "RX dropped". If the driver receive buffer were too small, wouldn't it be this ifconfig "RX dropped" counter which gets incremented?

Instead of the ifconfig "RX dropped" counter getting incremented, the above reproduction steps show the 'xdp:xdp_redirect_map_err' counter getting incremented. Under which circumstances does this counter get incremented? I always assumed it was when there is no UMEM frame for XDP to copy into? But I'm guessing there must be other circumstances too?

And also, if the "driver has not gotten enough buffers" then why would this result in the 'xdp:xdp_redirect_map_err' counter getting incremented? And wouldn't the driver itself have its own counter to increment? And if that counter is not the ifconfig "RX dropped" counter, then where can the counter be found?

simonhf avatar Mar 30 '20 17:03 simonhf

@chaudron Thanks very much for commenting!

$ cat ../libbpf/src/xsk.h | egrep --context 3 DEFAULT_NUM_DESCS
LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);

#define XSK_RING_CONS__DEFAULT_NUM_DESCS      2048
#define XSK_RING_PROD__DEFAULT_NUM_DESCS      2048
#define XSK_UMEM__DEFAULT_FRAME_SHIFT    12 /* 4096 bytes */
#define XSK_UMEM__DEFAULT_FRAME_SIZE     (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
#define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0

So I found two *__DEFAULT_NUM_DESCS values in xsk.h in libbpf. Is the only way to change them to hack the hard-coded values directly in libbpf's xsk.h? Or can I just ignore those values and use different values in af_xdp_user.c? And do you know what the max value would be?

I tried doubling the value from 2,048 to 4,096 and, interestingly, the xdp:xdp_redirect_map_err counter then stayed at zero.

BUT... I noticed that there seemed to be two problems in the reproduction steps above: 1. the non-zero xdp:xdp_redirect_map_err counts, and 2. the non-zero xdp:xdp_redirect_map_err counts do not add up to the total number of missing packets.

The math: ifconfig says that 4,000,006 packets were received. af_xdp_user says that 3,999,279 packets were received. That is 727 missing packets. However, the perf xdp:xdp_redirect_map_err events only account for 219 + 139 = 358 packets. How to account for the remaining 727 - 358 = 369 packets? Which counter might be counting those? @magnus-karlsson ?

Likewise, if I -- as per your suggestion -- compile and run with the 4,096 sized *___DEFAULT_NUM_DESCS values in xsk.h in libbpf, then I get the following output below. ifconfig says it received 4,000,007 packets but af_xdp_user says it only received 3,999,639 packets. xdp:xdp_redirect_map_err shows no non-zero values, but still 368 packets are missing...

$ make clean ; make && time sudo sh -s << 'EOF' 2>&1
sudo ../testenv/testenv.sh teardown --name veth-advanced03 ; sudo ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
timeout --signal=SIGINT 7 taskset -c 1 perf stat --interval-print 1000 --all-cpus --event='xdp:xdp_redirect_map*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c "sleep 1 ; \
taskset -c 2 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 2000000 --preload-pcap /home/simon/web-browser-toolbars.trace & \
taskset -c 3 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 2000000 --preload-pcap /home/simon/web-browser-toolbars.trace & \
"
wait $PIDS ; echo "- af_xdp_user process ended"
ifconfig veth-advanced03 ; ethtool -S veth-advanced03
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
#           time             counts unit events
     1.001704336                  1      xdp:xdp_redirect_map                                        
     1.001704336                  0      xdp:xdp_redirect_map_err                                    
File Cache is enabled
File Cache is enabled
Test start: 2020-03-30 12:45:41.372698 ...
Test start: 2020-03-30 12:45:41.404251 ...
AF_XDP RX:         35788 pkts (     35757 pps)       33847 Kbytes (   271 Mbits/s) period:1.000868
     2.004805352          1,212,234      xdp:xdp_redirect_map                                        
     2.004805352                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:       1379527 pkts (   1343038 pps)     1288472 Kbytes ( 10032 Mbits/s) period:1.000522
     3.007149079          1,350,473      xdp:xdp_redirect_map                                        
     3.007149079                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:       2721044 pkts (   1341102 pps)     2540617 Kbytes ( 10014 Mbits/s) period:1.000309
     4.009439933          1,305,987      xdp:xdp_redirect_map                                        
     4.009439933                  0      xdp:xdp_redirect_map_err                                    
Test complete: 2020-03-30 12:45:44.391110
Actual: 2000000 packets (1867492361 bytes) sent in 3.01 seconds
Rated: 618784745.1 Bps, 4950.27 Mbps, 662690.52 pps
Statistics for network device: veth0
        Successful packets:        2000000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
Test complete: 2020-03-30 12:45:44.433027
Actual: 2000000 packets (1867492361 bytes) sent in 3.02 seconds
Rated: 616741202.4 Bps, 4933.92 Mbps, 660501.98 pps
Statistics for network device: veth0
        Successful packets:        2000000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:       3999639 pkts (   1278177 pps)     3734831 Kbytes (  9551 Mbits/s) period:1.000327
     5.011754843            130,944      xdp:xdp_redirect_map                                        
     5.011754843                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:       3999639 pkts (         0 pps)     3734831 Kbytes (     0 Mbits/s) period:1.000329
     6.014121650                  0      xdp:xdp_redirect_map                                        
     6.014121650                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:       3999639 pkts (         0 pps)     3734831 Kbytes (     0 Mbits/s) period:1.000322
     6.956703411                  0      xdp:xdp_redirect_map                                        
     6.956703411                  0      xdp:xdp_redirect_map_err                                    
INFO: xdp_link_detach() removed XDP prog ID:591 on ifindex:271
     6.993699159                  0      xdp:xdp_redirect_map                                        
     6.993699159                  0      xdp:xdp_redirect_map_err                                    
- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::c0b3:c5ff:fe40:a3a8  prefixlen 64  scopeid 0x20<link>
        ether c2:b3:c5:40:a3:a8  txqueuelen 1000  (Ethernet)
        RX packets 4000007  bytes 3734985456 (3.7 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 31  bytes 5468 (5.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

NIC statistics:
     peer_ifindex: 270
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_xdp_drops: 0

real    0m7.396s

simonhf avatar Mar 30 '20 20:03 simonhf

On Mon, Mar 30, 2020 at 10:01 PM Simon Hardy-Francis < [email protected]> wrote:


So I found two *___DEFAULT_NUM_DESCS values in xsk.h in libbpf. Is the only way to change them to hack the hard-coded value directly in libbpf in xsk.h ? Or can I just ignore those values and use different values in af_xdp_user.c ? And do you know what the max value would be?

No reason to hack the xsk.h file. Fill in the config struct that can be supplied with the __create() calls. The size has to be a power of 2 and the max value is large and architecture dependent.
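To make that concrete, something along these lines should work (a sketch only; the struct fields and macros are from libbpf's xsk.h, but the 8192 sizes are illustrative, and the umem/xsk wrapper structs are the tutorial's own):

```c
/* Sketch: pass explicit ring sizes via the config structs instead of
 * editing xsk.h. Sizes must be powers of 2; 8192 is an example. */
struct xsk_umem_config umem_cfg = {
    .fill_size = 8192, /* instead of XSK_RING_PROD__DEFAULT_NUM_DESCS (2048) */
    .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
    .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE,
    .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
};
ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq,
                       &umem_cfg);

struct xsk_socket_config xsk_cfg = {
    .rx_size = 8192, /* instead of XSK_RING_CONS__DEFAULT_NUM_DESCS (2048) */
    .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
    .libbpf_flags = 0,
    .xdp_flags = cfg->xdp_flags,
    .bind_flags = cfg->xsk_bind_flags,
};
ret = xsk_socket__create(&xsk->xsk, cfg->ifname, cfg->xsk_if_queue,
                         umem->umem, &xsk->rx, &xsk->tx, &xsk_cfg);
```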

I tried doubling the value from 2,048 to 4,096 and, interestingly, the xdp:xdp_redirect_map_err counter then stayed at zero.

Perfect!

BUT... I noticed that there seemed to be two problems in the reproduction steps above: 1. the non-zero xdp:xdp_redirect_map_err counts, and 2. the non-zero xdp:xdp_redirect_map_err counts do not add up to the total missing packets.

The math: ifconfig says that 4,000,006 packets were received. af_xdp_user says that 3,999,279 packets were received. That is 727 missing packets. However, perf xdp:xdp_redirect_map_err only accounts for 219 + 139 = 358 packets. How to account for the remaining 727 - 358 = 369 packets? Which counter might be counting those? @magnus-karlsson https://github.com/magnus-karlsson ?

Good question, but unfortunately I do not know the answer. Can veth drop packets and account for them in some other way?

I will add a new work item in my backlog to expose some generic stat (using the existing getsockopt) around missed allocations due to an empty fill ring. If we had that, you would have seen the problem right away. It is clearly missing.

/Magnus



magnus-karlsson avatar Mar 31 '20 05:03 magnus-karlsson

I wrote the original stackoverflow-question and also this kernel-mail question: https://marc.info/?l=xdp-newbies&m=158472686211835&w=2

This is my xdp_create_socket-function:

	struct xsk_socket_config xsk_socket_cfg;
 	struct xsk_socket_info *xsk;
	struct xsk_ring_cons *rxr;
	struct xsk_ring_prod *txr;
	int ret;

	xsk = calloc(1, sizeof(*xsk));
	if (!xsk) {
		fprintf(stderr, "xsk `calloc` failed: %s\n", strerror(errno));
		exit(1);
	}

	xsk->umem = umem;
	xsk_socket_cfg.rx_size = XSK_CONS_AMOUNT;
	xsk_socket_cfg.tx_size = XSK_PROD_AMOUNT;
	xsk_socket_cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;

	xsk_socket_cfg.xdp_flags = cfg->xdp_flags | XDP_FLAGS_DRV_MODE;
	xsk_socket_cfg.bind_flags = cfg->xsk_bind_flags | XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY;

	rxr = rx ? &xsk->rx : NULL;
	txr = tx ? &xsk->tx : NULL;
	ret = xsk_socket__create(&xsk->xsk, cfg->ifname_buf, cfg->xsk_if_queue, umem->umem, rxr, txr, &xsk_socket_cfg);
	if (ret) {
		fprintf(stderr, "`xsk_socket__create` returned error: %s\n", strerror(errno));
		exit(-ret);
	}

	return xsk;

and xsk_configure_umem:

	struct xsk_umem_info *umem;
	struct xsk_umem_config cfg = {
		.fill_size = XSK_PROD_AMOUNT,
		.comp_size = XSK_CONS_AMOUNT,
		.frame_size = FRAME_SIZE,
		.frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
		.flags = 0
	};

	umem = calloc(1, sizeof(*umem));
	if (!umem) {
		fprintf(stderr, "Error while allocating umem: %s\n", strerror(errno));
		exit(1);
	}

	const int ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq, &cfg);
	if (ret) {
		fprintf(stderr, "`xsk_umem__create` returned: %s\n", strerror(-ret));
		exit(1);
	}

	umem->buffer = buffer;
	return umem;

With

#define XSK_CONS_AMOUNT			32768
#define XSK_PROD_AMOUNT			32768

In my kernel-mail question, I observed that packets smaller than 400 bytes (372 bytes of payload) are not received by the AF_XDP kernel program. I also used nping with a rate of 250 packets per second. I am sorry, but how would 250 packets per second be a problem?

Is someone able to reproduce my example using a very basic XDP kernel program which just prints out the length of each received packet, together with nping:

nping --udp -p <port> --dest-ip <ip> --data-length 371 -c 50000000 --rate 250 -N

to ping the interface the XDP program is loaded on with UDP packets?

Bresenham avatar Mar 31 '20 13:03 Bresenham

"Is someone able to reproduce my example using a very basic XDP kernel program"

@Bresenham: Using the same steps as detailed above (i.e. the standard tutorial XDP kernel code which passes all packets to AF_XDP, and the af_xdp_user example which simply reports how many packets were read per second), and using a modified version of your nping line which works with the veth interface (I had to use the options --dest-mac and --send-eth, otherwise nping says the packets are undeliverable because the test runs everything locally via veth...), I could reproduce using the one-liner below.

Your nping line said count 50,000,000 at 250 per second... which is about 55 hours, which is too long for me... so I reduced the count to 2,500, which runs for a few seconds. You can play around with that on your own system...

In the example below, nping says it sent 2,500 packets. tcpdump says it received 2,500 nping packets. ifconfig says it received 2,510 packets. And af_xdp_user says it received 2,504 packets. Obviously there are a few extra packets floating around. But it doesn't look like hundreds of packets are going missing like in my example above in this issue.

I did notice that although the nping rate is set to 250 per second, both perf xdp and af_xdp_user only report about 210-ish per second. I've never used nping before, and although it has an impressive number of options, this rate discrepancy makes me wonder about the quality of the nping code. But maybe it's designed for higher PPS?

I also tried running nping at a higher PPS of 1M but it only managed to send about 180k PPS. So I guess it's not designed for higher PPS either. Sending at the max rate for nping also didn't result in packet loss. As a comparison, on my laptop, sending with tcpreplay and the preload option, then I get about 800k PPS... which I thought was on the slow side... so 180k PPS for nping seems really slow...

Sorry I couldn't reproduce the packet loss you described. But on the plus side, you could now compare your code with the tutorial code working with nping in order to figure out where the packet loss issue is? Or you could test out your code on a veth and see if the packet loss is somehow related to testing on a real NIC? Hope this helps!

$ make && time sudo sh -s << 'EOF' 2>&1
../testenv/testenv.sh teardown --name veth-advanced03 ; ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
timeout --signal=SIGINT 20 taskset -c 1 perf stat --interval-print 1000 --all-cpus --event='xdp:xdp_redirect_map*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c " \
tcpdump -i veth0 -w veth-advanced03.pcap & \
sleep 1 ; \
taskset -c 2 nping -v-1 --udp --dest-port 12345 --dest-ip 1.2.3.4 --dest-mac 00:01:02:03:04:05 --send-eth --data-length 371 --count 2500 --rate 250 --no-capture & \
"
wait $PIDS ; echo "- af_xdp_user process ended"
sleep 1 ; pkill tcpdump ; sleep 1 ; echo "- tcpdump captured packets from nping" ; tcpdump -n -r veth-advanced03.pcap | egrep "1.2.3.4.12345" | wc -l
ifconfig veth-advanced03 ; ethtool -S veth-advanced03
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
tcpdump: listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes

Starting Nping 0.7.80 ( https://nmap.org/nping ) at 2020-03-31 16:28 PDT
#           time             counts unit events
     1.001499226                  1      xdp:xdp_redirect_map                                        
     1.001499226                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:             3 pkts (         3 pps)           0 Kbytes (     0 Mbits/s) period:1.000325
     2.008877312                 40      xdp:xdp_redirect_map                                        
     2.008877312                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:            67 pkts (        64 pps)          26 Kbytes (     0 Mbits/s) period:1.000353
     3.012104284                211      xdp:xdp_redirect_map                                        
     3.012104284                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:           277 pkts (       210 pps)         112 Kbytes (     1 Mbits/s) period:1.000289
     4.015470826                209      xdp:xdp_redirect_map                                        
     4.015470826                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:           486 pkts (       209 pps)         199 Kbytes (     1 Mbits/s) period:1.000324
     5.017740755                212      xdp:xdp_redirect_map                                        
     5.017740755                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:           697 pkts (       211 pps)         286 Kbytes (     1 Mbits/s) period:1.000339
     6.020110484                212      xdp:xdp_redirect_map                                        
     6.020110484                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:           910 pkts (       213 pps)         374 Kbytes (     1 Mbits/s) period:1.000356
     7.022473657                211      xdp:xdp_redirect_map                                        
     7.022473657                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          1120 pkts (       210 pps)         461 Kbytes (     1 Mbits/s) period:1.000326
     8.024771194                212      xdp:xdp_redirect_map                                        
     8.024771194                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          1331 pkts (       211 pps)         548 Kbytes (     1 Mbits/s) period:1.000332
     9.027071800                210      xdp:xdp_redirect_map                                        
     9.027071800                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          1540 pkts (       209 pps)         634 Kbytes (     1 Mbits/s) period:1.000310
    10.029381957                210      xdp:xdp_redirect_map                                        
    10.029381957                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          1750 pkts (       210 pps)         721 Kbytes (     1 Mbits/s) period:1.000298
    11.031720519                211      xdp:xdp_redirect_map                                        
    11.031720519                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          1960 pkts (       210 pps)         808 Kbytes (     1 Mbits/s) period:1.000340
    12.034996404                210      xdp:xdp_redirect_map                                        
    12.034996404                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2171 pkts (       211 pps)         895 Kbytes (     1 Mbits/s) period:1.000390
    13.037346539                210      xdp:xdp_redirect_map                                        
    13.037346539                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2380 pkts (       209 pps)         981 Kbytes (     1 Mbits/s) period:1.000293
Max rtt: N/A | Min rtt: N/A | Avg rtt: N/A
Raw packets sent: 2500 (1.032MB) | Rcvd: 0 (0B) | Lost: 2500 (100.00%)
Nping done: 1 IP address pinged in 12.76 seconds
    14.039874547                145      xdp:xdp_redirect_map                                        
    14.039874547                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2504 pkts (       124 pps)        1032 Kbytes (     0 Mbits/s) period:1.000520
    15.042196027                  0      xdp:xdp_redirect_map                                        
    15.042196027                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2504 pkts (         0 pps)        1032 Kbytes (     0 Mbits/s) period:1.000324
    16.044502744                  0      xdp:xdp_redirect_map                                        
    16.044502744                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2504 pkts (         0 pps)        1032 Kbytes (     0 Mbits/s) period:1.000327
    17.047826959                  0      xdp:xdp_redirect_map                                        
    17.047826959                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2504 pkts (         0 pps)        1032 Kbytes (     0 Mbits/s) period:1.000328
    18.050143899                  0      xdp:xdp_redirect_map                                        
    18.050143899                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2504 pkts (         0 pps)        1032 Kbytes (     0 Mbits/s) period:1.000283
    19.052431510                  0      xdp:xdp_redirect_map                                        
    19.052431510                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          2504 pkts (         0 pps)        1032 Kbytes (     0 Mbits/s) period:1.000371
    19.954678525                  0      xdp:xdp_redirect_map                                        
    19.954678525                  0      xdp:xdp_redirect_map_err                                    
INFO: xdp_link_detach() removed XDP prog ID:799 on ifindex:359
    19.986299157                  0      xdp:xdp_redirect_map                                        
    19.986299157                  0      xdp:xdp_redirect_map_err                                    
- af_xdp_user process ended
2526 packets captured
2526 packets received by filter
0 packets dropped by kernel
- tcpdump captured packets from nping
reading from file veth-advanced03.pcap, link-type EN10MB (Ethernet)
2500
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::d414:ff:fe96:db2b  prefixlen 64  scopeid 0x20<link>
        ether d6:14:00:96:db:2b  txqueuelen 1000  (Ethernet)
        RX packets 2510  bytes 1033360 (1.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 33  bytes 5854 (5.8 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

NIC statistics:
     peer_ifindex: 358
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_xdp_drops: 0

real    0m22.409s

simonhf avatar Mar 31 '20 23:03 simonhf

@magnus-karlsson thanks for the comments!

Good question, but unfortunately I do not know the answer. Can veth drop packets and account for them in some way? In any case:

I managed to simplify the packet loss example using veth down to sending just 8,000 packets.

In the example below 8,000 packets are sent but af_xdp_user only says it received 7,644 packets.

I increased the veth txqueuelen from the default 1,000 to 10,000... but I am not sure whether it is actually used here or makes a difference.

Both ifconfig and tcpdump see the 8,000 packets from tcpreplay.

I have searched high and low for tracepoints for veth which might show packet loss but cannot find anything.

It's interesting that the number of packets lost seems to be similar whether I send 8,000 or 4M packets... does that indicate some kind of initialization bug, where the packet loss only happens very early on?

If I reduce the number of packets further, the packet loss drops from 350-ish to 200-ish...

@magnus-karlsson Any ideas what else I can try? How would you debug this further?

$ make && time sudo sh -s << 'EOF' 2>&1
../testenv/testenv.sh teardown --name veth-advanced03 ; ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
ip link set veth-advanced03 txqueuelen 10000
timeout --signal=SIGINT 3 taskset -c 1 perf stat --interval-print 500 --all-cpus --event='xdp:xdp_redirect_map*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c "
tcpdump -i veth0 -w veth-advanced03.pcap & \
sleep 1 ; \
taskset -c 2 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 4000 --preload-pcap /home/simon/web-browser-toolbars.trace & \
taskset -c 3 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 4000 --preload-pcap /home/simon/web-browser-toolbars.trace & \
"
wait $PIDS ; echo "- af_xdp_user process ended"
ifconfig veth-advanced03 ; ethtool -S veth-advanced03
sleep 1 ; pkill tcpdump ; sleep 1 ; echo "- tcpdump captured packets" ; tcpdump -n -r veth-advanced03.pcap | wc -l
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
tcpdump: listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes
#           time             counts unit events
     0.502066753                  0      xdp:xdp_redirect_map                                        
     0.502066753                  0      xdp:xdp_redirect_map_err                                    
     1.004742302                  0      xdp:xdp_redirect_map                                        
     1.004742302                  0      xdp:xdp_redirect_map_err                                    
File Cache is enabled
File Cache is enabled
Test start: 2020-03-31 18:19:42.551781 ...
Test start: 2020-03-31 18:19:42.559246 ...
Test complete: 2020-03-31 18:19:42.562535
Actual: 4000 packets (3675306 bytes) sent in 0.010754 seconds
Rated: 341761763.0 Bps, 2734.09 Mbps, 371954.62 pps
Statistics for network device: veth0
        Successful packets:        4000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
Test complete: 2020-03-31 18:19:42.568619
Actual: 4000 packets (3675306 bytes) sent in 0.009373 seconds
Rated: 392116291.4 Bps, 3136.93 Mbps, 426757.70 pps
Statistics for network device: veth0
        Successful packets:        4000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:          7644 pkts (      7639 pps)        7199 Kbytes (    58 Mbits/s) period:1.000631
     1.507363877              7,644      xdp:xdp_redirect_map                                        
     1.507363877                  0      xdp:xdp_redirect_map_err                                    
     2.009021565                  0      xdp:xdp_redirect_map                                        
     2.009021565                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:          7644 pkts (         0 pps)        7199 Kbytes (     0 Mbits/s) period:1.000248
     2.511711925                  0      xdp:xdp_redirect_map                                        
     2.511711925                  0      xdp:xdp_redirect_map_err                                    
     2.953628049                  0      xdp:xdp_redirect_map                                        
     2.953628049                  0      xdp:xdp_redirect_map_err                                    
INFO: xdp_link_detach() removed XDP prog ID:959 on ifindex:439
     2.987439874                  0      xdp:xdp_redirect_map                                        
     2.987439874                  0      xdp:xdp_redirect_map_err                                    
- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::6015:9fff:fe78:af23  prefixlen 64  scopeid 0x20<link>
        ether 62:15:9f:78:af:23  txqueuelen 10000  (Ethernet)
        RX packets 8006  bytes 7351256 (7.3 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 27  bytes 4654 (4.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

NIC statistics:
     peer_ifindex: 438
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_xdp_drops: 0
8019 packets captured
8019 packets received by filter
0 packets dropped by kernel
- tcpdump captured packets
reading from file veth-advanced03.pcap, link-type EN10MB (Ethernet)
8019

real    0m5.410s

simonhf avatar Apr 01 '20 01:04 simonhf

Thank you so much, Simon. I tracked the problem down to my packet parsing. Can someone take a quick look and tell me what is wrong with it?

SEC("xdp_sock")
int xdp_sock_prog(struct xdp_md *ctx) {

    const void *data = (void*)(unsigned long)ctx->data;
    const void *data_end = (void*)(unsigned long)ctx->data_end;

    const struct ethhdr *eth = (struct ethhdr*)(data);

    const int len = data_end - data;

    bpf_printk("STAGE 0 LEN: %d\n", len);

    if(eth + sizeof(struct ethhdr) <= data_end) {

        bpf_printk("\t->STAGE 1\n");

        if(bpf_ntohs(eth->h_proto) == ETH_P_IP) {
            const struct iphdr *iph = (struct iphdr*)(data + sizeof(struct ethhdr));

            const int remaining = data_end - (void*)(iph + sizeof(struct iphdr));
            bpf_printk("\t->STAGE 2 LEN IPHDR: %d\n", remaining);

            if(iph + sizeof(struct iphdr) <= data_end) {
                
                bpf_printk("\t->STAGE 3\n");

With nping --data-length 371, the variable remaining drops to -1, which makes the following if false and therefore keeps the packet from being passed to user space. How can that be, though? Not enough space for the IP header?

Bresenham avatar Apr 01 '20 08:04 Bresenham

Maximilian Gaul [email protected] writes:

Thank you so much simon. I tracked down the problem to my packet-parsing. Can someone just take a quick glimpse and tell me what is wrong with this parsing?

[... quoted packet-parsing snippet snipped; identical to the code in the previous comment ...]

Because for nping --data-length 371, the variable remaining drops to -1? How can that be?

I think it's this line:

        const int remaining = data_end - (void*)(iph + sizeof(struct iphdr));

The cast to void is outside the parentheses, so you're doing pointer arithmetic with an 'iph'-sized stride. I.e., you'll be advancing it 400 bytes (20 * 20) instead of 20.

So change that to either

const int remaining = data_end - ((void *)iph + sizeof(struct iphdr));

or just

const int remaining = data_end - (void*)(iph + 1);

tohojo avatar Apr 01 '20 09:04 tohojo

Oh my god! Thank you!

Bresenham avatar Apr 01 '20 09:04 Bresenham

Maximilian Gaul [email protected] writes:

Oh my god! Thank you!

You're welcome :)

tohojo avatar Apr 01 '20 09:04 tohojo

@magnus-karlsson I tried out some more things:

Even concurrently tcpreplaying the first 150 packets of two pcap files provokes the bug. Please see below.

I also used unique pcap files for each tcpreplay session in case using the same pcap file caused some kind of pathological condition. But whether the pcap files are different or the same, I get the same results.

In the example below, 2x 150 packets are tcpreplayed but af_xdp_user only sees 180 of them, meaning that 300 - 180 = 120 packets go missing between veth and XDP. But ifconfig and tcpdump see all of them in the veth namespace.

In addition, after tcpreplaying the two sets of 150 packets, 100k packets from a single tcpreplay are injected, followed by another two sets of 150 packets. So 100,600 packets are expected in total, and that's what both ifconfig and tcpdump report. However, af_xdp_user always fails to see hundreds of packets whenever concurrent tcpreplay sessions are started, regardless of how many packets have been seen before.

Since fewer than 300 packets are enough to trigger this, it seems like it must be some kind of concurrency bug, no? Looking forward to your thoughts on how to debug this further.

$ make && time sudo sh -s << 'EOF' 2>&1
../testenv/testenv.sh teardown --name veth-advanced03 ; ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
ip link set veth-advanced03 txqueuelen 10000
timeout --signal=SIGINT 8 taskset -c 1 perf stat --interval-print 500 --all-cpus --event='xdp:xdp_redirect_map*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c "
taskset -c 2 tcpdump -i veth0 -w veth-advanced03.pcap & \
sleep 1 ; \
taskset -c 3 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 150 --preload-pcap ./test.trace1 & \
taskset -c 4 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 150 --preload-pcap ./test.trace2 & \
sleep 2 ; \
taskset -c 3 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 100000 --preload-pcap ./test.trace3 & \
sleep 2 ; \
taskset -c 3 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 150 --preload-pcap ./test.trace4 & \
taskset -c 4 tcpreplay -i veth0 --topspeed --stats=10 --loop 0 --limit 150 --preload-pcap ./test.trace5 & \
"
wait $PIDS ; echo "- af_xdp_user process ended"
ifconfig veth-advanced03 ; ethtool -S veth-advanced03
sleep 1 ; pkill tcpdump ; sleep 1 ; echo "- tcpdump captured packets" ; tcpdump -n -r veth-advanced03.pcap | wc -l
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
tcpdump: listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes
- af_xdp_user entering while loop for read
#           time             counts unit events
     0.501447814                  1      xdp:xdp_redirect_map                                        
     0.501447814                  0      xdp:xdp_redirect_map_err                                    
     1.005998560                  0      xdp:xdp_redirect_map                                        
     1.005998560                  0      xdp:xdp_redirect_map_err                                    
File Cache is enabled
Test start: 2020-04-01 14:18:57.351094 ...
Test complete: 2020-04-01 14:18:57.351401
Actual: 150 packets (152056 bytes) sent in 0.000307 seconds
Rated: 495296416.9 Bps, 3962.37 Mbps, 488599.34 pps
Statistics for network device: veth0
        Successful packets:        150
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
File Cache is enabled
Test start: 2020-04-01 14:18:57.375903 ...
Test complete: 2020-04-01 14:18:57.376475
Actual: 150 packets (158994 bytes) sent in 0.000572 seconds
Rated: 277961538.4 Bps, 2223.69 Mbps, 262237.76 pps
Statistics for network device: veth0
        Successful packets:        150
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:           180 pkts (       180 pps)         189 Kbytes (     2 Mbits/s) period:1.001102
     1.510657935                181      xdp:xdp_redirect_map                                        
     1.510657935                  0      xdp:xdp_redirect_map_err                                    
     2.013492422                  0      xdp:xdp_redirect_map                                        
     2.013492422                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:           182 pkts (         2 pps)         190 Kbytes (     0 Mbits/s) period:1.000410
     2.517108882                  0      xdp:xdp_redirect_map                                        
     2.517108882                  0      xdp:xdp_redirect_map_err                                    
     3.019483118                  0      xdp:xdp_redirect_map                                        
     3.019483118                  0      xdp:xdp_redirect_map_err                                    
File Cache is enabled
Test start: 2020-04-01 14:18:59.363262 ...
AF_XDP RX:         35420 pkts (     35198 pps)       35881 Kbytes (   285 Mbits/s) period:1.001123
Test complete: 2020-04-01 14:18:59.565788
Actual: 100000 packets (101274431 bytes) sent in 0.202526 seconds
Rated: 500056442.1 Bps, 4000.45 Mbps, 493763.76 pps
Statistics for network device: veth0
        Successful packets:        100000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
     3.522168561            100,000      xdp:xdp_redirect_map                                        
     3.522168561                  0      xdp:xdp_redirect_map_err                                    
     4.024844579                  0      xdp:xdp_redirect_map                                        
     4.024844579                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:        100182 pkts (     64739 pps)      101464 Kbytes (   524 Mbits/s) period:1.000357
     4.528459970                  0      xdp:xdp_redirect_map                                        
     4.528459970                  0      xdp:xdp_redirect_map_err                                    
     5.031112673                  0      xdp:xdp_redirect_map                                        
     5.031112673                  0      xdp:xdp_redirect_map_err                                    
File Cache is enabled
Test start: 2020-04-01 14:19:01.363636 ...
Test complete: 2020-04-01 14:19:01.363951
Actual: 150 packets (155390 bytes) sent in 0.000315 seconds
Rated: 493301587.3 Bps, 3946.41 Mbps, 476190.47 pps
Statistics for network device: veth0
        Successful packets:        150
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
File Cache is enabled
Test start: 2020-04-01 14:19:01.385617 ...
Test complete: 2020-04-01 14:19:01.386261
Actual: 150 packets (152640 bytes) sent in 0.000644 seconds
Rated: 237018633.5 Bps, 1896.14 Mbps, 232919.25 pps
Statistics for network device: veth0
        Successful packets:        150
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:        100365 pkts (       183 pps)      101650 Kbytes (     1 Mbits/s) period:1.000334
     5.533809128                183      xdp:xdp_redirect_map                                        
     5.533809128                  0      xdp:xdp_redirect_map_err                                    
     6.036487028                  0      xdp:xdp_redirect_map                                        
     6.036487028                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:        100365 pkts (         0 pps)      101650 Kbytes (     0 Mbits/s) period:1.000322
     6.539349738                  0      xdp:xdp_redirect_map                                        
     6.539349738                  0      xdp:xdp_redirect_map_err                                    
     7.041976355                  0      xdp:xdp_redirect_map                                        
     7.041976355                  0      xdp:xdp_redirect_map_err                                    
AF_XDP RX:        100365 pkts (         0 pps)      101650 Kbytes (     0 Mbits/s) period:1.000498
     7.544623892                  0      xdp:xdp_redirect_map                                        
     7.544623892                  0      xdp:xdp_redirect_map_err                                    
     7.953617169                  0      xdp:xdp_redirect_map                                        
     7.953617169                  0      xdp:xdp_redirect_map_err                                    
INFO: xdp_link_detach() removed XDP prog ID:1239 on ifindex:611
     7.995917840                  0      xdp:xdp_redirect_map                                        
     7.995917840                  0      xdp:xdp_redirect_map_err                                    
- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::90bd:54ff:fe8a:cc66  prefixlen 64  scopeid 0x20<link>
        ether 92:bd:54:8a:cc:66  txqueuelen 10000  (Ethernet)
        RX packets 100607  bytes 101894245 (101.8 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 31  bytes 5468 (5.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

NIC statistics:
     peer_ifindex: 610
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_xdp_drops: 0
100625 packets captured
100625 packets received by filter
0 packets dropped by kernel
- tcpdump captured packets
reading from file veth-advanced03.pcap, link-type EN10MB (Ethernet)
100625

real    0m10.567s
user    0m8.004s
sys     0m0.693s

simonhf avatar Apr 01 '20 21:04 simonhf

I'm reaching out to @netoptimizer, @tmakita, and @borkmann for help with the above veth and/or XDP packet loss issue [1]. How did I come to list you three?

Well, I'm running this Linux kernel:

$ uname -r
5.3.0-42-generic

I looked up the release here [2] and saw that the release date was 2019-Sep-15.

Then I looked up the last change to veth.c before that date here [3] and found your email addresses :-)

Signed-off-by: Toshiaki Makita <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

That version of veth.c [4] appears to support the three stats I also see, none of which report any drops:

$ ethtool -S veth-advanced03
NIC statistics:
     peer_ifindex: 610
     rx_queue_0_xdp_packets: 0
     rx_queue_0_xdp_bytes: 0
     rx_queue_0_xdp_drops: 0

Looking forward to your comments and hope my reproduction steps help!

[1] https://github.com/xdp-project/xdp-tutorial/issues/116#issuecomment-607501798 [2] https://github.com/torvalds/linux/releases/tag/v5.3 [3] https://github.com/torvalds/linux/commit/9cda7807ee1e25a3771b5357d9fb12991b2550f9?diff=unified [4] https://github.com/torvalds/linux/blob/9cda7807ee1e25a3771b5357d9fb12991b2550f9/drivers/net/veth.c#L43

simonhf avatar Apr 01 '20 22:04 simonhf

@simonhf Why not try XDP native mode?

xd121614 avatar Nov 28 '23 07:11 xd121614