
advanced03-AF_XDP: how to increase packet size


I successfully managed to run the advanced03 AF_XDP assignment [1].

As an example, the command below runs a ping, and while the XDP program is active, 4 of the 7 ping packets end up in user land as expected. I can also enable the every-other-packet logic, and that works too.

$ sudo timeout --signal=SIGINT 5 ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --auto-mode --force & sudo ip netns exec veth-advanced03 sh -c "sleep 1 ; ping -c 7 -W 1 fc00:dead:cafe:b::1"
PING fc00:dead:cafe:b::1(fc00:dead:cafe:b::1) 56 data bytes
AF_XDP RX:             1 pkts (         0 pps)           0 Kbytes (     0 Mbits/s) period:2.000139
       TX:             0 pkts (         0 pps)           0 Kbytes (     0 Mbits/s) period:2.000139

AF_XDP RX:             3 pkts (         1 pps)           0 Kbytes (     0 Mbits/s) period:2.000640
       TX:             0 pkts (         0 pps)           0 Kbytes (     0 Mbits/s) period:2.000640

INFO: xdp_link_detach() removed XDP prog ID:608 on ifindex:41
64 bytes from fc00:dead:cafe:b::1: icmp_seq=5 ttl=64 time=0.040 ms
64 bytes from fc00:dead:cafe:b::1: icmp_seq=6 ttl=64 time=0.044 ms
64 bytes from fc00:dead:cafe:b::1: icmp_seq=7 ttl=64 time=0.052 ms

--- fc00:dead:cafe:b::1 ping statistics ---
7 packets transmitted, 3 received, 57.1429% packet loss, time 6140ms
rtt min/avg/max/mdev = 0.040/0.045/0.052/0.005 ms

The question is about the packet size that ends up in user land. If I increase the ping packet size with -s 3000, the largest packet that ends up in user land is only 1,514 bytes, even though the UMEM frame size is 4,096 bytes.

I tried increasing the MTU for the veth as follows, but this didn't help, e.g.:

$ sudo ifconfig veth-advanced03 mtu 2000

How can I get bigger packets showing up in user land?

[1] https://github.com/xdp-project/xdp-tutorial/tree/master/advanced03-AF_XDP

simonhf avatar Mar 01 '20 01:03 simonhf

XDP doesn't support packets spanning more than one memory page. Different drivers have different exact limits; for veth there's 576 bytes of overhead, so you should be able to get packets of up to 3520 bytes (a 4,096-byte page minus the 576 bytes of overhead) through. I suspect maybe the problem is that you need to set the MTU on both sides of the veth link?

tohojo avatar Mar 02 '20 09:03 tohojo
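For reference, a minimal way to set the MTU on both ends of the veth pair, using the interface names from this thread (the peer inside the namespace is the one that is easy to miss):

$ sudo ip link set dev veth-advanced03 mtu 3500
$ sudo ip netns exec veth-advanced03 ip link set dev veth0 mtu 3500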

@tohojo Thanks for the quick response!

What's the reason for limiting packet size to a single memory page? And are there any plans in the future to remove this limit?

> I suspect maybe the problem is that you need to set the MTU on both sides of the veth link?

Thanks for the hint! I did the following to solve the issue, which may be interesting for others:

Apparently the veth peer inside the network namespace comes up with the default MTU of 1,500 bytes, which is why the ping packets ended up smaller via AF_XDP:

$ sudo ip netns exec veth-advanced03 sh -c "ifconfig -a | egrep mtu"
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
veth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500

So modifying my original command line to set the MTU (in this example to 3,500 bytes) for veth0 inside ip netns does the trick:

$ make && sudo ifconfig veth-advanced03 mtu 3500 ; sudo timeout --signal=SIGINT 5 ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --auto-mode --force & sudo ip netns exec veth-advanced03 sh -c "(ifconfig veth0 mtu 3500 ; sleep 1 ; ping -c 7 -W 1 -s 3000 10.11.11.1)" 
PING 10.11.11.1 (10.11.11.1) 3000(3028) bytes of data.
- xsk_ring_cons__rx_desc() = 3042 bytes
AF_XDP RX:             1 pkts (         0 pps)           3 Kbytes (     0 Mbits/s) period:2.000784
       TX:             0 pkts (         0 pps)           0 Kbytes (     0 Mbits/s) period:2.000784

- xsk_ring_cons__rx_desc() = 3042 bytes
- xsk_ring_cons__rx_desc() = 3042 bytes
AF_XDP RX:             3 pkts (         1 pps)           9 Kbytes (     0 Mbits/s) period:2.000655
       TX:             0 pkts (         0 pps)           0 Kbytes (     0 Mbits/s) period:2.000655

- xsk_ring_cons__rx_desc() = 3042 bytes
INFO: xdp_link_detach() removed XDP prog ID:752 on ifindex:41
3008 bytes from 10.11.11.1: icmp_seq=5 ttl=64 time=0.084 ms
3008 bytes from 10.11.11.1: icmp_seq=6 ttl=64 time=0.119 ms
3008 bytes from 10.11.11.1: icmp_seq=7 ttl=64 time=0.118 ms

--- 10.11.11.1 ping statistics ---
7 packets transmitted, 3 received, 57.1429% packet loss, time 6141ms
rtt min/avg/max/mdev = 0.084/0.107/0.119/0.016 ms

However, trying to set the MTU to 4,000 causes an error, presumably as expected, since 4,000 bytes plus the 576 bytes of veth overhead exceeds the single-memory-page limit mentioned above:

$ make && sudo ifconfig veth-advanced03 mtu 4000 ; sudo timeout --signal=SIGINT 5 ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --auto-mode --force & sudo ip netns exec veth-advanced03 sh -c "(ifconfig veth0 mtu 4000 ; sleep 1 ; ping -c 7 -W 1 -s 4000 10.11.11.1)"     
libbpf: Kernel error message: veth: Peer MTU is too large to set XDP
ERR: ifindex(41) link set xdp fd failed (34): Numerical result out of range
PING 10.11.11.1 (10.11.11.1) 4000(4028) bytes of data.
4008 bytes from 10.11.11.1: icmp_seq=1 ttl=64 time=0.193 ms
4008 bytes from 10.11.11.1: icmp_seq=2 ttl=64 time=0.058 ms
4008 bytes from 10.11.11.1: icmp_seq=3 ttl=64 time=0.140 ms
4008 bytes from 10.11.11.1: icmp_seq=4 ttl=64 time=0.160 ms
4008 bytes from 10.11.11.1: icmp_seq=5 ttl=64 time=0.145 ms
4008 bytes from 10.11.11.1: icmp_seq=6 ttl=64 time=0.081 ms
4008 bytes from 10.11.11.1: icmp_seq=7 ttl=64 time=0.123 ms

--- 10.11.11.1 ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6124ms
rtt min/avg/max/mdev = 0.058/0.128/0.193/0.042 ms

simonhf avatar Mar 02 '20 17:03 simonhf

Simon Hardy-Francis [email protected] writes:

> @tohojo Thanks for the quick response!
>
> What's the reason for limiting packet size to a single memory page? And are there any plans in the future to remove this limit?

Yes, that is being discussed upstream; it's not quite trivial to do while still maintaining the performance, but it is definitely something that will be solved eventually.

tohojo avatar Mar 02 '20 17:03 tohojo

On a related note, I noticed that the advanced03 AF_XDP tutorial divides the UMEM into single-memory-page-sized elements of 4,096 bytes on my box, matching the one-memory-page-per-packet limit you described. However, since the regular non-jumbo MTU is typically 1,500 bytes, does this mean that at least half of the UMEM allocated by a typical AF_XDP program will never be used? Or is there a way to store, e.g., 2 packets per memory page somehow?

On a related note again: if a packet must be stored in a single memory page, is it possible to configure XDP with huge pages, and would there be any benefit?

simonhf avatar Mar 02 '20 18:03 simonhf

Simon Hardy-Francis [email protected] writes:

> On a related note, I noticed that the advanced03 AF_XDP tutorial divides the UMEM into single-memory-page-sized elements of 4,096 bytes on my box, matching the one-memory-page-per-packet limit you described. However, since the regular non-jumbo MTU is typically 1,500 bytes, does this mean that at least half of the UMEM allocated by a typical AF_XDP program will never be used? Or is there a way to store, e.g., 2 packets per memory page somehow?

There are hardware drivers that do divide up the page (Intel's, I think). Not sure how that works with AF_XDP, actually (if at all). And yeah, it does waste a bit of memory, but it means the code can take a lot of shortcuts, so it is both simpler and sometimes also slightly faster. So I wouldn't worry too much about it; memory is cheap! ;)

> On a related note again: if a packet must be stored in a single memory page, is it possible to configure XDP with huge pages, and would there be any benefit?

I don't think you can, and you probably wouldn't benefit from it, no...

tohojo avatar Mar 02 '20 21:03 tohojo

We were testing half-page frames with AF_XDP on Intel hardware recently, and it seemed to just work. We made no special changes; we just passed different numbers to xsk_umem__create().

vcunat avatar Mar 13 '20 10:03 vcunat
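For reference, a minimal sketch of that approach against the libbpf xsk.h API used by the tutorial; the names and sizes here are illustrative, and note that frame_size must be a power of two unless the unaligned-chunks flag is used:

#include <stdlib.h>
#include <unistd.h>
#include <bpf/xsk.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE 2048 /* half a 4 KiB page instead of XSK_UMEM__DEFAULT_FRAME_SIZE */

static struct xsk_ring_prod fq; /* fill ring */
static struct xsk_ring_cons cq; /* completion ring */

static struct xsk_umem *create_half_page_umem(void)
{
    struct xsk_umem *umem;
    void *buffer;
    struct xsk_umem_config cfg = {
        .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
        .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
        .frame_size = FRAME_SIZE, /* the only non-default value */
        .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM,
    };

    /* the UMEM area itself still has to be page-aligned */
    if (posix_memalign(&buffer, getpagesize(), NUM_FRAMES * FRAME_SIZE))
        return NULL;

    if (xsk_umem__create(&umem, buffer, NUM_FRAMES * FRAME_SIZE, &fq, &cq, &cfg))
        return NULL;

    return umem;
}

With 2,048-byte frames the same UMEM holds twice as many packets, at the cost of not being able to receive anything larger than one frame.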

Hi @tohojo. I realize that for XDP to support jumbo frames with multi-buffer, per this design proposal, there are memory allocation and layout challenges to be addressed. Do you think support for jumbo frames still has a chance of landing in 2021? Thank you

viniarck avatar Apr 03 '21 16:04 viniarck

I think @netoptimizer or @LorenzoBianconi would know that better than me :)

tohojo avatar Apr 03 '21 18:04 tohojo

> Hi @tohojo. I realize that for XDP to support jumbo frames with multi-buffer, per this design proposal, there are memory allocation and layout challenges to be addressed. Do you think support for jumbo frames still has a chance of landing in 2021? Thank you

I'm pretty sure that support for jumbo frames with XDP landed in 5.18.

dankamongmen avatar Apr 10 '22 15:04 dankamongmen
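For reference, the 5.18 support is the XDP multi-buffer ("xdp frags") work, and programs have to opt in to it. A minimal sketch, assuming kernel 5.18+ and a libbpf recent enough to understand the xdp.frags section name (AF_XDP sockets gained multi-buffer support in a later kernel release):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* SEC("xdp.frags") makes libbpf load the program with the
   BPF_F_XDP_HAS_FRAGS flag, telling the kernel it may hand the program
   packets that span multiple buffers. */
SEC("xdp.frags")
int xdp_jumbo_ok(struct xdp_md *ctx)
{
    /* data/data_end only cover the first buffer; this helper reports
       the total packet length across all fragments. */
    __u64 len = bpf_xdp_get_buff_len(ctx);

    if (len > 4096)
        bpf_printk("multi-buffer packet: %llu bytes", len);

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";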

> > Hi @tohojo. I realize that for XDP to support jumbo frames with multi-buffer, per this design proposal, there are memory allocation and layout challenges to be addressed. Do you think support for jumbo frames still has a chance of landing in 2021? Thank you
>
> I'm pretty sure that support for jumbo frames with XDP landed in 5.18.

@dankamongmen thanks for this update, good to know.

viniarck avatar Apr 10 '22 23:04 viniarck