xdp-tutorial
af_xdp sending with run to completion mode
How can we send (Tx) packets using AF_XDP in run-to-completion mode? All the examples I have seen use sendto(), which does not give the best performance on a multi-core system. Is there an example of Tx using the run-to-completion model?
The system I am using does not send packets unless kick_tx() (the sendto() call into the kernel) is called. I haven't set any need_wakeup flag, yet the ring still reports that a wakeup is required. How can I resolve this?
There is no thread that does Tx in the SKB path of Linux. Only Rx is performed in a thread. If you are interested in zero-copy mode, then Rx and Tx will run in the same thread. The sendto() call is only there to make sure that this thread is woken up, but if you know you have Rx traffic, then there is no need doing this, though you will not send anything until you receive something.
If you want the best possible performance but do not care about wasting cores, you can disable the need_wakeup mode. Without this mode, the Rx and Tx thread starts busy-spinning and goes on forever. The only thing you need is either one sendto() call or one received packet and it will start. Make sure the app and the interrupt (not that you will get any except for the first packet) are mapped to different cores.
How can I disable need_wakeup mode? Is the mode on by default? I haven't set any flag for it.
xsk_ring_prod__needs_wakeup is returning true. Is that something we can change? I have read in a doc that Tx in XDP_COPY mode requires sendto() every time, and I am running in XDP_COPY mode as well. @magnus-karlsson
@Adarsh97 You can do this when creating the XSK (AF_XDP socket), example:
struct xsk_socket_config cfg;
auto xsk = new xsk_socket_info;
struct xsk_ring_cons *rxr;
struct xsk_ring_prod *txr;
int ret;
if (!xsk)
    exit_with_error(errno);
xsk->umem = umem;
cfg.rx_size = NUM_FRAMES;
cfg.tx_size = NUM_FRAMES;
cfg.libxdp_flags = XSK_LIBXDP_FLAGS__INHIBIT_PROG_LOAD;
if (opt_attach_mode == XDP_MODE_SKB)
    cfg.xdp_flags = XDP_FLAGS_SKB_MODE;
else
    cfg.xdp_flags = XDP_FLAGS_DRV_MODE;
cfg.bind_flags = opt_xdp_bind_flags;
Are you seeing this last line (cfg.bind_flags) ?
By default, in the example files it is defined like this:
static u32 opt_xdp_bind_flags = XDP_USE_NEED_WAKEUP;
Just remove it from your program, or better, if you are using the xdpsock example, pass the -m flag when starting up.
The -m flag (according to the xdpsock example code) indicates that wakeup should not be used, and clears the XDP_USE_NEED_WAKEUP flag.
So, to answer your question:
Is the mode on by default?
No, it is not on by default; you need to set it explicitly. In the example files, however, it is already set explicitly.
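To make the effect of -m concrete, here is a minimal sketch of toggling the need_wakeup bit in the bind flags before they are copied into cfg.bind_flags. The helper name set_need_wakeup is hypothetical; the flag constants come from <linux/if_xdp.h>, with a fallback definition in case the installed kernel headers predate the flag.

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/if_xdp.h>

#ifndef XDP_USE_NEED_WAKEUP
#define XDP_USE_NEED_WAKEUP (1 << 3) /* fallback for older kernel headers */
#endif

/* Return bind_flags with the need_wakeup bit set or cleared.
 * Clearing it corresponds to what xdpsock does for the -m option. */
static uint16_t set_need_wakeup(uint16_t bind_flags, bool enable)
{
    if (enable)
        return bind_flags | XDP_USE_NEED_WAKEUP;
    return bind_flags & (uint16_t)~XDP_USE_NEED_WAKEUP;
}
```

A caller would then write something like cfg.bind_flags = set_need_wakeup(opt_xdp_bind_flags, false); before creating the socket.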
Hi, actually I am not using the opt_need_wakeup flag. My code snippet is attached below.
`copy_mode_flg = 0; // copy_mode
xdp_mode_flg = 3;  // skb_mode

int main()
{
    cfg.xdp_flags = 0;
    cfg.xsk_bind_flags = 0;

    cfg.xdp_flags |= XDP_FLAGS_SKB_MODE; /* Set flag */
    cfg.xsk_bind_flags |= XDP_COPY;

    if (copy_mode_flg == 1) { // zero_copy
        cfg.xsk_bind_flags |= XDP_ZEROCOPY;
    } else {
        cfg.xsk_bind_flags |= XDP_COPY;
    }

    if (xdp_mode_flg == 1) { // hw_offload
        cfg.xdp_flags |= XDP_FLAGS_HW_MODE;
    } else if (xdp_mode_flg == 2) { // drive mode
        cfg.xdp_flags |= XDP_FLAGS_DRV_MODE; /* Set flag */
    } else if (xdp_mode_flg == 3) { // generic mode
        cfg.xdp_flags |= XDP_FLAGS_SKB_MODE;
    }

    bpf_obj = __load_bpf_and_xdp_attach(&cfg);
    xsk_configure_socket(umem, rx, tx, 0, &cfg);
}

static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem, bool rx, bool tx,
                                                    int queueId, struct config *config_ptr)
{
    struct xsk_socket_config cfg;
    struct xsk_socket_info *xsk;
    struct xsk_ring_cons *rxr;
    struct xsk_ring_prod *txr;
    int ret;
    static u32 prog_id;

    xsk = calloc(1, sizeof(*xsk));
    if (!xsk)
        return NULL;
    //exit_with_error(errno);
    config_ptr->xsk_if_queue = queueId;
    xsk->umem = umem;
    cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS;
    cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS;
    cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
    cfg.xdp_flags = config_ptr->xdp_flags;
    cfg.bind_flags = config_ptr->xsk_bind_flags;
    rxr = rx ? &xsk->rx : NULL;
    txr = tx ? &xsk->tx : NULL;
    ret = xsk_socket__create(&xsk->xsk, config_ptr->ifname, config_ptr->xsk_if_queue,
                             umem->umem, rxr, txr, &cfg);
}`
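One thing worth noting about the snippet above: XDP_FLAGS_SKB_MODE and XDP_COPY are ORed in before the mode checks, so choosing driver mode or zero-copy would leave two flags set at once. With the posted values (copy mode, SKB mode) this is harmless, but as far as I know the kernel rejects such combinations. A minimal sketch that picks exactly one flag per setting is shown below; the enum and function names are hypothetical, only the flag constants come from the kernel headers.

```c
#include <stdint.h>
#include <linux/if_link.h>
#include <linux/if_xdp.h>

/* Hypothetical selectors, introduced only for this example. */
enum attach_mode { ATTACH_AUTO, ATTACH_SKB, ATTACH_DRV, ATTACH_HW };
enum copy_mode { COPY_AUTO, COPY_FORCED, ZEROCOPY_FORCED };

/* Exactly one attach-mode bit, or 0 to let the kernel choose. */
static uint32_t xdp_flags_for(enum attach_mode mode)
{
    switch (mode) {
    case ATTACH_SKB: return XDP_FLAGS_SKB_MODE;
    case ATTACH_DRV: return XDP_FLAGS_DRV_MODE;
    case ATTACH_HW:  return XDP_FLAGS_HW_MODE;
    default:         return 0;
    }
}

/* XDP_COPY or XDP_ZEROCOPY, never both; 0 leaves the choice open. */
static uint16_t bind_flags_for(enum copy_mode mode)
{
    switch (mode) {
    case COPY_FORCED:     return XDP_COPY;
    case ZEROCOPY_FORCED: return XDP_ZEROCOPY;
    default:              return 0;
    }
}
```

The results would be assigned to cfg.xdp_flags and cfg.bind_flags respectively before calling xsk_socket__create().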
Kindly check the code above and let me know if I am doing anything wrong, because my observations are:
- xsk_ring_prod__needs_wakeup is returning true on my TX ring
- without kick_tx() (sendto), the kernel does not send the packets.
Here it worked normally, I didn't need to call kick_tx to send the packets.
Please check with search (ctrl + f) if there is any part of the code that is setting XDP_USE_NEED_WAKEUP
My answers/questions:
1: If you are not using need_wakeup, you should ignore this flag. It is hardcoded to 1 when the mode is not in use, so that applications that rely on it still work as expected.
2: What mode are you running in and what NIC are you on?
Hi, I have modified the code as follows: cfg.xdp_flags=0; cfg.xsk_bind_flags=0;
I am not setting any flag specifically (I assume there is an automatic fallback if the highest mode is not supported). I ignored the need_wakeup flag, and what I observed is that sometimes the kernel does not send the packets (if I am not using sendto()).
so the sample code looks like this,
while (1) { xdp_send(); sleep(); }
xdp_send() { cfg.xdp_flags = 0; cfg.xsk_bind_flags = 0; /* calls the tx_only function, without sendto() */ }
Observations are:
- sometimes all the packets are sent
- sometimes no packets are sent
- sometimes packets start being sent and then stop partway through
The temporary workaround I found is to call sendto() at an interval, and then it works. But I know that is not optimal. So what am I doing wrong here?
The info of the system is as follows, OS : RHEL 8.4 Kernel : 4.18 Driver : i40e Fibre port
You will get the default config parameters if you enter NULL as your cfg. If you specify any flags, then there is no fallback. What you specify is what you get. Please upgrade to a new kernel like 6.7 from kernel.org and try again. 4.18 plus whatever patches and backports RedHat puts on top of the kernel is really old and largely unknown in its content. Who knows if it works.
You can verify whether the kernel is constantly polling for packets without syscalls and interrupts by running top and checking that the correct ksoftirqd thread is at 100% load. You could also use the xdpsock sample with the -m option and see if you get the desired behavior with that.
@magnus-karlsson Please correct me if I'm wrong
There are some exceptions, for example, in the Linux source code itself, there is an observation saying that if the driver does not implement the functionality, you must always use poll or sendto.
Look:
/* Tx needs to be explicitly woken up the first time. Also
* for supporting drivers that do not implement this
* feature. They will always have to call sendto() or poll().
*/
https://github.com/torvalds/linux/blob/cf1182944c7cc9f1c21a8a44e0d29abe12527412/net/xdp/xsk_buff_pool.c#L194
I believe that virtio_net does not implement this functionality, according to PATCH (https://www.mail-archive.com/[email protected]/msg406405.html) we can see that the final version has not yet been released.
And also in this patch, we see that a zero copy version of XDP will be implemented for the virtio-net driver
That said, @Adarsh97 is your network driver virtio_net?
Hi, @incapdns I am using i40e driver.
More specifically OS : RHEL 8.4 Kernel : 4.18 Driver : i40e Fibre port (with sfp module)
I can't tell if the i40e driver is up to date. Have you tried updating the kernel to the latest version?
@magnus-karlsson Please correct me if I'm wrong
There are some exceptions, for example, in the Linux source code itself, there is an observation saying that if the driver does not implement the functionality, you must always use poll or sendto.
Look:
/* Tx needs to be explicitly woken up the first time. Also * for supporting drivers that do not implement this * feature. They will always have to call sendto() or poll(). */
If you would like to write a program that works in any mode, then the statement above is correct. You always have to call sendto() (or poll()) as the sending is not threaded in skb mode. If you, however, know you are in zero-copy mode, you can "cheat" and only call sendto() once as the softirq thread spins forever. Well, at least for real physical drivers.
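The rule described above can be written down as a small decision function. This is only a sketch of the policy as stated in this thread, not code from the kernel or libxdp: the name tx_needs_kick and its parameters are hypothetical. In practice, zero_copy_active could be determined with getsockopt(fd, SOL_XDP, XDP_OPTIONS, ...) and ring_needs_wakeup would be the result of xsk_ring_prod__needs_wakeup().

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/if_xdp.h>

#ifndef XDP_USE_NEED_WAKEUP
#define XDP_USE_NEED_WAKEUP (1 << 3) /* fallback for older kernel headers */
#endif

/* Decide whether the application should call sendto()/poll() after
 * placing descriptors on the Tx ring. */
static bool tx_needs_kick(uint16_t bind_flags, bool zero_copy_active,
                          bool ring_needs_wakeup, bool already_kicked)
{
    /* Copy/SKB mode: Tx is not threaded, it only runs inside the syscall. */
    if (!zero_copy_active)
        return true;

    /* Zero-copy with need_wakeup: the ring flag is meaningful. */
    if (bind_flags & XDP_USE_NEED_WAKEUP)
        return ring_needs_wakeup;

    /* Zero-copy without need_wakeup: the ring flag is hardcoded to 1 and
     * is ignored here; one initial kick starts the busy-spinning softirq. */
    return !already_kicked;
}
```

The last branch is the "cheat" mentioned above and, as noted, is only expected to hold for real physical drivers.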
Hi, I am trying a newer kernel version, but I am getting this error: <bpf/xsk.h>: No such file or directory. When I checked the bpf folder, I could not find this file (I installed libbpf-dev with sudo apt-get install). Has it been removed in later versions? The kernel version I am trying is 6.5.0-17-generic (Ubuntu 23.10).
Plz try these commands:
apt-get update -y
apt-get install libxdp-dev -y
I have tried that.
I am still getting errors like:
undefined reference to `xsk_socket__fd'
undefined reference to `xsk_umem__delete'
I assume these are related to xsk.h.
I am still unable to locate the xsk.h file.
Plz try these commands:
apt install clang llvm libelf-dev libpcap-dev build-essential libc6-dev-i386
Note "-dev"
And when linking the .o files, don't forget to pass -lbpf and -lxdp.
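Some background on the missing header: xsk.h was removed from libbpf in its 1.0 release, and the AF_XDP helpers (xsk_socket__fd, xsk_umem__delete and so on) now live in libxdp, whose header is <xdp/xsk.h>. The sketch below picks whichever header is installed using the compiler's __has_include feature (available in current GCC and Clang); the XSK_HEADER macro and the xsk_header_in_use function are hypothetical names used only for illustration.

```c
/* Assumed build line for a file named app.c: cc app.c -lxdp -lbpf */
#if defined(__has_include)
#  if __has_include(<xdp/xsk.h>)
#    include <xdp/xsk.h>          /* libxdp: current home of the xsk_* API */
#    define XSK_HEADER "xdp/xsk.h"
#  elif __has_include(<bpf/xsk.h>)
#    include <bpf/xsk.h>          /* libbpf releases before 1.0 */
#    define XSK_HEADER "bpf/xsk.h"
#  endif
#endif

#ifndef XSK_HEADER
#  define XSK_HEADER "none"       /* neither libxdp-dev nor an old libbpf-dev found */
#endif

/* Report which header was picked up at compile time. */
static const char *xsk_header_in_use(void)
{
    return XSK_HEADER;
}
```

If the function reports "none", the development package for libxdp is most likely not installed; if the header is found but the linker still reports undefined references to xsk_* symbols, the -lxdp option is probably missing from the link step.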
@Adarsh97 Did it work out?