liburing giving similar throughput as select/epoll

I am currently working on this branch, adding io_uring as an option to receive data for a Linux networking benchmark tool running on Ubuntu with Linux kernel 5.13.

When I run the benchmark using liburing or select, I get similar throughput, and I am unsure why, since liburing is asynchronous. Here is the file where I am using liburing to receive data. I am not sure what I am missing to get the performance boost that io_uring offers.

Aug 18 '21 18:08 MarcoF1

I have tried using registered buffers and registered files, but they don't seem to help much with performance. I also tried enabling polling mode, but that only seems to increase CPU usage.

Aug 18 '21 18:08 MarcoF1

I have tried using registered buffers and registered files, but they don't seem to help much with performance. I also tried enabling polling mode, but that only seems to increase CPU usage.

In your code tcpstream.c, the SQPOLL and IOPOLL features are not used; you can try them.

Aug 19 '21 03:08 HuanjunXie

It seems the performance is not guaranteed yet.

Similar issue here #189

performance issue

Aug 19 '21 09:08 mohsenomidi

@mohsenomidi, I always had problems with that test. Apart from not testing anything useful, the numbers are strange, to say the least... I just ran it on my laptop, 5.14. The number of iterations is increased x100 for pipes=100; otherwise it finishes in under 1s.

# iter count = 1000000,
> taskset -c 3 ./io_uring 100
Pipes: 100
Time: 42.052680
> taskset -c 3 ./epoll 100
Pipes: 100
Time: 86.737568

# iter count = 10000,
> taskset -c 3 ./epoll 500
Pipes: 500
Time: 5.386944 # edited from 48.026645
> taskset -c 3 ./io_uring 500
Pipes: 500
Time: 2.106056

edit: for pipes=500, that's me screwing the test, so it's 5.3 vs 2.1

Aug 22 '21 11:08 isilence

p.s. the second result (42 vs 2) looks weird; maybe the test is buggy.

Aug 22 '21 11:08 isilence

I have tried using registered buffers and registered files, but they don't seem to help much with performance. I also tried enabling polling mode, but that only seems to increase CPU usage.

In your code tcpstream.c, the SQPOLL and IOPOLL features are not used; you can try them.

Sockets don't support IOPOLL. SQPOLL shouldn't be needed either, and it may complicate the code.

Aug 22 '21 11:08 isilence

I have tried using registered buffers and registered files, but they don't seem to help much with performance. I also tried enabling polling mode, but that only seems to increase CPU usage.

@MarcoF1, good work. Note that registered files are not actually used unless the request sets sqe->flags |= IOSQE_FIXED_FILE.

Aug 22 '21 11:08 isilence

@MarcoF1, it needs to be investigated, but my guess is that you don't do enough batching, i.e. submitting several requests at once. If I'm reading your code correctly, each io_uring_submit() ends up doing a syscall per request, so it won't make much of a difference compared to select/epoll.

What is the magnitude of the difference? Can you share some numbers? Also, I guess there are N parallel clients running, each using its own io_uring instance. Right?

Aug 22 '21 12:08 isilence
