reactor: io_uring: enable some optimization flags

Open avikivity opened this issue 1 year ago • 13 comments

Enable some optimization flags in an attempt to improve performance with io_uring:

IORING_SETUP_COOP_TASKRUN - prevents a completion from interrupting the reactor if it is running. Requires that the reactor issue an io_uring_enter system call in a timely fashion, but thanks to the task quota timer, we do.

IORING_SETUP_TASKRUN_FLAG - sets up a flag that notifies the reactor that the kernel has pending completions that it did not process. This allows the reactor to issue an io_uring_enter even if it has no pending submission queue entries or completion queue entries (i.e. it indicates that a third queue, in the kernel, is not empty).

IORING_SETUP_SINGLE_ISSUER - elides some locking by guaranteeing that only a single thread plays with the ring; this happens to be true for us.

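IORING_SETUP_DEFER_TASKRUN - defers completion processing until the reactor calls io_uring_enter to collect completions, so that completions are processed in batches, in an attempt to get some more performance. Requires IORING_SETUP_SINGLE_ISSUER.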
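
As a rough illustration of the IORING_SETUP_TASKRUN_FLAG description above, the decision "should this poll round call io_uring_enter?" could be sketched as follows. This is not Seastar's reactor code: `ring_state`, `needs_enter` and `can_sleep` are hypothetical names, and the bit value is copied from the kernel's `IORING_SQ_TASKRUN` definition so the sketch builds against older headers.

```cpp
#include <cstdint>

// Value of IORING_SQ_TASKRUN in <linux/io_uring.h> (Linux 5.19+), copied
// under a local, made-up name so the sketch does not need recent headers.
constexpr std::uint32_t sq_taskrun_bit = 1u << 2;

// Hypothetical snapshot of what a reactor sees at the end of a poll round.
struct ring_state {
    unsigned queued_sqes;     // submissions prepared but not yet submitted
    unsigned visible_cqes;    // completions already visible in the CQ ring
    std::uint32_t sq_flags;   // SQ ring flags word shared with the kernel
};

// True when the kernel holds completion work it has not run yet:
// the "third queue" that lives inside the kernel.
constexpr bool kernel_has_deferred_work(const ring_state& s) {
    return (s.sq_flags & sq_taskrun_bit) != 0;
}

// An io_uring_enter call is worthwhile if there is something to submit
// or if the kernel signalled deferred completion work.
constexpr bool needs_enter(const ring_state& s) {
    return s.queued_sqes > 0 || kernel_has_deferred_work(s);
}

// The ring is truly idle only when all three queues are empty.
constexpr bool can_sleep(const ring_state& s) {
    return s.queued_sqes == 0 && s.visible_cqes == 0 &&
           !kernel_has_deferred_work(s);
}
```

Without the flag, a ring with empty submission and completion queues would look idle even though the kernel still has completions to deliver; with it, `needs_enter` returns true in that case.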

These flags bump up the dependencies to Linux 6.1 and liburing 2.2. This seems worthwhile as right now io_uring lags behind linux-aio (which processes completions from interrupt context and therefore doesn't need all these optimizations).
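
To make the kernel requirement concrete, here is a small sketch of which of the four flags a given kernel accepts and which combinations are valid, based on the io_uring_setup(2) man page. It is not part of the patch: the `uring_*` constants are local copies of the kernel's values under made-up names, and `kernel_release`, `supported_setup_flags` and `is_valid_combination` are hypothetical helpers.

```cpp
#include <cstdint>

// Values copied from <linux/io_uring.h> under local, made-up names so the
// sketch builds against kernel headers older than 6.1.
constexpr std::uint32_t uring_coop_taskrun  = 1u << 8;   // IORING_SETUP_COOP_TASKRUN
constexpr std::uint32_t uring_taskrun_flag  = 1u << 9;   // IORING_SETUP_TASKRUN_FLAG
constexpr std::uint32_t uring_single_issuer = 1u << 12;  // IORING_SETUP_SINGLE_ISSUER
constexpr std::uint32_t uring_defer_taskrun = 1u << 13;  // IORING_SETUP_DEFER_TASKRUN

// Hypothetical holder for a kernel release such as 6.1.
struct kernel_release {
    int version;
    int patchlevel;
};

constexpr bool at_least(kernel_release k, int version, int patchlevel) {
    return k.version > version ||
           (k.version == version && k.patchlevel >= patchlevel);
}

// Flags that a kernel of the given release understands.
constexpr std::uint32_t supported_setup_flags(kernel_release k) {
    std::uint32_t flags = 0;
    if (at_least(k, 5, 19)) {
        flags |= uring_coop_taskrun | uring_taskrun_flag;
    }
    if (at_least(k, 6, 0)) {
        flags |= uring_single_issuer;
    }
    if (at_least(k, 6, 1)) {
        flags |= uring_defer_taskrun;
    }
    return flags;
}

// Combination rules: DEFER_TASKRUN needs SINGLE_ISSUER, and TASKRUN_FLAG
// needs either COOP_TASKRUN or DEFER_TASKRUN.
constexpr bool is_valid_combination(std::uint32_t flags) {
    if ((flags & uring_defer_taskrun) && !(flags & uring_single_issuer)) {
        return false;
    }
    if ((flags & uring_taskrun_flag) &&
        !(flags & (uring_coop_taskrun | uring_defer_taskrun))) {
        return false;
    }
    return true;
}
```

Under these assumptions, `supported_setup_flags(kernel_release{6, 1})` is the first release that yields all four flags, which matches the Linux 6.1 requirement stated above.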

After this exercise, io_uring is still slower than linux-aio.

avikivity avatar Feb 09 '24 18:02 avikivity

These flags bump up the dependencies to Linux 6.1 and liburing 2.2. This seems worthwhile as right now io_uring lags behind linux-aio (which processes completions from interrupt context and therefore doesn't need all these optimizations). However, I don't know how to specify the liburing version requirement.

Let me create a change to bump the required version to v2.2.

tchaikov avatar Feb 10 '24 16:02 tchaikov

The following patch would do the trick:

diff --git a/cmake/SeastarDependencies.cmake b/cmake/SeastarDependencies.cmake
index 6c80d0fa..9aff230b 100644
--- a/cmake/SeastarDependencies.cmake
+++ b/cmake/SeastarDependencies.cmake
@@ -133,7 +133,7 @@ macro (seastar_find_dependencies)
   seastar_set_dep_args (Protobuf REQUIRED
     VERSION 2.5.0)
   seastar_set_dep_args (LibUring
-    VERSION 2.0
+    VERSION 2.2
     OPTION ${Seastar_IO_URING})
   seastar_set_dep_args (StdAtomic REQUIRED)
   seastar_set_dep_args (hwloc

tchaikov avatar Feb 11 '24 05:02 tchaikov

Thanks, I'll update my patch. Unfortunately, I still see a 10% slowdown, so I don't think my changes did anything.

avikivity avatar Feb 11 '24 14:02 avikivity

v2: applies patch from @tchaikov to bump the uring version dependency

avikivity avatar Feb 11 '24 15:02 avikivity

Enable some optimization flags in an attempt to improve performance with io_uring: .. After this exercise, io_uring is still slower than linux-aio.

Can you share some example numbers showing how much slower io_uring still is than linux-aio, and how much your patch improved it?

nyh avatar Feb 11 '24 15:02 nyh

Client:

ab -k -c 100 -n 1000000 http://localhost:10000/

Server:

./build/release/apps/httpd/httpd --smp 1 --cpuset 0 -m 1G --reactor-backend linux-aio

Requests per second: 110300.38 [#/sec] (mean)

Server:

./build/release/apps/httpd/httpd --smp 1 --cpuset 0 -m 1G --reactor-backend io_uring

Requests per second: 96563.98 [#/sec] (mean)

Usually linux-aio and io_uring are within 10% of each other. I'm not sure whether my patch improves anything; the results are noisy.
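
For reference, the gap between the two runs quoted above can be worked out directly from the reported requests per second. This is just the arithmetic on those two numbers (a single sample each, so the noise caveat still applies); `slowdown_percent` is a name made up for this sketch.

```cpp
// Relative throughput deficit of `candidate` versus `baseline`, in percent.
constexpr double slowdown_percent(double baseline, double candidate) {
    return (baseline - candidate) / baseline * 100.0;
}

// Requests per second reported by ab for the two runs above.
constexpr double aio_rps   = 110300.38;  // --reactor-backend linux-aio
constexpr double uring_rps = 96563.98;   // --reactor-backend io_uring

// (110300.38 - 96563.98) / 110300.38 * 100, about 12.45
constexpr double uring_slowdown = slowdown_percent(aio_rps, uring_rps);
```

So in this particular pair of runs io_uring was roughly 12.5% slower than linux-aio.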

avikivity avatar Feb 11 '24 16:02 avikivity

The most I was able to see is a large difference in IPC (instructions per cycle), but I have no idea how that happens.

avikivity avatar Feb 11 '24 16:02 avikivity

I did find a difference:

[avi@avi seastar (uring-poll-first)]$ sudo perf stat -a -e irq_vectors:reschedule_entry  ./build/release/apps/httpd/httpd --smp 1 --cpuset 0 -m 1G --reactor-backend io_uring
INFO  2024-02-11 18:18:18,608 seastar - Reactor backend: io_uring
starting prometheus API server
Seastar HTTP server listening on port 10000 ...
^CStoppping HTTP server
Stoppping Prometheus server

 Performance counter stats for 'system wide':

           980,159      irq_vectors:reschedule_entry                                          

      14.433292048 seconds time elapsed

[avi@avi seastar (uring-poll-first)]$ sudo perf stat -a -e irq_vectors:reschedule_entry  ./build/release/apps/httpd/httpd --smp 1 --cpuset 0 -m 1G --reactor-backend linux-aio
INFO  2024-02-11 18:18:40,236 seastar - Reactor backend: linux-aio
starting prometheus API server
Seastar HTTP server listening on port 10000 ...
^CStoppping HTTP server
Stoppping Prometheus server

 Performance counter stats for 'system wide':

               496      irq_vectors:reschedule_entry                                          

      13.693591716 seconds time elapsed
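
To put the two counters side by side, the sketch below turns them into per-second rates. The figures are taken from the perf output above; `perf_sample` and `events_per_second` are names made up here. Since `perf stat -a` counts system-wide and the elapsed time includes the time the server sat idle before and after the load, the rates understate what happens under load.

```cpp
// One perf stat result: event count and wall-clock duration.
struct perf_sample {
    double events;   // irq_vectors:reschedule_entry count
    double seconds;  // "seconds time elapsed" reported by perf stat
};

constexpr double events_per_second(perf_sample s) {
    return s.events / s.seconds;
}

constexpr perf_sample uring_run{980159.0, 14.433292048};  // io_uring backend
constexpr perf_sample aio_run{496.0, 13.693591716};       // linux-aio backend

constexpr double uring_rate = events_per_second(uring_run);  // about 67,900 per second
constexpr double aio_rate   = events_per_second(aio_run);    // about 36 per second

// How many times more reschedule interrupts the io_uring run saw.
constexpr double count_ratio = uring_run.events / aio_run.events;  // about 1,976
```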

avikivity avatar Feb 11 '24 16:02 avikivity

I thought the patch addressed those self-interrupts, but maybe not.

avikivity avatar Feb 11 '24 16:02 avikivity

Ah, that run was without the patch.

avikivity avatar Feb 11 '24 16:02 avikivity

Unfortunately, the patch doesn't help enough; it's still slower.

avikivity avatar Feb 11 '24 16:02 avikivity

Follow-up: https://github.com/scylladb/seastar/pull/2092

avikivity avatar Feb 11 '24 16:02 avikivity

On ARM, these two patches change io_uring from ~ -12% to ~ +2%. So maybe there's hope.

avikivity avatar Feb 12 '24 15:02 avikivity