Jens Axboe

Results: 413 comments by Jens Axboe

I'm surprised you see a difference between send(2) and IORING_OP_SEND for that case; have you done any profiling that could shed some light on this? Assuming your application is threaded,...
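
For context, a minimal sketch of the IORING_OP_SEND path being compared against send(2), assuming liburing >= 2.0 and an already-connected socket; `ring` and `fd` are placeholders for the application's own state:

```
#include <errno.h>
#include <liburing.h>

/* Sketch only: issue one send via IORING_OP_SEND instead of send(2). */
static int uring_send(struct io_uring *ring, int fd,
		      const void *buf, size_t len)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	struct io_uring_cqe *cqe;
	int ret;

	if (!sqe)
		return -EBUSY;
	io_uring_prep_send(sqe, fd, buf, len, 0);
	/* submit and wait for the single completion in one syscall */
	ret = io_uring_submit_and_wait(ring, 1);
	if (ret < 0)
		return ret;
	ret = io_uring_wait_cqe(ring, &cqe);
	if (ret < 0)
		return ret;
	ret = cqe->res;			/* bytes sent, or -errno */
	io_uring_cqe_seen(ring, cqe);
	return ret;
}
```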

Some opcodes are just never going to be super fast until they are converted to supporting nonblocking issue. UNLINK is one of them - it'll always return...
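
To make the punt concrete, here's a minimal sketch (assuming liburing >= 2.1 and an initialized `ring`) of an unlink issued through io_uring; since UNLINKAT has no nonblocking issue path, the completion only arrives once an io-wq worker has run the actual unlink:

```
#include <errno.h>
#include <fcntl.h>		/* AT_FDCWD */
#include <liburing.h>

static int uring_unlink(struct io_uring *ring, const char *path)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	struct io_uring_cqe *cqe;
	int ret;

	if (!sqe)
		return -EBUSY;
	io_uring_prep_unlinkat(sqe, AT_FDCWD, path, 0);
	io_uring_submit(ring);
	/* the completion is posted by the io-wq worker, not inline */
	ret = io_uring_wait_cqe(ring, &cqe);
	if (ret < 0)
		return ret;
	ret = cqe->res;		/* 0 on success, -errno on failure */
	io_uring_cqe_seen(ring, cqe);
	return ret;
}
```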

> It'd be neat if there was a "do it inline anyway" flag or perhaps that can be inferred from an iowq_max_workers = [1, 0] + submit_and_wait(queue_size) call (though a...

I can do a quick hack of that if you want to test it...
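
For reference, a sketch of the quoted iowq_max_workers idea, assuming liburing >= 2.1; the SQEs are assumed to already be prepped by the caller, and `queue_size` to match the ring size:

```
#include <liburing.h>

/* Clamp the bounded io-wq pool to one worker, then push the whole
 * batch and wait for it with a single io_uring_enter(). */
static int submit_batch(struct io_uring *ring, unsigned queue_size)
{
	/* [bounded, unbounded]; a 0 leaves that pool's limit untouched */
	unsigned int max_workers[2] = { 1, 0 };
	int ret;

	ret = io_uring_register_iowq_max_workers(ring, max_workers);
	if (ret < 0)
		return ret;
	return io_uring_submit_and_wait(ring, queue_size);
}
```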

Something like the below, totally untested...

```
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 1d59c816a5b8..539b1e3ac695 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -442,6 +442,7 @@ struct io_cqring_offsets {
 #define IORING_ENTER_SQ_WAIT	(1U
```

Easiest way to test would be to just enable it unconditionally in liburing and recompile the library and your test app. Something like this, again utterly untested:

```
diff --git...
```

You need to patch and build the kernel, I'm afraid. If you want, send me your test app and I can run it here and we can compare. That might...

```
axboe@r7525 ~/gi/fuc (io_uring_rmz)> hyperfine --warmup 3 -N --prepare "ftzz g -n 1000K /dev/shm/ftzz" "taskset -c 0 ./patchless /dev/shm/ftzz" "taskset -c 0 ./force /dev/shm/ftzz"
Benchmark 1: taskset -c 0 ./patchless...
```

Not sure how representative this is on tmpfs, as profiles show we're spending most of the time grabbing a lock for eviction:

```
+   85.36%  force  [kernel.vmlinux]  [k] queued_spin_lock_slowpath...
```

Also note that my testing was done with all kinds of security mitigations turned off; obviously the batching done here would benefit a lot more with mitigations turned ON, as the syscalls and...
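
To illustrate the batching point, a sketch in the same placeholder-heavy style as above (`paths` and `n` are assumptions, and `n` must not exceed the ring's SQ size) of how N unlinks collapse into one kernel transition, which is exactly where mitigation-inflated syscall cost gets amortized:

```
#include <fcntl.h>
#include <liburing.h>

static int unlink_batch(struct io_uring *ring, char **paths, unsigned n)
{
	struct io_uring_cqe *cqe;
	unsigned i;
	int ret;

	for (i = 0; i < n; i++) {
		struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
		if (!sqe)
			break;
		io_uring_prep_unlinkat(sqe, AT_FDCWD, paths[i], 0);
	}
	/* one io_uring_enter() for the whole batch instead of n unlink(2)s */
	ret = io_uring_submit_and_wait(ring, i);
	if (ret < 0)
		return ret;
	/* reap the completions */
	while (i--) {
		if (io_uring_wait_cqe(ring, &cqe))
			break;
		io_uring_cqe_seen(ring, cqe);
	}
	return 0;
}
```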