
Support `io_uring`

Open rocallahan opened this issue 4 years ago • 8 comments

Here's a possible approach.

  • On io_uring_setup, create a file monitor identifying the fd as an io_uring fd.
  • When that fd is mapped, remove any MAP_FIXED flag and set the prot flags to read/write and let the syscall proceed. This returns the address of the real uring buffer. Map a same-sized area of memory for the application's use (reapplying MAP_FIXED and with the right prot flags, if necessary) and return that address to the application. rr remembers the connection between the two buffers; when the fake uring buffer is unmapped, we also have to unmap the real buffer.
  • Before and after io_uring_enter, and possibly at other times when we trap to rr, if there are submission queue entries in a fake buffer that haven't been copied to the real buffer, copy them, update the fake buffer head pointer, and record that change. Also remember any user-space memory ranges that the kernel may write to, each associated with its queue entry.
  • Before and after io_uring_enter, and possibly at other times when we trap to rr, if there are completion queue entries in the real buffer that haven't been copied to the fake buffer, copy and record them, and also record any associated user-space buffers.
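
The two copy steps can be sketched roughly as follows. This is a deliberately simplified model, not the kernel's real ring layout: `struct ring`, `struct shadow_uring`, `ring_forward`, `sync_submissions` and `sync_completions` are hypothetical names, entries are reduced to a single 64-bit value, and the memory barriers and the actual trace recording are left out.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ENTRIES 8u /* power of two, so indices can be masked */

/* Simplified model of one ring; NOT the kernel's actual layout. */
struct ring {
    uint32_t head;                /* consumer index, free-running */
    uint32_t tail;                /* producer index, free-running */
    uint64_t slots[RING_ENTRIES]; /* stand-in for real SQEs/CQEs */
};

/* Hypothetical pairing of the application-visible (fake) rings with
 * the rings the kernel actually uses (real). */
struct shadow_uring {
    struct ring fake_sq, real_sq;
    struct ring fake_cq, real_cq;
};

/* Move pending entries from src to dst while dst has room.
 * Consumes from src (advances src->head) and produces into dst
 * (advances dst->tail). Returns the number of entries moved. */
static uint32_t ring_forward(struct ring *src, struct ring *dst)
{
    uint32_t moved = 0;
    while (src->head != src->tail &&
           dst->tail - dst->head < RING_ENTRIES) {
        dst->slots[dst->tail & (RING_ENTRIES - 1)] =
            src->slots[src->head & (RING_ENTRIES - 1)];
        dst->tail++;
        src->head++; /* a recorder would log this head update */
        moved++;
    }
    return moved;
}

/* Submissions flow fake -> real (application to kernel). */
static uint32_t sync_submissions(struct shadow_uring *u)
{
    return ring_forward(&u->fake_sq, &u->real_sq);
}

/* Completions flow real -> fake (kernel to application); a recorder
 * would also log each copied entry and its associated buffers. */
static uint32_t sync_completions(struct shadow_uring *u)
{
    return ring_forward(&u->real_cq, &u->fake_cq);
}
```

In this model rr would call `sync_submissions` before letting io_uring_enter proceed and `sync_completions` afterwards, so the application only ever observes ring state that has passed through a recordable copy.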

This won't be very fast, since in many cases it will mean more io_uring_enter syscalls than without rr, and every io_uring_enter syscall will require trapping to rr (i.e. 4 context switches). But if the submission queue is large then we will batch a lot of I/O operations per trap, a bit like syscallbuf. (Trying to integrate io_uring with syscallbuf seems pointless since we get the batching effect as-is. If necessary we could make the real buffers bigger than the fake buffers.) So performance might be close to as good as one could expect.
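
As a back-of-the-envelope illustration of the batching argument, the helper below counts context switches under the assumption that each trapped io_uring_enter costs exactly the 4 context switches mentioned above and carries up to `batch` operations. `switches_for` is a hypothetical name, and the model ignores every other cost (copying, recording, extra syscalls).

```c
#include <assert.h>

#define SWITCHES_PER_TRAP 4ul /* figure quoted in the text above */

/* Context switches needed to submit `ops` operations when each
 * trapped io_uring_enter carries at most `batch` of them. */
static unsigned long switches_for(unsigned long ops, unsigned long batch)
{
    assert(batch > 0);
    unsigned long traps = (ops + batch - 1) / batch; /* ceiling division */
    return traps * SWITCHES_PER_TRAP;
}
```

Under these assumptions, 128 operations submitted one per trap cost 512 switches, while the same 128 operations in batches of 64 cost 8.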

This assumes application threads don't race with the kernel's writes to user-space I/O buffers. If we don't want to assume that, we can extend this to allocate additional scratch buffers, rewrite submission-queue entries to point to those buffers, and copy the contents of those buffers to the right place when we see new completion queue entries.
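
A minimal sketch of that scratch-buffer extension, under loud assumptions: `struct sq_entry` and `struct cq_entry` are hypothetical, stripped-down stand-ins for the real queue entries, in-flight operations are matched by `user_data` (which assumes the application keeps those values unique), and scratch space is a small fixed table rather than dynamically mapped memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_INFLIGHT 16
#define SCRATCH_SIZE 256

/* Hypothetical, simplified entries: only the fields this sketch needs. */
struct sq_entry { uint64_t user_data; void *buf; size_t len; };
struct cq_entry { uint64_t user_data; int32_t res; };

/* One in-flight redirection: where the application wanted the data,
 * and the scratch area the kernel writes to instead. */
struct redirect {
    int in_use;
    uint64_t user_data;
    void *app_buf;
    size_t len;
    unsigned char scratch[SCRATCH_SIZE];
};

static struct redirect table[MAX_INFLIGHT];

/* Rewrite a submission entry to point at scratch memory.
 * Returns 0 on success, -1 if the request is too large or no slot is free. */
static int redirect_to_scratch(struct sq_entry *e)
{
    if (e->len > SCRATCH_SIZE)
        return -1;
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        if (!table[i].in_use) {
            table[i].in_use = 1;
            table[i].user_data = e->user_data;
            table[i].app_buf = e->buf;
            table[i].len = e->len;
            e->buf = table[i].scratch; /* kernel now writes here */
            return 0;
        }
    }
    return -1;
}

/* On a new completion entry, copy the result bytes to the buffer the
 * application originally supplied. Returns bytes copied, or -1 if the
 * completion does not match a redirected submission. */
static long complete_from_scratch(const struct cq_entry *c)
{
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        if (table[i].in_use && table[i].user_data == c->user_data) {
            size_t n = c->res > 0 ? (size_t)c->res : 0;
            if (n > table[i].len)
                n = table[i].len; /* never overrun the app buffer */
            memcpy(table[i].app_buf, table[i].scratch, n);
            table[i].in_use = 0;
            return (long)n;
        }
    }
    return -1;
}
```

With this arrangement the application's buffer changes only at the point where rr processes the completion entry, so the write lands at a well-defined, recordable moment instead of racing with application threads.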

rocallahan • Jun 29 '20 02:06