liburing
Improvement: Better IORING_MSG_SEND_FD interface
The current interface for sending uring-registered files between different urings is rather awkward if you plan to send and receive files from multiple urings (i.e. not in a single-producer-multi-consumer fashion): before you send the file, you need to agree with the receiver on the array index at which the file should be installed.
I propose to defer the choice of the array index in which the file will be installed to the receiver using the following API:
- The receiver submits an IORING_OP_MSG_RING with sqe->addr set to IORING_MSG_RECV_FD and sqe->file_index set to the desired array index, or to IORING_FILE_INDEX_ALLOC to let the kernel pick the index. It then waits until another uring submits an IORING_MSG_SEND_FD.
- When a sender submits an IORING_MSG_SEND_FD (which ignores sqe->file_index from now on), it waits until the corresponding receiver has submitted an IORING_MSG_RECV_FD.
- Both the sender and the receiver receive a CQE with the user data and res set to the other end's user_data and len parameters (or maybe even use the parameters of the submitted SQE of the current uring).
This API has the following advantages:
- It makes use of IORING_FILE_INDEX_ALLOC, relieving the user of manual file index management, and much more closely resembles the file descriptor interface of UNIX/POSIX.
- It maintains a nice invariant that users can rely on: for each SQE submitted, a CQE will be posted unless otherwise explicitly stated.
This API can even be extended to plain IORING_MSG_DATA to maintain the second point (one CQE for each SQE submitted).
Instead of specifying to which index you want to install a file you can pass IORING_FILE_INDEX_ALLOC and the kernel will choose an empty slot for you. Note, it's probably IORING_FILE_INDEX_ALLOC - 1 if you use helpers like io_uring_prep_msg_ring_fd. It should cover it.
Also, if you want to restrict indexes for slot auto-selection, i.e. IORING_FILE_INDEX_ALLOC, you can use IORING_REGISTER_FILE_ALLOC_RANGE or a helper around it called io_uring_register_file_alloc_range().
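To illustrate the "IORING_FILE_INDEX_ALLOC - 1" remark above, here is a minimal, self-contained sketch of the off-by-one encoding. It assumes that the helpers store the target slot as "index + 1" in sqe->file_index (with 0 meaning no fixed slot) and that IORING_FILE_INDEX_ALLOC is defined as ~0U in the uapi header. The names prefixed with sketch_ are hypothetical stand-ins written for this example, not liburing functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror IORING_FILE_INDEX_ALLOC from <linux/io_uring.h>;
 * redefined here only so the sketch compiles on its own. */
#define SKETCH_FILE_INDEX_ALLOC (~0U)

/* Hypothetical stand-in for what a prep helper does with a target slot:
 * the value written to sqe->file_index is the slot number plus one. */
static uint32_t sketch_encode_file_index(uint32_t target_slot)
{
    return target_slot + 1;
}

/* Small demonstration of the encoding. */
static void sketch_encoding_demo(void)
{
    /* Slot 0 is encoded as 1, because 0 means "no fixed slot". */
    assert(sketch_encode_file_index(0) == 1);
    /* Passing ALLOC - 1 to the helper yields ALLOC in the SQE field. */
    assert(sketch_encode_file_index(SKETCH_FILE_INDEX_ALLOC - 1)
           == SKETCH_FILE_INDEX_ALLOC);
}
```

With this encoding, passing IORING_FILE_INDEX_ALLOC itself to such a helper would wrap around to 0, which is why the helper argument has to be one less than the constant.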
The problem with using IORING_FILE_INDEX_ALLOC in the current implementation is that you can't know which slot the kernel installed the file into.
I totally forgot about this, so I would like to modify my proposed interface: rather than res carrying the other side's len, res will be set to the file index.
> The problem with using IORING_FILE_INDEX_ALLOC in the current implementation is that you can't know to which slot the kernel installed the file into.
Indeed, if that's the case then we should just return the index in cqe->res, I'll take a look
Also note that if -EOVERFLOW occurs then cqe->res should be set to the index in the sender
> Also note that if -EOVERFLOW occurs then cqe->res should be set to the index in the sender
That's what it does, but I don't think we have a test for that. It'd be a great contribution if you (or anyone else) want to submit a test case!
P.S. tests for fd passing msg-ring are in test/fd-pass.c and for normal ones in test/msg-ring.c
We queued up a patch; with it, auto allocation will return the selected index in the target's cqe->res. It'll hit upstream in some time and get backported after.
What I meant was that if the target's completion queue is full then a -EOVERFLOW CQE will be posted on the sender's end.
Instead, the res field of this CQE should be set to the index of the file, because according to this comment in msg_ring.c:
> This means that if this request completes with -EOVERFLOW, then the sender must ensure that a later IORING_OP_MSG_RING delivers the message
But in order to notify the target of the new file we need to actually know the index in order to communicate this information. I can go ahead and implement that but I need to know what strategy to use:
- (easy) -EOVERFLOW is no longer considered an error, and if io_post_aux_cqe fails in io_msg_install_complete then it returns the index of the newly installed file. This makes io_msg_ring not execute req_set_fail, which is why -EOVERFLOW is no longer an error.
- (trickier) Keep -EOVERFLOW an error. But in order to do that, io_msg_install_complete needs to communicate back additional information: both that a -EOVERFLOW error occurred AND the index of the new file, to let the user know.
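The difference between the two strategies can be sketched as a small user-space model of what the sender would observe when the target's completion queue is full. This is only an illustration under stated assumptions: the struct and both functions are hypothetical names made up for this example, they are not kernel code, and how the index would actually reach the user in the "trickier" variant is left open, as in the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the sender-side outcome; not a kernel structure. */
struct sketch_sender_view {
    int  res;    /* value that would land in the sender's cqe->res */
    bool failed; /* whether the request would be marked failed (req_set_fail) */
    int  slot;   /* slot the file was installed into on the target */
};

/* "Easy" variant: overflow while notifying the target is not treated as an
 * error, so the sender just gets the installed slot back in res. */
static struct sketch_sender_view sketch_easy_on_overflow(int slot)
{
    struct sketch_sender_view v = { .res = slot, .failed = false, .slot = slot };
    return v;
}

/* "Trickier" variant: -EOVERFLOW stays an error, so the slot has to travel
 * alongside the error code through some additional channel. */
static struct sketch_sender_view sketch_trickier_on_overflow(int slot)
{
    struct sketch_sender_view v = { .res = -EOVERFLOW, .failed = true, .slot = slot };
    return v;
}

/* Small demonstration for a file installed into slot 7. */
static void sketch_overflow_demo(void)
{
    struct sketch_sender_view easy = sketch_easy_on_overflow(7);
    struct sketch_sender_view hard = sketch_trickier_on_overflow(7);

    assert(easy.res == 7 && !easy.failed);
    assert(hard.res == -EOVERFLOW && hard.failed && hard.slot == 7);
}
```

In the model, the easy variant needs only the existing res field, while the trickier variant needs two pieces of information at once, which is the extra plumbing described in the second bullet.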