liburing
io_uring_wait_cqe_timeout() passes a __kernel_timespec to IORING_OP_TIMEOUT which takes a timespec64
The `io_uring_wait_cqe_timeout()` function takes a pointer to a `__kernel_timespec` which holds the value of the timeout. If the `IORING_FEAT_EXT_ARG` feature is not set, this function falls back to being implemented in terms of `__io_uring_submit_timeout()`, which tries to submit an `IORING_OP_TIMEOUT` request initialized by the `io_uring_prep_timeout()` function. This in turn sets the `addr` field of the SQE to point to the `__kernel_timespec` structure.
However, the documentation for the `IORING_OP_TIMEOUT` op-code says that the `addr` field must point to a `timespec64` structure.
Looking at the Linux headers that define these types I find:

```c
// include/uapi/linux/time_types.h
struct __kernel_timespec {
	__kernel_time64_t tv_sec;  // __kernel_time64_t is defined as 'long long'
	long long tv_nsec;
};
```
whereas I see that:

```c
// include/linux/time64.h
struct timespec64 {
	time64_t tv_sec;  // time64_t is __s64, which is either 'long' or 'long long' depending on arch
	long tv_nsec;
};
```
On some 32-bit platforms, however, the `long` type used by `timespec64` is only 32 bits while the `long long` type used by `__kernel_timespec` is 64 bits, meaning that the layouts of these structures can differ.
I gather it "just works" on 64-bit platforms (where `long` has the same layout as `long long`), and also on little-endian 32-bit platforms, because the `__kernel_timespec` `tv_nsec` field usually only holds a value in the range [0..999'999'999] and thus only populates the lower 32 bits of the 64-bit value; a 32-bit read from the `timespec64` `tv_nsec` field then picks up the right value.
However, I think on 32-bit big-endian architectures this could lead to the `tv_nsec` part being ignored, as a read of the `timespec64` 32-bit `tv_nsec` field would be reading the higher 32 bits of the 64-bit `__kernel_timespec` `tv_nsec` field, which is pretty much always going to be zero.
This could potentially lead to code that, say, passes a timeout of 500ms to `io_uring_wait_cqe_timeout()` having this interpreted as a 0ms timeout, always completing immediately and potentially leading to high CPU usage.
I'm not sure how to address this inconsistency, though, as the code-path that passes an `io_uring_getevents_arg` structure to `io_uring_enter2()` still needs to pass the address of a `__kernel_timespec` structure, so it is difficult to fix this API to be correct for both paths. Ideally both code-paths would be able to use the same time structure.
Would it be possible to have the submit/wait timeout functions "rewrite" the passed `__kernel_timespec` structure to hold a `timespec64` value, and then "restore" the original `__kernel_timespec` value before returning, if the `IORING_OP_TIMEOUT` code-path is taken and liburing detects that the layouts of `timespec64` and `__kernel_timespec` differ?