How to use the io_uring engine if the kernel does not support io_uring_prep_poll_multishot?
If the kernel version does not support io_uring_prep_poll_multishot, are there any alternative ways to use the io_uring engine?
io_uring's multi-shot poll is not mandatory; one-shot poll has similar performance.
Looks like you are using the add_interest call of the event engine. Can you describe your scenario?
The problem we encountered is similar to the one solved in #264. When initializing iouringEngine, we need to use io_uring_prep_poll_multishot to register the fd for cancel_wait. However, our kernel does not support io_uring_prep_poll_multishot. The previous solution of using a separate coroutine to handle cancel_wait also seems to have problems.
So I believe there is a misunderstanding here, maybe caused by the function naming?
We use cancel_wait to do cross-vCPU coordination, for example, when we want to migrate a thread from one vCPU to another.
If you need to suspend the wait_for_fd, you should use thread_interrupt. Maybe we can discuss in the Dingtalk group.
When I use workerpool's async_call function to submit an async task, it may call cancel_wait.
On the latest branch, initializing the io_uring engine requires support for io_uring_prep_poll_multishot; otherwise initialization fails, right?
Alright, I see.
Please provide your OS type, kernel version, and liburing version. If we are able to reproduce it, maybe we can add some if-else code to bypass the multi-shot call. Photon's iouring module knows exactly the current kernel version.
BTW, the init error message as well.
kernel version is 5.10, liburing version is 2.4. PhotonlibOS version is 0.8.2.
Due to the higher version of liburing, we are able to initialize the io_uring engine and successfully register multishot. However, based on our testing, we found that only a single "cancel wait" trigger was received. Upon reviewing the documentation, we discovered that if an event is triggered, the flags of the generated CQE will include IORING_CQE_F_MORE.
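For readers following along, the IORING_CQE_F_MORE check mentioned above can be sketched as below. This is a minimal illustration, not Photon's code: `kCqeFMore` and `multishot_needs_rearm` are hypothetical names, and the constant is copied from `<linux/io_uring.h>` only so the snippet compiles without liburing. As I understand the liburing documentation, a CQE from a multi-shot poll carries IORING_CQE_F_MORE while the poll stays armed; a CQE without it means the request has ended and has to be submitted again.

```cpp
#include <cstdint>

// Mirrors IORING_CQE_F_MORE from <linux/io_uring.h> so this sketch builds
// without liburing. Real code should include the kernel header instead.
constexpr std::uint32_t kCqeFMore = 1U << 1;

// Hypothetical helper: returns true when the CQE of a multi-shot poll lacks
// IORING_CQE_F_MORE, i.e. the poll is no longer armed and must be re-submitted.
constexpr bool multishot_needs_rearm(std::uint32_t cqe_flags) {
    return (cqe_flags & kCqeFMore) == 0;
}
```

A reap loop could call this with `cqe->flags` for the eventfd poll and, when it returns true, queue a fresh poll SQE before continuing.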
Is there any alternative to using io_uring_prep_poll_multishot?
I'm afraid not. This is just the bug described in https://github.com/alibaba/PhotonLibOS/pull/264/files#r1398418634.
It used to be a one-shot poll + while loop, and the fix was
Should not use one-shot poll. There is a risk of missing eventfd notification, right before starting the next round of polling.
https://github.com/alibaba/PhotonLibOS/blob/1ee55db5e788c813f10f4646181eb49345c15ea0/io/iouring-wrapper.cpp#L399-L402
https://github.com/alibaba/PhotonLibOS/blob/1ee55db5e788c813f10f4646181eb49345c15ea0/io/iouring-wrapper.cpp#L437-L442
I'm thinking about replacing eventfd_write with io_uring_prep_write + io_uring_submit. Then in the reap loop, once we confirm the CQE came from cancel_wait, we just continue.
We could even use a blocking read on the eventfd, so there would be no need to poll?
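To make the blocking-read idea concrete, here is a small sketch using plain eventfd(2) calls outside of io_uring. `notify` and `wait_notifications` are hypothetical names, not Photon APIs, and the eventfd is assumed to be created in the default (non-semaphore, blocking) mode. In that mode a read returns the accumulated counter and resets it to zero, so several writes issued before a read collapse into a single wakeup.

```cpp
#include <cstdint>
#include <sys/eventfd.h>
#include <unistd.h>

// Hypothetical helper: adds 1 to the eventfd counter. Returns true on success.
inline bool notify(int efd) {
    return eventfd_write(efd, 1) == 0;
}

// Hypothetical helper: blocks until the counter is non-zero, then returns the
// accumulated count and resets it to zero. Returns 0 if the read fails.
inline std::uint64_t wait_notifications(int efd) {
    eventfd_t count = 0;
    if (eventfd_read(efd, &count) != 0) return 0;
    return count;
}
```

For example, after `int efd = eventfd(0, 0);` and two `notify(efd)` calls, one `wait_notifications(efd)` call returns the combined count rather than blocking twice.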
Using io_uring_prep_write + io_uring_submit seems infeasible because io_uring is not thread-safe, and the thread invoking cancel_wait is different from the vCPU, so a lock would be required for protection.
However, what if we use a one-shot poll without the while loop and instead, when the reap loop receives the poll's CQE, manually re-submit a one-shot poll SQE?
example:
After testing, I found that the above approach solves my problem. However, in the current version it is still necessary to restrict the kernel version required for iouringEngine, since io_uring_prep_poll_multishot is only supported from kernel 5.13 onward.
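A version gate along those lines might look like the sketch below. It is only an illustration under stated assumptions: `KernelVersion`, `parse_kernel_release`, `at_least`, and `kernel_supports_multishot_poll` are hypothetical names rather than Photon functions, and the 5.13 threshold is the one quoted above. Vendor kernels may backport features, so a release-string check is a heuristic rather than a guarantee.

```cpp
#include <cstdio>
#include <sys/utsname.h>

struct KernelVersion {
    int major = 0;
    int minor = 0;
};

// Parses the leading "major.minor" of a release string such as
// "5.10.0-30-amd64". Returns false if the string does not start that way.
inline bool parse_kernel_release(const char* release, KernelVersion& out) {
    return std::sscanf(release, "%d.%d", &out.major, &out.minor) == 2;
}

// True if version v is greater than or equal to major.minor.
inline bool at_least(const KernelVersion& v, int major, int minor) {
    return v.major > major || (v.major == major && v.minor >= minor);
}

// Heuristic: assume multi-shot poll is usable on kernel 5.13 or newer
// (threshold taken from this thread); otherwise fall back to one-shot poll.
inline bool kernel_supports_multishot_poll() {
    utsname u{};
    KernelVersion v;
    if (uname(&u) != 0 || !parse_kernel_release(u.release, v)) return false;
    return at_least(v, 5, 13);
}
```

An engine initializer could call `kernel_supports_multishot_poll()` once and choose between the multi-shot path and the re-armed one-shot path accordingly.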
@MJY-HUST I'll make a formal patch for kernels older than 5.13, based on the code you contributed.
@MJY-HUST Were you using a customized kernel rather than the official one from the upstream vendor?
My test machine is Debian 11 with kernel 5.10.0-30-amd64, and multi-shot poll still works there.
@MJY-HUST What kind of project are you developing with Photon?
The underlying system is a database system.