dbus-sensors

Intermittent hangs observed with io_uring on kernel 5.4

Open vsytch opened this issue 2 years ago • 4 comments

Since the switch to io_uring, we've observed intermittent hangs of dbus-sensors daemons on BMCs running a 5.4 kernel. The hang always happens inside io_uring_enter(): enqueued reads do not return in any reasonable time (they actually return after exactly 5 minutes, apparently due to some sort of internal timeout), which causes the entire service to stall.

Upgrading the BMC kernel to 5.10 makes the issue disappear. Searching through io_uring lore, I found that this problem has been reported previously against 5.4 (see https://github.com/axboe/liburing/issues/205). Unfortunately, the only solution suggested there was to upgrade the kernel, which is not possible for us. It is also unclear which kernel patches would resolve the hang, so backporting a fix is not straightforward either.

Disabling io_uring support in dbus-sensors is currently non-trivial, as it requires an API change in how ASIO is used. It would be great to add a build option to specify which backend to use: epoll or io_uring. That would make it simpler to configure the daemons for different kernel versions.
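
To make the request a bit more concrete, here is a minimal sketch of what backend selection could look like. Everything in it is an assumption for illustration and not existing dbus-sensors code: the `SENSORS_USE_IO_URING` macro stands in for the proposed build option, `IoBackend`, `kernelAtLeast()` and `chooseBackend()` are hypothetical names, and the 5.10 cutoff only reflects the kernel version where the hang stopped reproducing for us, not a known fix boundary.

```cpp
#include <cstdio>
#include <string>
#include <sys/utsname.h>

// Hypothetical build option: compile with -DSENSORS_USE_IO_URING=0
// to force the epoll path regardless of the running kernel.
#ifndef SENSORS_USE_IO_URING
#define SENSORS_USE_IO_URING 1
#endif

enum class IoBackend
{
    Epoll,
    Uring
};

// Parse the leading "major.minor" of a release string such as
// "5.4.32-yocto" and compare it against the requested minimum.
// Returns false if the string cannot be parsed.
bool kernelAtLeast(const std::string& release, int major, int minor)
{
    int maj = 0;
    int min = 0;
    if (std::sscanf(release.c_str(), "%d.%d", &maj, &min) != 2)
    {
        return false;
    }
    return maj > major || (maj == major && min >= minor);
}

// Pick a backend: epoll if io_uring was disabled at build time or the
// kernel is older than 5.10 (assumed cutoff, see above), else io_uring.
IoBackend chooseBackend(const std::string& release)
{
    if (!SENSORS_USE_IO_URING)
    {
        return IoBackend::Epoll;
    }
    return kernelAtLeast(release, 5, 10) ? IoBackend::Uring
                                         : IoBackend::Epoll;
}

// Release string of the running kernel, or empty on failure.
std::string runningKernelRelease()
{
    utsname info{};
    if (uname(&info) != 0)
    {
        return {};
    }
    return info.release;
}
```

A daemon could call `chooseBackend(runningKernelRelease())` once at startup and construct its file readers accordingly. The runtime check is optional; a build option alone would already cover our case, since the kernel version of a given BMC image is known at build time.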

vsytch, Sep 12 '22 21:09