dbus-sensors
dbus-sensors copied to clipboard
Intermitent hangs observed with io_uring on kernel 5.4
Since the upgrade to using io_uring, we've observed intermitent hangs of dbus-sensor daemons on BMCs running a 5.4 kernel. The hang always happens inside io_uring_enter() - enqueued reads never return (they actually return exactly after 5 min due to some sort of internal timeout), which causes the entire service to stall.
Upgrading the BMC kernel to 5.10 magically causes the above issue to dissapear. Scavenging io_uring lore, I found that this problem has been reported previously (see https://github.com/axboe/liburing/issues/205) against 5.4. Unfortunately, the only solution suggested was to upgrade the kernel, which is not possible for us. It is also unclear as to what kernel patches would be able to resolve the hang.
Disabling io_uring support in dbus-sensors is currently non-trivial, as it requires an API change with regards to ASIO usage. It would be great to add build option to specify which backend to use - epoll vs uring. This way it would be simpler to configure the daemons against different kernel versions.