
Signal delivery during `io_uring_wait_cqe` call leads to crash

Open mirgee opened this issue 8 months ago • 0 comments

I encountered a consistent crash in RocksDB compiled with coroutines and io_uring enabled.

Any signal arriving during the blocking call to `io_uring_wait_cqe()` (which calls the `io_uring_enter` syscall under the hood) makes the call return a non-zero code (specifically 11, EAGAIN). That in turn triggers an abort because of this code in `IOStatus Poll()` in env/fs_posix.cc:

        ssize_t ret = io_uring_wait_cqe(iu, &cqe);
        if (ret) {
          // abort as it shouldn't be in indeterminate state and there is no
          // good way currently to handle this error.
          abort();
        }

This can be reliably reproduced by running a signal-based profiler or by manually sending signals to the thread running the poll loop.

Since signal-based profiling is common and returning EAGAIN or EINTR is normal syscall behavior under signal delivery, these return codes should be handled by RocksDB.
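
To illustrate why an interrupted blocking call is normal rather than exceptional, here is a minimal sketch using a plain `read()` on an empty pipe as a stand-in for `io_uring_wait_cqe()`. The function name `BlockingReadErrnoUnderSignal` and the use of `SIGALRM` are my own choices for the demo, not anything from RocksDB, and it assumes a single-threaded POSIX process:

```cpp
#include <cerrno>
#include <csignal>
#include <sys/time.h>
#include <unistd.h>

// No-op handler: its only job is to interrupt the blocking syscall.
static void OnSignal(int) {}

// Hypothetical demo helper: blocks in read() on a pipe that never receives
// data, lets a timer deliver SIGALRM, and returns the errno that read()
// reported (0 if read() unexpectedly succeeded).
int BlockingReadErrnoUnderSignal() {
  int fds[2];
  if (pipe(fds) != 0) return errno;

  struct sigaction sa = {};
  sa.sa_handler = OnSignal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;  // deliberately no SA_RESTART, so read() is not restarted
  sigaction(SIGALRM, &sa, nullptr);

  struct itimerval timer = {};
  timer.it_value.tv_usec = 50 * 1000;  // one-shot signal after ~50 ms
  setitimer(ITIMER_REAL, &timer, nullptr);

  char buf;
  const ssize_t n = read(fds[0], &buf, 1);  // blocks until the signal arrives
  const int err = (n < 0) ? errno : 0;

  close(fds[0]);
  close(fds[1]);
  return err;
}
```

In this sketch the caller sees `EINTR` even though nothing is wrong with the pipe, which is the same class of outcome that currently reaches the `abort()` in `Poll()`.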

One possible, albeit very naive, fix would be to treat this category of errors as transient and retry immediately (perhaps with a small number of retries and exponential backoff):

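        ssize_t ret;
        while (true) {
          // liburing reports failure as a negative errno value; it does not
          // set the global errno.
          ret = io_uring_wait_cqe(iu, &cqe);
          if (ret == -EAGAIN || ret == -EINTR) {
            continue;  // transient (e.g. interrupted by a signal): retry
          }
          if (ret) {
            return IOStatus::IOError("io_uring_wait_cqe failed", strerror(-ret));
          }
          break;
        }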
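
The bounded-retry variant mentioned above could look roughly like the following sketch. `RetryOnTransient`, its `max_retries` default, and the backoff limits are hypothetical names and values chosen for illustration; the only assumption about the wrapped call is that it follows the liburing convention of returning 0 on success and a negative errno value on failure:

```cpp
#include <algorithm>
#include <cerrno>
#include <chrono>
#include <thread>

// Hypothetical helper (not part of RocksDB or liburing). Calls `wait_fn`
// until it returns something other than -EINTR / -EAGAIN, or until
// `max_retries` retries have been used up. Returns the last result.
template <typename WaitFn>
int RetryOnTransient(WaitFn wait_fn, int max_retries = 8) {
  std::chrono::microseconds backoff(1);
  const std::chrono::microseconds max_backoff(1000);  // cap chosen arbitrarily

  for (int attempt = 0;; ++attempt) {
    const int ret = wait_fn();
    const bool transient = (ret == -EINTR || ret == -EAGAIN);
    if (!transient || attempt >= max_retries) {
      return ret;  // success, a hard error, or retries exhausted
    }
    // Exponential backoff before the next attempt, capped at max_backoff.
    std::this_thread::sleep_for(backoff);
    backoff = std::min(backoff * 2, max_backoff);
  }
}
```

Inside `Poll()` this might be used as `RetryOnTransient([&] { return io_uring_wait_cqe(iu, &cqe); })`, with any remaining non-zero result converted to an `IOStatus` instead of aborting. Whether a retry cap is desirable at all is part of the open question below, since a capped loop still needs a policy for the "retries exhausted" case.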

How do you think a non-zero return code, and EAGAIN / EINTR in particular, should be handled? I am happy to submit a patch once we agree on an appropriate handling strategy.

Thank you!

mirgee, May 12 '25 12:05