
io_submit seems to be a blocking call

Open mpilman opened this issue 6 years ago • 5 comments

We run into issues because io_submit is slow. In rare cases it can be so slow that a process gets dropped because it stops heartbeating for too long.

@kaomakino created the small test program below that reproduces the issue. The results will depend on the speed of your disk (which in itself suggests that io_submit is not truly asynchronous).

The program does the following:

  1. The main thread will write 4KB blocks into random locations within a 4GB file.
  2. A second thread will call fdatasync every 3 seconds (this roughly simulates commits in fdb).
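The attached aiotime.c is not reproduced here. As a rough illustration of the two steps above, the following is a minimal sketch, not the original program. It assumes Linux and issues the raw AIO syscalls directly, so libaio is not needed. The function name `max_submit_latency_us`, its parameters, the `O_DIRECT` fallback, and the use of `ftruncate` to size the file are all assumptions made for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define BLOCK 4096

struct syncer { int fd; long interval_ms; atomic_int stop; };

/* Second thread: call fdatasync at a fixed interval until told to stop. */
static void *sync_loop(void *arg) {
    struct syncer *s = arg;
    struct timespec ts = { s->interval_ms / 1000, (s->interval_ms % 1000) * 1000000L };
    while (!atomic_load(&s->stop)) {
        nanosleep(&ts, NULL);
        fdatasync(s->fd);
    }
    return NULL;
}

static double now_us(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

/* Submits n_writes random 4 KiB writes with at most queue_depth in flight.
 * Returns the slowest io_submit call in microseconds, or -1.0 on error.
 * Completion results are reaped but not inspected. */
double max_submit_latency_us(const char *path, uint64_t file_size,
                             int n_writes, int queue_depth, long sync_ms) {
    if (n_writes < 1 || queue_depth < 1 || sync_ms < 1 || file_size < BLOCK)
        return -1.0;
    /* Assumption: prefer O_DIRECT, fall back to buffered I/O if open rejects it. */
    int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL) fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1.0;

    void *buf = NULL;
    aio_context_t ctx = 0; /* must be zero before io_setup */
    struct io_event *events = calloc((size_t)queue_depth, sizeof *events);
    if (!events || posix_memalign(&buf, BLOCK, BLOCK) != 0 ||
        ftruncate(fd, (off_t)file_size) != 0 ||
        syscall(SYS_io_setup, (unsigned)queue_depth, &ctx) != 0) {
        free(buf); free(events); close(fd);
        return -1.0;
    }
    memset(buf, 'x', BLOCK);

    struct syncer s = { fd, sync_ms, 0 };
    pthread_t tid;
    int have_thread = pthread_create(&tid, NULL, sync_loop, &s) == 0;

    double worst = 0.0;
    int inflight = 0, ok = have_thread;
    uint64_t nblocks = file_size / BLOCK;
    for (int i = 0; ok && i < n_writes; i++) {
        if (inflight == queue_depth) { /* queue full: wait for a completion */
            long got = syscall(SYS_io_getevents, ctx, 1L, (long)queue_depth, events, NULL);
            if (got < 1) { ok = 0; break; }
            inflight -= (int)got;
        }
        struct iocb cb, *cbp = &cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = (uint32_t)fd;
        cb.aio_lio_opcode = IOCB_CMD_PWRITE;
        cb.aio_buf = (uint64_t)(uintptr_t)buf;
        cb.aio_nbytes = BLOCK;
        cb.aio_offset = (int64_t)(((uint64_t)rand() % nblocks) * BLOCK);

        double t0 = now_us();
        long r = syscall(SYS_io_submit, ctx, 1L, &cbp);
        double dt = now_us() - t0; /* time spent inside io_submit only */
        if (r != 1) { ok = 0; break; }
        inflight++;
        if (dt > worst) worst = dt;
    }

    if (have_thread) { atomic_store(&s.stop, 1); pthread_join(tid, NULL); }
    syscall(SYS_io_destroy, ctx); /* cancels or waits for requests still in flight */
    free(buf); free(events); close(fd);
    return ok ? worst : -1.0;
}
```

A call such as `max_submit_latency_us("testfile", 4ULL << 30, 100000, 64, 3000)` would approximate the setup described above; the write count and queue depth in that call are arbitrary illustrative values. On older glibc versions the program needs to be built with `-pthread`. Note that the function can take up to one sync interval to return, because it waits for the background thread to wake up and exit.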

Calls to io_submit will then take milliseconds to complete on a fast disk. On a very slow disk, one io_submit can take several seconds (we saw up to 13 seconds).

This can be made better by doing the following:

  1. Create a 4GiB file (for example with dd - content doesn't matter)
  2. Change the attached C program so that it doesn't create the file when it opens it.
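For readers who prefer to do both steps from C rather than with dd, here is a minimal sketch. The helper names `prefill_file` and `open_existing` are hypothetical and are not taken from the attached program. The idea is to write real zero-filled blocks up front and then open the file without `O_CREAT`, so the benchmark only ever overwrites existing data.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Fill `path` with `size` zero bytes, roughly what
 * `dd if=/dev/zero of=path bs=1M` does. Returns 0 on success, -1 on error. */
int prefill_file(const char *path, uint64_t size) {
    enum { CHUNK = 1 << 20 };
    char *zeros = calloc(1, CHUNK);
    if (!zeros) return -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { free(zeros); return -1; }

    int rc = 0;
    uint64_t left = size;
    while (left > 0 && rc == 0) {
        size_t want = left < CHUNK ? (size_t)left : (size_t)CHUNK;
        ssize_t n = write(fd, zeros, want);
        if (n <= 0) rc = -1;           /* treat short-circuit failures as errors */
        else left -= (uint64_t)n;      /* partial writes just loop again */
    }
    if (rc == 0 && fsync(fd) != 0) rc = -1; /* make sure the blocks reach the disk */
    if (close(fd) != 0) rc = -1;
    free(zeros);
    return rc;
}

/* Open the file for the benchmark without O_CREAT: this fails
 * (returns -1) instead of silently creating a new, empty file. */
int open_existing(const char *path) {
    return open(path, O_RDWR);
}
```

For the 4GiB file mentioned above, the call would be `prefill_file("testfile", 4ULL << 30)`, followed by `open_existing("testfile")` in place of the open call that creates the file.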

However, even then io_submit can take a very long time (~1ms on a fast disk), so this seems to be only part of the solution. Additionally, io_submit latencies seem to improve the smaller we make the write queue: with only 1 outstanding request at a time, a call seems to take 10-60 microseconds.

aiotime.c.zip

mpilman avatar Sep 27 '19 06:09 mpilman

Further testing seems to indicate that this gets better if we use XFS instead of ext4, but it is still not great.

mpilman avatar Sep 27 '19 07:09 mpilman

@mpilman have you seen https://stackoverflow.com/a/46377629 ?

sitsofe avatar Jan 18 '20 11:01 sitsofe

We saw this happen in TLogs (an 8.6s slow task), and it resulted in a recovery.

jzhou77 avatar Feb 19 '21 03:02 jzhou77

> We saw this happened in TLogs (8.6s slow task) and resulted in recovery.

Hello, how did you fix this io_submit issue?

yangliu7999 avatar Nov 16 '25 13:11 yangliu7999

We have not "solved" the issue yet. We have monitoring of slow disks and replace them if we see disk problems.

jzhou77 avatar Nov 17 '25 18:11 jzhou77