
Fix race condition in `fill_read_buffer`

Open · maruth-stripe opened this issue 2 years ago · 1 comment

`fill_read_buffer` has a race condition that can lead to data loss. Consider the following sequence of events:

  • init
  • Fiber 1 issues a read of 16K bytes. This invokes `fill_read_buffer` with a size of `BLOCK_SIZE` (say, 64K) bytes.
  • `@read_buffer` is currently empty, so (as an optimization) `fill_read_buffer` reads directly into it: `input_buffer` aliases `@read_buffer`, and we call `@io.read_nonblock(size, input_buffer, exception: false)`.
  • Suppose the underlying `@io` is a socket and the socket is empty. Then `@io.read_nonblock` will yield (on `stable-v1` because of the `async_send` instrumentation, and on `main` because of the `io_read` hook in `rb_fiber_scheduler_io_read_memory` in the VM), leading to `io_wait` when the read would have blocked.
  • Suppose Fiber 2 issues a read of 16K bytes. This also invokes `fill_read_buffer` with a size of `BLOCK_SIZE` (64K) bytes.
  • Suppose Fiber 2's read succeeds: it consumes 16K bytes, leaving 48K bytes in `@read_buffer`.
  • Fiber 1 resumes and retries `@io.read_nonblock(size, input_buffer, exception: false)`.
  • The read succeeds. But `read_nonblock` replaces the contents of `input_buffer`, which still aliases `@read_buffer`, losing the 48K bytes Fiber 2 left there.
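The last step hinges on how Ruby's read methods treat an output buffer: they replace its contents rather than append to them. A small demonstration, using `StringIO` and blocking `read` as a stand-in for the socket and `read_nonblock`:

```ruby
require "stringio"

io = StringIO.new("fresh data")

# Pretend this buffer still holds bytes another fiber buffered earlier.
buffer = +"48K of previously buffered bytes"

# read(length, outbuf) truncates outbuf and overwrites it with the new
# bytes; the previous contents are gone.
io.read(5, buffer)

buffer # => "fresh"
```

In the race above, `buffer` is `@read_buffer` itself, so the overwrite silently discards data that had already been received.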

This PR fixes the race condition by passing a fresh buffer to `read_nonblock` on every call.
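A minimal sketch of the fixed pattern. This is not the library's actual code: the `BufferedReader` class is hypothetical, and blocking `StringIO#read` stands in for `read_nonblock` plus the fiber scheduler. The point is only that each call allocates its own buffer and appends:

```ruby
require "stringio"

class BufferedReader
  BLOCK_SIZE = 64 * 1024

  def initialize(io)
    @io = io
    @read_buffer = String.new(encoding: Encoding::BINARY)
  end

  attr_reader :read_buffer

  # Allocate a fresh buffer on every call, so a fiber that yielded inside
  # the read and is later resumed can never clobber bytes another fiber
  # appended to @read_buffer in the meantime.
  def fill_read_buffer(size = BLOCK_SIZE)
    input_buffer = String.new(encoding: Encoding::BINARY)

    if @io.read(size, input_buffer)
      @read_buffer << input_buffer # append; never overwrite existing data
      true
    else
      false # end of stream
    end
  end
end
```

With a fresh buffer per call, the worst a resumed fiber can do is append its own bytes; previously buffered data is untouched, at the cost of one allocation per read.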

cc @fables-tales

Types of Changes

  • Bug fix.


maruth-stripe · Oct 11 '23 20:10

In principle I agree with your assessment and the proposed fix. However, I question the premise: in what scenario do we read from the same stream using two fibers, and how is any kind of correctness maintained? I think the expectation is that only one fiber uses a stream; otherwise, without any other synchronisation, I'm not sure what we are expecting.

ioquatix · Oct 11 '23 22:10