
Fix race condition in `fill_read_buffer`

Open · maruth-stripe opened this issue 2 years ago · 1 comment

`fill_read_buffer` has a race condition that can lead to data loss. Consider the following sequence of events:

  • init
  • Fiber 1 issues a read of 16K bytes. This invokes `fill_read_buffer` with a size of `BLOCK_SIZE` (say, 64K) bytes.
  • `@read_buffer` is currently empty, so (as an optimization) `fill_read_buffer` reads directly into it: `input_buffer` aliases `@read_buffer`, and we call `@io.read_nonblock(size, input_buffer, exception: false)`.
  • Suppose the underlying `@io` is a socket and the socket is empty. Then `@io.read_nonblock` will yield (on `stable-v1` because of the `async_send` instrumentation, and on `main` because of the `io_read` hook in `rb_fiber_scheduler_io_read_memory` in the VM), leading to `io_wait` when the read would have blocked.
  • Suppose Fiber 2 issues a read of 16K bytes. This also invokes `fill_read_buffer` with a size of `BLOCK_SIZE` (64K) bytes.
  • Suppose Fiber 2's read succeeds: it consumes 16K bytes, leaving 48K bytes in `@read_buffer`.
  • Fiber 1 resumes and retries `@io.read_nonblock(size, input_buffer, exception: false)`.
  • The read succeeds. But `read_nonblock` replaces the contents of `input_buffer`, which still aliases `@read_buffer`, losing the 48K bytes Fiber 2 left there.
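The last step hinges on how Ruby's read methods treat an output buffer: they replace its contents rather than append to them. A small demonstration, using `StringIO` and blocking `read` as a stand-in for the socket and `read_nonblock`:

```ruby
require "stringio"

io = StringIO.new("fresh data")

# Pretend this buffer still holds bytes another fiber buffered earlier.
buffer = +"48K of previously buffered bytes"

# read(length, outbuf) truncates outbuf and overwrites it with the new
# bytes; the previous contents are gone.
io.read(5, buffer)

buffer # => "fresh"
```

In the race above, `buffer` is `@read_buffer` itself, so the overwrite silently discards data that had already been received.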

This PR fixes the race condition by passing a fresh buffer to `read_nonblock` on every call.
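A minimal sketch of the fixed pattern. This is not the library's actual code: the `BufferedReader` class is hypothetical, and blocking `StringIO#read` stands in for `read_nonblock` plus the fiber scheduler. The point is only that each call allocates its own buffer and appends:

```ruby
require "stringio"

class BufferedReader
  BLOCK_SIZE = 64 * 1024

  def initialize(io)
    @io = io
    @read_buffer = String.new(encoding: Encoding::BINARY)
  end

  attr_reader :read_buffer

  # Allocate a fresh buffer on every call, so a fiber that yielded inside
  # the read and is later resumed can never clobber bytes another fiber
  # appended to @read_buffer in the meantime.
  def fill_read_buffer(size = BLOCK_SIZE)
    input_buffer = String.new(encoding: Encoding::BINARY)

    if @io.read(size, input_buffer)
      @read_buffer << input_buffer # append; never overwrite existing data
      true
    else
      false # end of stream
    end
  end
end
```

With a fresh buffer per call, the worst a resumed fiber can do is append its own bytes; previously buffered data is untouched, at the cost of one allocation per read.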

cc @fables-tales

Types of Changes

  • Bug fix.


maruth-stripe · Oct 11 '23 20:10

In principle I agree with your assessment and the proposed fix. However, I question the premise: in what scenario do we read from the same stream using two fibers, and how is any kind of correctness maintained? I think the expectation is that only one fiber uses a stream; otherwise, without any other synchronisation, I'm not sure what we are expecting.

ioquatix · Oct 11 '23 22:10