rocksdb
WritableFileWriter::WriteDirect's buf_.RefitTail() causes redundant disk write when an immediate Flush() is invoked right after.
In the Log Writer's WritableFileWriter::WriteDirect(), an unaligned write results in the buffer being rounded up to an aligned size. It also leaves "leftover_tail" bytes, which are intended to be merged with a subsequent write that overwrites the padded region.
buf_.RefitTail() is invoked inside WriteDirect(). It resets the buffer but still leaves the tail bytes marked as valid, so after RefitTail() buf_.CurrentSize() equals the number of tail bytes. If WritableFileWriter::Flush() is invoked right after (before a subsequent write arrives), the logic sees a non-zero buf_.CurrentSize() and flushes the tail bytes again. Consequently, we're seeing 2 disk writes for every 1 unaligned direct I/O write.
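The sequence described above can be sketched with a small toy model. This is not RocksDB code: `ToyDirectWriter` and all of its members are hypothetical names, and the model only tracks sizes, assuming the alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the reported behavior; it only tracks byte counts.
struct ToyDirectWriter {
  std::size_t alignment;                   // assumed to be a power of two
  std::size_t buffered = 0;                // stands in for buf_.CurrentSize()
  std::vector<std::size_t> device_writes;  // size of each write issued to disk

  explicit ToyDirectWriter(std::size_t a) : alignment(a) {
    assert(a > 0 && (a & (a - 1)) == 0);
  }

  // Pad the buffered bytes up to the alignment, issue one write,
  // then keep only the unaligned tail in the buffer.
  void WriteDirect() {
    std::size_t whole = buffered & ~(alignment - 1);  // bytes in full blocks
    std::size_t tail = buffered - whole;              // the leftover tail
    std::size_t padded = (buffered + alignment - 1) & ~(alignment - 1);
    device_writes.push_back(padded);
    buffered = tail;  // models RefitTail(): the tail still counts as valid data
  }

  // A write that is pushed to disk immediately.
  void Write(std::size_t n) {
    buffered += n;
    WriteDirect();
  }

  // Flush() only checks the buffer size, so it cannot tell that the tail
  // was already written as part of the previous padded write.
  void Flush() {
    if (buffered > 0) WriteDirect();
  }
};
```

In this model, one unaligned `Write()` followed by `Flush()` records two entries in `device_writes`, and the second entry covers only the padded tail.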
The above scenario is very common when we enable direct_io and sync (sync invokes Flush()).
Expected behavior
The tail bytes should not be flushed again.
Actual behavior
The tail bytes were flushed.
Steps to reproduce the behavior
- Set up a WritableFileWriter to use direct_io
- Disable manual sync (this causes a sync to occur after every write)
- Send an unaligned write (e.g. 1030 bytes). In my test the alignment size is 512 B.
- Verify that WriteDirect() issues the write with 1536 bytes, and that the sync() then follows up with a 512-byte write to flush the tail bytes.
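The sizes in the last step follow from the alignment arithmetic. The helpers below are a hypothetical sketch (none of these names come from RocksDB) that assumes a power-of-two alignment and computes the padded write size, the leftover tail, and the size of the redundant second write.

```cpp
#include <cassert>
#include <cstddef>

// Round n up or down to a multiple of a power-of-two alignment.
constexpr std::size_t RoundUp(std::size_t n, std::size_t alignment) {
  return (n + alignment - 1) & ~(alignment - 1);
}
constexpr std::size_t RoundDown(std::size_t n, std::size_t alignment) {
  return n & ~(alignment - 1);
}

struct WriteSizes {
  std::size_t first_write;   // padded size of the initial direct write
  std::size_t tail;          // bytes left in the buffer afterwards
  std::size_t second_write;  // padded size of the redundant flush (0 if no tail)
};

inline WriteSizes ExpectedWrites(std::size_t n, std::size_t alignment) {
  assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  std::size_t tail = n - RoundDown(n, alignment);
  return WriteSizes{RoundUp(n, alignment), tail, RoundUp(tail, alignment)};
}
```

For a 1030-byte write with 512-byte alignment this gives a 1536-byte first write, a 6-byte tail (1030 - 1024), and a 512-byte second write, matching the numbers in the steps above.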
Nice finding. We should track whether the buffer is clean or dirty, and skip the write before sync when it's clean.
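One possible reading of this suggestion, as a size-only toy model rather than a patch: `ToyWriterWithDirtyFlag` and its members are hypothetical names, and the real change would live in WritableFileWriter and could look quite different.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy writer that remembers whether the buffered bytes
// have already reached the disk.
class ToyWriterWithDirtyFlag {
 public:
  explicit ToyWriterWithDirtyFlag(std::size_t alignment)
      : alignment_(alignment) {
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  }

  // New bytes make the buffer dirty.
  void Append(std::size_t n) {
    if (n == 0) return;
    buffered_ += n;
    dirty_ = true;
  }

  // Issue one padded write, keep the tail for merging with the next
  // append, and mark the buffer clean because the tail was included.
  void WriteDirect() {
    if (buffered_ == 0) return;
    std::size_t whole = buffered_ & ~(alignment_ - 1);
    ++device_writes_;
    buffered_ -= whole;
    dirty_ = false;
  }

  // Skip the write when the buffer holds only already-written tail bytes.
  void Flush() {
    if (buffered_ > 0 && dirty_) WriteDirect();
  }

  std::size_t device_writes() const { return device_writes_; }
  std::size_t buffered() const { return buffered_; }

 private:
  std::size_t alignment_;
  std::size_t buffered_ = 0;
  std::size_t device_writes_ = 0;
  bool dirty_ = false;
};
```

With the flag, a `Flush()` that directly follows `WriteDirect()` issues no write, while the tail stays buffered so that a later append can still be merged with it and written out.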
Hey!
I attended the in-person RocksDB meetup at Menlo Park last month.
I just took a look at this today. I am new and have just started contributing to this project. Would you mind walking me through getting started, such as setting up and testing that this issue happens? Once it runs, I will be better able to investigate and find a fix for the problem mentioned above. @ajkr @jsygit @akankshamahajan15
I went through the WriteDirect() function in class WritableFileWriter and studied buf_.RefitTail(file_advance, leftover_tail);
Thanks, 🙂