
DirectIO WAL Write does not honor DBOptions.use_fsync

Open jsygit opened this issue 1 year ago • 2 comments

In my test using direct I/O for WAL writes (on ext4), I noticed that direct I/O WAL writes do not sync metadata even when I set DBOptions.use_fsync = true. I understand that the data itself is persisted with direct I/O, but the file metadata is not flushed. Doesn't that expose us to metadata loss on a sudden power outage? Please correct me if I'm wrong.

In the code, WritableFileWriter::Sync(bool use_fsync) skips SyncInternal when direct I/O is in use.


jsygit, Feb 22 '24 19:02

I tried a quick fix: appending the O_DSYNC flag in env_posix.cc's OpenWritableFile() for direct I/O. With that change, I could see the sync happening via kernel tracing and blktrace.
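For readers who want to see the shape of that change, here is a minimal sketch of the idea. It is not the actual RocksDB code: `WalOpenFlags` and its parameters are hypothetical names, and only the `open(2)` flags themselves are real. With O_DSYNC set, each write returns only after the data and the metadata needed to read it back (such as the file size) have reached the device.

```cpp
#include <fcntl.h>

// Hypothetical helper, not part of RocksDB: builds the open(2) flags for a
// WAL file. When direct I/O is requested and force_dsync is set, O_DSYNC is
// added so every write behaves as if followed by fdatasync().
int WalOpenFlags(bool use_direct_io, bool force_dsync) {
  int flags = O_WRONLY | O_CREAT | O_CLOEXEC;
#ifdef O_DIRECT  // O_DIRECT is Linux-specific; skip it where undefined.
  if (use_direct_io) {
    flags |= O_DIRECT;
  }
#endif
  if (use_direct_io && force_dsync) {
    flags |= O_DSYNC;
  }
  return flags;
}
```

A caller would pass the result to `open(path, WalOpenFlags(true, true), 0644)`. Note that O_DSYNC makes every write synchronous, so this trades throughput for durability regardless of whether the application asked for a sync.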

jsygit, Feb 22 '24 19:02

We don't use direct I/O for WAL files: see https://github.com/facebook/rocksdb/wiki/Direct-IO and https://github.com/facebook/rocksdb/issues/12136.

> I noticed direct io WAL writes do not do metadata sync even when I have DBOptions.use_fsync = true.

Did you set WriteOptions::sync to true or call SyncWAL()? Setting DBOptions.use_fsync to true only means we use fsync instead of fdatasync when sync is needed.

cbi42, Feb 22 '24 22:02