risinglight
risinglight copied to clipboard
storage: support direct I/O
Now we use buffered I/O, which might make benchmarking inaccurate.
- [ ] Align blocks in column builder
- [ ] Support direct I/O in column reader
And it will also take up precious memory of the server...
Let me have a try!
Not all fs support O_DIRECT
, such as tmpfs
which we have tested over it.
test storage::secondary::column::primitive_column_factory::tests::test_scan_rle_i32 ... FAILED
test storage::secondary::rowset::rowset_iterator::tests::test_rowset_iterator_with_filter ... FAILED
test storage::secondary::column::primitive_column_factory::tests::test_scan_i32 ... FAILED
test storage::secondary::rowset::rowset_iterator::tests::test_rowset_iterator ... FAILED
test storage::secondary::column::primitive_column_factory::tests::test_skip ... FAILED
test storage::secondary::column::primitive_column_factory::tests::test_skip_multiple_times ... FAILED
test storage::secondary::rowset::disk_rowset::tests::test_get_block ... FAILED
test storage::secondary::rowset::rowset_writer::tests::test_rowset_flush ... ok
failures:
---- storage::secondary::column::primitive_column_factory::tests::test_scan_rle_i32 stdout ----
thread 'storage::secondary::column::primitive_column_factory::tests::test_scan_rle_i32' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })
disabled backtrace', src/storage/secondary/rowset/disk_rowset.rs:243:14
---- storage::secondary::rowset::rowset_iterator::tests::test_rowset_iterator_with_filter stdout ----
thread 'storage::secondary::rowset::rowset_iterator::tests::test_rowset_iterator_with_filter' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })
disabled backtrace', src/storage/secondary/rowset/disk_rowset.rs:198:14
Here are possible solutions:
- Check the fs capability while starting up.
- Add a new flag or feature.
- Remove
O_DIRECT
if we meet error whileopen
.
I would prefer making O_DIRECT
part of the I/O engine.
https://github.com/risinglightdb/risinglight/blob/2e1b9fc53b5e4215b9fc994b77f1c8780a4196df/src/storage/secondary/options.rs#L8-L15
We can add a new DirectPositionedRead
engine, and use that engine by default. For tmpfs and unit tests, we still use the original PositionedRead
engine without direct I/O.
Thanks for your contribution!
So we don't need to care about index
here? It seems that IoBackend
only affects column files:
let column = Column::new(
ColumnIndex::from_bytes(&index_content)?,
match io_backend {
IOBackend::NormalRead => {
ColumnReadableFile::NormalRead(Arc::new(Mutex::new(file.into_std().await)))
}
IOBackend::PositionedRead => {
ColumnReadableFile::PositionedRead(Arc::new(file.into_std().await))
}
},
block_cache.clone(),
BlockCacheKey::default().rowset(rowset_id).column(id as u32),
);
Yes. Index is small, and I'd like to ignore it for I/O backends. If in the future it would become large (like the index in RocksDB), we can migrate it to use the same I/O backend as column blocks.
Shall we support direct I/O on MacOS?
We can disable the cache with fcntl(fd, F_NOCACHE, 1)
.
I have lost interest in this issue. Feel free to take this~
Previous work: https://github.com/risinglightdb/risinglight/pull/532