kvrocks
Rework compaction filter when blobs are used
Search before asking
- [X] I had searched in the issues and found no similar issues.
Motivation
We have quite big messages (~100 MiB), so we decided to use BlobDB to decrease pressure on disk. But this decreases write throughput and increases read IOPS, because RocksDB reads data from blob files.
The current SubKeyFilter fetches the values of keys only to check bitmap emptiness. I think that's too wasteful. Every compaction is painful.
Is it possible to implement FilterBlobByKey?
Solution
I've made an implementation of FilterBlobByKey. We don't use bitmaps, so it works for us. It almost completely removes the blob reads.
What do you think about it?
https://github.com/apache/incubator-kvrocks/commit/299f45a680970c84ac75759848b6fb1b9f4dd6fa
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
Cool. @ShooterIT @caipengbo @shangxiaoxiong, could you have a look if you're free?
@tufitko This is a good optimization point, in most cases we can make a decision about a key-value solely based on the key in compaction filter.
As you said, we can implement the FilterBlobByKey interface to optimize it:
- For the bitmap type, we can return `kUndetermined` in `FilterBlobByKey` to read the value further.
- For the other types, we just need to read the key through `FilterBlobByKey`.
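The two cases above can be modelled roughly as follows. This is a self-contained sketch that does not link against RocksDB or kvrocks: `Decision` mirrors only the three `CompactionFilter::Decision` values relevant here, while `RedisType`, `SubKeyInfo`, `DecideByKey`, `DecideByValue` and `FilterSubKey` are hypothetical names invented for illustration. It also assumes that an "empty" bitmap fragment is one whose bytes are all zero.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Mirrors the subset of rocksdb::CompactionFilter::Decision used in this sketch.
enum class Decision { kKeep, kRemove, kUndetermined };

// Hypothetical stand-ins for what the filter learns from the key and its metadata.
enum class RedisType { kHash, kList, kSet, kZSet, kBitmap };

struct SubKeyInfo {
  RedisType type;
  bool metadata_expired;  // parent key expired or replaced (assumption)
};

// Key-only step, shaped like FilterBlobByKey: no blob value is needed here.
Decision DecideByKey(const SubKeyInfo& info) {
  if (info.metadata_expired) return Decision::kRemove;
  // Bitmaps still need the value to check emptiness, so defer the decision.
  if (info.type == RedisType::kBitmap) return Decision::kUndetermined;
  return Decision::kKeep;
}

// Value-based step, reached only for bitmaps.
// Assumption: an empty fragment consists solely of zero bytes.
Decision DecideByValue(const std::string& value) {
  bool all_zero = value.find_first_not_of('\0') == std::string::npos;
  return all_zero ? Decision::kRemove : Decision::kKeep;
}

// Combines both steps; read_blob models the expensive blob-file read.
Decision FilterSubKey(const SubKeyInfo& info,
                      const std::function<std::string()>& read_blob) {
  Decision d = DecideByKey(info);
  if (d != Decision::kUndetermined) return d;
  return DecideByValue(read_blob());
}

// Usage: a live hash field is kept without touching the blob file.
inline void UsageExample() {
  int blob_reads = 0;
  auto read_blob = [&blob_reads]() {
    ++blob_reads;
    return std::string("payload");
  };
  assert(FilterSubKey({RedisType::kHash, false}, read_blob) == Decision::kKeep);
  assert(blob_reads == 0);
}
```

In this model only bitmap subkeys ever trigger `read_blob`, which matches the observation that a deployment without bitmaps sees almost no blob reads during compaction.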
@caipengbo So, can I create a PR based on my commit? Should it be optional, or is there nothing to worry about with bitmaps?
@tufitko Good job! I don't think there's anything to worry about with bitmap, just do it!