kvrocks

Rework compaction filter when blobs are used

Open tufitko opened this issue 3 years ago • 4 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

Motivation

We have quite large messages (~100Mib), so we decided to use BlobDB to reduce pressure on the disk. But this decreases write throughput and increases read IOPS, because RocksDB reads data from the blob files. The current SubKeyFilter fetches the values of keys only to check bitmap emptiness, which I think is too wasteful. Every compaction is painful. Would it be possible to implement FilterBlobByKey?

Solution

I've made an implementation of FilterBlobByKey. We don't use bitmaps, so it works for us. It almost completely removes the blob reads. What do you think about it? https://github.com/apache/incubator-kvrocks/commit/299f45a680970c84ac75759848b6fb1b9f4dd6fa

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

tufitko Sep 18 '22 09:09

Cool. @ShooterIT @caipengbo @shangxiaoxiong, could you take a look if you're free?

git-hulk Sep 18 '22 10:09

@tufitko This is a good optimization point. In most cases the compaction filter can make a decision about a key-value pair based solely on the key.

Just as you said, we can introduce the new FilterBlobByKey interface to optimize it:

  • For the bitmap type, we can return kUndetermined in FilterBlobByKey so that the value is read afterwards.
  • For the other types, we only need to look at the key in FilterBlobByKey.

caipengbo Sep 19 '22 07:09
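
For readers following the thread, the two cases in the comment above can be sketched roughly as follows. This is only an illustration, not the code from the linked commit. To compile without RocksDB it uses a stand-in `Decision` enum that mirrors the names in `rocksdb::CompactionFilter::Decision`. `ValueType`, `KeyInfo`, its `expired` flag, and `FilterBlobByKeySketch` are hypothetical names made up for the example. How kvrocks actually works out a subkey's type and liveness from the key is assumed to happen elsewhere.

```cpp
// Stand-in for rocksdb::CompactionFilter::Decision (assumption: only the
// three values needed for this sketch are modelled here).
enum class Decision { kKeep, kRemove, kUndetermined };

// Hypothetical data types a subkey may belong to.
enum class ValueType { kHash, kList, kSet, kBitmap };

// Hypothetical summary of what can be learned from the key alone,
// without touching the blob value.
struct KeyInfo {
  ValueType type;
  bool expired;  // assumption: liveness is known from the key's metadata
};

// Decide from the key alone when possible. For bitmaps, defer by returning
// kUndetermined so the value is read and checked afterwards.
Decision FilterBlobByKeySketch(const KeyInfo& info) {
  if (info.type == ValueType::kBitmap) {
    return Decision::kUndetermined;
  }
  return info.expired ? Decision::kRemove : Decision::kKeep;
}
```

With this shape, only bitmap subkeys still trigger a blob read during compaction. All other types are kept or dropped without loading the value.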

@caipengbo So, can I create a PR based on my commit? Should it be optional, or is there nothing to worry about with bitmaps?

tufitko Sep 19 '22 10:09

@tufitko Good job! I don't think there's anything to worry about with bitmap, just do it!

caipengbo Sep 19 '22 11:09