kvrocks
Enhance the implementation of Random Sample Keys in kvrocks
Search before asking
- [X] I had searched in the issues and found no similar issues.
Motivation
Random sample keys are supported in kvrocks. When running commands like `SPOP`, scanning all the data can become a bottleneck. Currently, the implementation is:
- Get all sampled values
- Randomly filter K values
- Return them

Unlike Redis, its complexity is always O(N), where N is the cardinality of the set, which is extremely high. It's a bit hard to get "random" in kvrocks because we're built on RocksDB.
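The steps above can be sketched like this (a toy Python model, not the actual kvrocks C++ code; `members` stands in for a full scan of the set's column family):

```python
import random

def spop_current(members, k):
    """Sketch of the current approach: scan every member first (O(N)),
    then randomly keep k of them."""
    scanned = list(members)           # full scan of all members
    k = min(k, len(scanned))          # clamp k to the set cardinality
    return random.sample(scanned, k)  # random filtering of k values
```

Even when k is tiny, the full scan dominates, which is the bottleneck described above.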
Solution
There are some rules we can apply:
- Maintain an equi-depth histogram if random sampling is frequent. This is the best option in that case; however, we'd pay a maintenance cost on every write. TiKV uses size-based sampling to maintain its split ranges.
So, instead, maybe we can enhance the current implementation. For example:
- Generate all the random "indices" to fetch, and sort them
- Use an iterator to fetch only those indices

The complexity is still O(N), but the performance should improve.
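A minimal sketch of that idea (an assumption about the shape of the improvement, not the actual patch): draw the k indices up front, sort them, and collect them in a single forward pass, stopping as soon as the largest index has been fetched.

```python
import random

def spop_sorted_indices(member_iter, size, k):
    """Pick k random members in one forward pass.

    member_iter: a forward iterator over the set's members
    size: the set cardinality (from metadata)
    """
    k = min(k, size)
    # Sorted random indices let us consume the iterator strictly forward.
    targets = sorted(random.sample(range(size), k))
    picked, t = [], 0
    for idx, member in enumerate(member_iter):
        if t == len(targets):
            break                     # all requested indices collected; stop early
        if idx == targets[t]:
            picked.append(member)
            t += 1
    return picked
```

Worst case is still O(N), but the scan now ends at the largest sampled index instead of always walking the whole set.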
Are you willing to submit a PR?
- [x] I'm willing to submit a PR!
I'm trying to implement this. During my implementation, I found a problem. The steps are:
1. Get the random set size
2. Generate all the random fetching indices
3. Traverse the set
Steps 1-3 must be done in one snapshot; otherwise, the following can happen:
- Get all indices, size == 10
- The random pick needs to access the 10th member
- A user deletes an element
- The 10th member can no longer be fetched

So, GetMetadata and traversing the set should be done in one RocksDB snapshot. Do you think this is safe and OK? @git-hulk @PragmaTwice
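The race above can be modeled with a toy sketch (a Python list stands in for the column family, a frozen tuple stands in for a RocksDB snapshot; purely illustrative):

```python
def fetch_by_index(view, idx):
    """Return the idx-th member of an ordered view, or None if out of range."""
    return view[idx] if idx < len(view) else None

# Without a snapshot: the size is read first, a concurrent delete shrinks
# the set, and the "10th member" (index 9) no longer exists.
live = [f"m{i}" for i in range(10)]
size = len(live)                     # step 1: GetMetadata sees size == 10
live.pop()                           # concurrent delete of one member
assert fetch_by_index(live, size - 1) is None   # step 3 fails

# With a snapshot: metadata and traversal read the same frozen view,
# so the concurrent delete cannot invalidate the sampled indices.
live = [f"m{i}" for i in range(10)]
snapshot = tuple(live)               # stands in for a RocksDB snapshot
size = len(snapshot)
live.pop()                           # delete does not affect the snapshot
assert fetch_by_index(snapshot, size - 1) == "m9"
```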
I'm wondering if it'd be better to sacrifice the scope of the candidate set to simplify the implementation, like below:
- if size <= N (e.g. 128/256), then iterate all members and randomly pick one of them.
- if size > N, pick randomly among the first N elements if the cached random point is empty. Otherwise, seek from the cached random point and pick randomly among the next N elements, then cache the random point again, or remove the cached random point if it reaches the end.
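A sketch of that cached-random-point idea (an assumption about the scheme, with a plain list standing in for the iterator and an integer cursor standing in for the cached seek key):

```python
import random

def sample_with_cursor(members, n, cursor):
    """Pick one member from a window of at most n members starting at the
    cached cursor; return the pick and the new cursor (None at the end)."""
    window = members[cursor:cursor + n]  # seek to the cursor, read up to n
    pick = random.choice(window)
    next_cursor = cursor + n
    if next_cursor >= len(members):
        next_cursor = None               # reached the end: drop the cache
    return pick, next_cursor
```

Each call only touches N elements, so the cost per sample is bounded; the trade-off is that one call can only return a member from the current window, not the whole set.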
I thought of using this previously; it's what I called "statistics" in this issue. But where should we store this, and how do we update it?
We're now using an LRU to cache the iterator key while scanning the DB/hash/set; maybe we can do this in the same way.
That's a bit complex 🤔, in my current implementation I may just need to iterate the set with one snapshot...
Yes, it should be more complex than the current implementation. Then I think the implementation with the same snapshot is good.
Yeah, another point is that supporting metadata reads via snapshot can enhance our isolation. It might slightly affect LSM-tree garbage collection, but I think it's OK.
Aha: https://github.com/apache/kvrocks/commit/95301c5a673af83ad85e6dfb63da4db41d83afb8#diff-4bae2c37c513c915c0528134abd47edff2a52833806777feb088b6d09990a74e
@git-hulk should we sample at most 60 keys in `SPOP`?
@mapleFU For kvrocks, I think it's fine to only randomize part of them if there are too many keys.