kvrocks

Delete expired big keys during first query

Open · Phoeniwx opened this issue 3 years ago • 1 comment

Search before asking

  • [X] I had searched in the issues and found no similar issues.

Motivation

We ran into slow-log entries when accessing a rather small key. The key belongs to cold data, and kvrocks took 0.82s to fetch it every time.

10.x.x.x:6379> strlen "scene:xxx:-1"
(integer) 0
(0.82s)

Here is the perflog. Most of the time went to loading a single block of 73752672 bytes (reading, checksumming, and decompressing it). The same thing happened on every access to the key until a compaction ran.

10.x.x.x:6379> perflog get *
1) 1) (integer) 1
   2) (integer) 1662454072
   3) "strlen"
   4) (integer) 633163
   5) "user_key_comparison_count = 76, block_read_count = 1, block_read_byte = 73752672, block_read_time = 59590859, block_checksum_time = 9643232, block_decompress_time = 350221281, get_read_bytes = 247610038, get_snapshot_time = 883, get_from_memtable_time = 4306, get_from_memtable_count = 1, get_post_process_time = 1681, get_from_output_files_time = 426225825, new_table_block_iter_nanos = 426163267, block_seek_nanos = 36515, bloom_sst_hit_count = 1, bloom_sst_miss_count = 2, "
   6) "thread_pool_id = 4, bytes_read = 73752672, read_nanos = 59588262, "
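To make the breakdown easier to read, here is a small sketch that parses the perf-context string from the perflog entry above. It assumes the RocksDB perf-context timers are reported in nanoseconds. The helper names (`parse_perf_context`, `time_breakdown_ms`) are made up for illustration, and only a subset of the counters is repeated.

```python
from typing import Dict

# Subset of the counters copied from the perflog entry above.
PERF_CONTEXT = (
    "user_key_comparison_count = 76, block_read_count = 1, "
    "block_read_byte = 73752672, block_read_time = 59590859, "
    "block_checksum_time = 9643232, block_decompress_time = 350221281, "
    "get_from_output_files_time = 426225825, "
)


def parse_perf_context(text: str) -> Dict[str, int]:
    """Parse 'name = value, name = value, ' into a dict of integers."""
    counters = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the trailing comma
        name, _, value = part.partition("=")
        counters[name.strip()] = int(value.strip())
    return counters


def time_breakdown_ms(counters: Dict[str, int]) -> Dict[str, float]:
    """Convert selected timers to milliseconds (assumes nanosecond input)."""
    names = (
        "block_read_time",
        "block_checksum_time",
        "block_decompress_time",
        "get_from_output_files_time",
    )
    return {name: counters[name] / 1e6 for name in names}


if __name__ == "__main__":
    counters = parse_perf_context(PERF_CONTEXT)
    print("block size (MiB):", counters["block_read_byte"] / 2**20)
    for name, ms in time_breakdown_ms(counters).items():
        print(name, "=", ms, "ms")
```

Under that assumption, the single block is about 70 MiB, and decompression accounts for roughly 350 ms of the 426 ms spent in get_from_output_files_time.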

After discussing with the developers, we found this is caused by an expired big key. On every query, kvrocks reads the big key to retrieve its TTL, which takes a lot of IO time.

Solution

Maybe we can solve this by deleting this kind of expired big key on the first query that finds it expired, so that later queries no longer have to read the big key. Or there may be a better solution. Hope you guys can solve this, thanks.
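A minimal sketch of the idea, using a hypothetical in-memory store rather than kvrocks' actual storage code. `LazyExpiringStore`, `Record`, and `big_reads` are made-up names, and the sketch assumes the expiry time is stored together with the value, so checking the TTL means loading the whole record.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Record:
    expire_at: float  # absolute expiry time in seconds; 0 means no TTL
    value: bytes


class LazyExpiringStore:
    """Toy key-value store that deletes an expired key on first access."""

    def __init__(self, clock: Callable[[], float] = time.time):
        self._data: Dict[str, Record] = {}
        self._clock = clock
        self.big_reads = 0  # how many times a full record had to be loaded

    def set(self, key: str, value: bytes, ttl: Optional[float] = None) -> None:
        expire_at = self._clock() + ttl if ttl is not None else 0.0
        self._data[key] = Record(expire_at, value)

    def get(self, key: str) -> Optional[bytes]:
        record = self._data.get(key)
        if record is None:
            return None  # cheap path: nothing to load
        # The TTL lives next to the value, so the whole record is loaded.
        self.big_reads += 1
        if record.expire_at and record.expire_at <= self._clock():
            # Expired: delete now so later queries take the cheap path.
            # A real store would need to guard this against concurrent writes.
            del self._data[key]
            return None
        return record.value

    def strlen(self, key: str) -> int:
        value = self.get(key)
        return len(value) if value is not None else 0
```

In this sketch, the first `strlen` after expiry still pays for one full read, but it removes the key, so repeated queries stop loading the big record instead of waiting for a compaction.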

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

Phoeniwx · Sep 15 '22 03:09

Thanks for your feedback. Maybe we need a smarter compaction strategy that compacts a range on the fly when it finds too many deletion tombstones or expired keys in it.

git-hulk · Sep 15 '22 04:09