
Large .sst and .log files (unsure about compression feature)

Open alija83 opened this issue 1 year ago • 2 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

Motivation

I was playing with kvrocks (version 2.8.0) today, and my goal was to activate compression and reduce logging. I am not sure whether I have set the configuration options correctly, but here is what I have noticed.

My goal is to use kvrocks to store large volumes of KVs while ensuring that the data (KVs) remains compressed and uses the least amount of disk space.

Here is my configuration:

dir /kvrocks/data

# General
port 6376
bind 0.0.0.0

rocksdb.compression zstd

backup-dir /kvrocks/backup
#log-dir /kvrocks
log-level error
log-dir stdout
log-retention-days 0
pidfile /kvrocks/kvrocks.pid

I also tried the lz4 option for rocksdb.compression, but nothing changed: the .sst files do not look compressed, and the .log files are quite large as well.

cd db
/kvrocks/data/db # du -hs *
4.0K    000019.sst
624.0K  000605.sst
83.9M   002080.sst
70.4M   002082.log
102.6M  002083.sst
16.0G   archive

cd archive
ls -alsrht

64544 -rw-r--r-- 1 root root 63.0M Jun 14 20:59 002162.log
64448 -rw-r--r-- 1 root root 62.9M Jun 14 20:59 002169.log
64696 -rw-r--r-- 1 root root 63.2M Jun 14 20:59 002165.log

127.0.0.1:6376> keys '*' pattern
16521) ....

It has 16521 records.

16GB is just too much.

Solution

I am not sure how it should work or be implemented, but as a user I expected to see the .sst files compressed and to be able to disable the logs if desired.

Instead of seeing:

/kvrocks/data # du -hs *
22.5G   db

it should have been something like: 25MB

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

alija83 avatar Jun 14 '24 21:06 alija83

There are basically two points in this issue:

  • The .log files are actually the WAL (write-ahead log) in rocksdb, and you can configure how much of it is kept via options like rocksdb.wal_ttl_seconds or rocksdb.wal_size_limit_mb.
  • Even if you specify rocksdb.compression, the existing SST files will not be compressed immediately; they pick up the new setting only when they are rewritten. So compaction seems more important for your issue.

cc @git-hulk

PragmaTwice avatar Jun 15 '24 01:06 PragmaTwice

for .log files, it's actually WAL (write ahead logging) in rocksdb, and you can configure it via some options like rocksdb.wal_ttl_seconds or rocksdb.wal_size_limit_mb.

Yes, you can reduce those two if you would like to keep fewer logs.
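As a rough sketch of what reducing those two could look like in the config file posted above: the option names are the ones mentioned in this thread, but the values below are purely illustrative assumptions, not recommendations or defaults, so check the comments in your kvrocks.conf before picking numbers.

```
# Illustrative values only (assumptions, not tuned recommendations).

# Keep archived WAL files for at most 10 minutes.
rocksdb.wal_ttl_seconds 600

# Cap the total size of archived WAL files at roughly 512 MB.
rocksdb.wal_size_limit_mb 512
```

With lower limits, the archive directory under db should stay much smaller than the 16.0G shown in the report. Keeping less WAL may have side effects for anything that reads it, so lower the values gradually.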

git-hulk avatar Jun 15 '24 02:06 git-hulk