rocksdb
When using db_paths, RocksDB writes data only to db_paths[0] and db_paths[3]
RocksDB options
options_.db_paths = {{"/mnt/rocksdb/a", 1000 * 1000 * 1000},
{"/mnt/rocksdb/b", 1000 * 1000 * 1000},
{"/mnt/rocksdb/c", 1000 * 1000 * 1000},
{"/mnt/rocksdb/d", 1000 * 1000 * 1000}};
After putting some data, SST files are written to /mnt/rocksdb/a and /mnt/rocksdb/d, but /mnt/rocksdb/b and /mnt/rocksdb/c stay empty:
du -h /mnt/rocksdb/
4.0K /mnt/rocksdb/b
4.0K /mnt/rocksdb/c
958M /mnt/rocksdb/a
944M /mnt/rocksdb/d
1.9G /mnt/rocksdb/
Expected behavior
According to the documentation at https://github.com/facebook/rocksdb/blob/main/include/rocksdb/options.h#L672, I expected RocksDB to write SST files to the paths in order: /mnt/rocksdb/a => /mnt/rocksdb/b => /mnt/rocksdb/c => /mnt/rocksdb/d
Actual behavior
RocksDB writes SST files to /mnt/rocksdb/a => /mnt/rocksdb/d, and /mnt/rocksdb/b and /mnt/rocksdb/c are skipped
Steps to reproduce the behavior
// 1. set db_paths
options_.db_paths = {{"/mnt/rocksdb/a", 1000 * 1000 * 1000},
{"/mnt/rocksdb/b", 1000 * 1000 * 1000},
{"/mnt/rocksdb/c", 1000 * 1000 * 1000},
{"/mnt/rocksdb/d", 1000 * 1000 * 1000}};
// 2. opendb
rocksdb::DB::Open(options_, name, column_families, &cf_handles_, &db_);
// 3. put kv
db_->Put(rocksdb::WriteOptions(), key, value);
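A minimal sketch for checking which db_paths directories actually received SST files after the steps above, as an alternative to eyeballing du output. It uses only the C++17 standard library and assumes that SST files carry the .sst extension and sit directly inside each configured directory. The helper name SstBytesPerPath is hypothetical and not part of RocksDB.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <map>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: sums the sizes of the *.sst files found directly
// inside each directory. Directories that do not exist or cannot be read
// are reported as 0 bytes.
std::map<std::string, std::uintmax_t> SstBytesPerPath(
    const std::vector<std::string>& paths) {
  std::map<std::string, std::uintmax_t> usage;
  for (const auto& path : paths) {
    assert(!path.empty());  // every db_paths entry needs a directory name
    std::uintmax_t total = 0;
    std::error_code ec;
    for (fs::directory_iterator it(path, ec), end; !ec && it != end;
         it.increment(ec)) {
      std::error_code entry_ec;  // per-entry errors must not stop the scan
      if (it->is_regular_file(entry_ec) && it->path().extension() == ".sst") {
        const std::uintmax_t size = it->file_size(entry_ec);
        if (!entry_ec) total += size;
      }
    }
    usage[path] = total;
  }
  return usage;
}
```

Calling it with the four directories from step 1 (for example SstBytesPerPath({"/mnt/rocksdb/a", "/mnt/rocksdb/b", "/mnt/rocksdb/c", "/mnt/rocksdb/d"})) returns one byte count per path, so empty paths show up as 0.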
@xiaobiaozhao Is it possible that all of your data has been compacted into the lowest level? Can you give an example of a complete test and execution showing this issue? I have not had a chance to modify db_bench to show this problem.
All right, let me prepare a minimal reproduction case.
Demo is here https://gist.github.com/xiaobiaozhao/75a0f6d3d3b3f564e28eacd9b85d3c1a
Hi, any updates? Maybe this bug is fixed in the current 8.3.x?
This is rocksdb v8.3.2
du -h /mnt/rocksdb/
471M /mnt/rocksdb/a
4.0K /mnt/rocksdb/c
1.5G /mnt/rocksdb/d
4.0K /mnt/rocksdb/b
@xiaobiaozhao
Please check the level_compaction_dynamic_level_bytes option; in RocksDB v8.3.2 it may be set to true.
If level_compaction_dynamic_level_bytes is enabled, the level that Level 0 compacts into changes from the default Level 1 to the last level, Level 6, so at first Level 0 compacts directly into Level 6. If, after some compaction, Level 6 grows beyond 256 MB (max_bytes_for_level_base), say to 300 MB, base_level moves up to Level 5, whose size limit is 300 MB / 10 = 30 MB. From then on Level 0 compacts directly into Level 5. If Level 5 exceeds 30 MB, say it reaches 50 MB, it has to be compacted with Level 6; afterwards Level 5 is back under 30 MB and Level 6 is slightly larger, say 320 MB, so the Level 5 limit is recomputed from 320 MB as 320 MB / 10 = 32 MB. As writes continue, Level 5 eventually exceeds 256 MB, and base_level has to move up again, to Level 4. Take the larger of the current sizes of Level 5 and Level 6 as MaxSize; the Level 4 limit is then MaxSize / 100, the Level 5 limit is the Level 4 limit times 10, and so on. The relevant code is in VersionStorageInfo::CalculateBaseBytes.
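The arithmetic in the comment above can be sketched as follows. This is a simplified model of the description only, not the actual VersionStorageInfo::CalculateBaseBytes code. It assumes 7 levels, a 256 MB threshold, and a multiplier of 10, and it takes the largest non-L0 level size as its only input. The names LevelTargets and ComputeDynamicTargets are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type: the level that L0 compacts into, plus a target
// size per level. Levels above the base level keep a target of 0 here.
struct LevelTargets {
  int base_level;
  std::vector<std::uint64_t> max_bytes;
};

// Simplified model of the description above (assumed defaults: 7 levels,
// 256 MB threshold, multiplier 10). max_level_size is the size of the
// largest non-L0 level.
LevelTargets ComputeDynamicTargets(std::uint64_t max_level_size,
                                   int num_levels = 7,
                                   std::uint64_t base_bytes = 256ull << 20,
                                   std::uint64_t multiplier = 10) {
  assert(num_levels >= 2 && multiplier >= 2);
  LevelTargets t{num_levels - 1,
                 std::vector<std::uint64_t>(num_levels, 0)};
  std::uint64_t size = max_level_size;
  // Move the base level up one level at a time while the candidate base
  // level would still be larger than the threshold.
  while (t.base_level > 1 && size > base_bytes) {
    --t.base_level;
    size /= multiplier;
  }
  // From the base level down to the last level, each target is the
  // previous one multiplied by the multiplier.
  std::uint64_t target = size;
  for (int level = t.base_level; level < num_levels; ++level) {
    t.max_bytes[level] = target;
    target *= multiplier;
  }
  return t;
}
```

With a 300 MB Level 6 this model gives base level 5 with a 30 MB target, matching the 300 MB / 10 example. Once the largest level reaches 3000 MB, the base level moves to Level 4 with a target of MaxSize / 100 = 30 MB.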
I'll test it when I have time.