mini-lsm
mini-lsm copied to clipboard
fix docs: change some description in w2d3 documentation for tiered compaction trigger condition
- compaction trigger precondition The original text stated that compaction triggers when the number of tiers is 'larger than' num_tiers, which could be interpreted as strictly greater than (>). This has been corrected to 'greater than or equal to (or can say reaches)' to accurately reflect the intended behavior and align with the actual implementation.
- min_merge_width condition same as above, it should be 'at least' not 'more than'
- reduce sorted run conditions
accurately states the aim to reduce tiers to
num_tiers - 1
Is the changes from your output? I'm currently investigating issues related to tier/level id, there might be some mismatches between the write-up and the actual code.
Is the changes from your output? I'm currently investigating issues related to tier/level id, there might be some mismatches between the write-up and the actual code.
In your tutorial, we always set the tier/level id as the first sst id in this tier/level. But when you use the compaction simulator, choose the --dump-real-id option or the default --dump-origin-id option, it will be different. In compaction simulator, u code after apply compaction result, write like this
for file in files {
let new_sst_id = storage.generate_sst_id();
sst_ids.push(new_sst_id);
storage.file_list.insert(new_sst_id, *file);
storage.total_writes += 1;
}
It's not change the actual file in file_list.
so when we use the compaction simulator on default option (dump origin id, it will be use file, it's not change when we apply result),
println!(
"L{level} ({}): {:?}",
files.len(),
files.iter().map(|x| self.file_list[x]).collect::<Vec<_>>()
);
the output will be just like current example in docs
--- After Flush ---
L3 (1): [3]
L2 (1): [2]
L1 (1): [1]
--- Compaction Task ---
--- Compaction Task ---
compaction trigger by space amplification ratio: 200
L3 [3] L2 [2] L1 [1] -> [4, 5, 6]
--- After Compaction ---
L4 (3): [3, 2, 1]
and then, when we use the '--dump-real-id` option, it will use those new generated sst ids; at this time, tier id will observe the rule.
println!("L{level} ({}): {:?}", files.len(), files);
this output maybe more readable and meet our expectations.
--- After Flush ---
L3 (1): [3]
L2 (1): [2]
L1 (1): [1]
--- Compaction Task ---
--- Compaction Task ---
compaction triggered by space amplification ratio: 200
L3 [3] L2 [2] L1 [1] -> [4, 5, 6]
--- After Compaction ---
L4 (3): [4, 5, 6]
So my changes in the example aren't absolutely right. It simply changes the tier id to make it observe tier id equal to the first sst id in the tier.
Hey -- we have updated docs wrt tiered compaction last year, and I guess all the issues should be fixed :) Thank you for the patch anyways!