mini-lsm icon indicating copy to clipboard operation
mini-lsm copied to clipboard

fix docs: change some description in w2d3 documentation for tiered compaction trigger condition

Open jackyYang6 opened this issue 1 year ago • 3 comments

  1. compaction trigger precondition The original text stated that compaction triggers when the number of tiers is 'larger than' num_tiers, which could be interpreted as strictly greater than (>). This has been corrected to 'greater than or equal to (or can say reaches)' to accurately reflect the intended behavior and align with the actual implementation.
  2. min_merge_width condition same as above, it should be 'at least' not 'more than'
  3. reduce sorted run conditions accurately states the aim to reduce tiers to num_tiers - 1

jackyYang6 avatar Jun 29 '24 04:06 jackyYang6

Is the changes from your output? I'm currently investigating issues related to tier/level id, there might be some mismatches between the write-up and the actual code.

skyzh avatar Jul 02 '24 23:07 skyzh

Is the changes from your output? I'm currently investigating issues related to tier/level id, there might be some mismatches between the write-up and the actual code.

In your tutorial, we always set the tier/level id as the first sst id in this tier/level. But when you use the compaction simulator, choose the --dump-real-id option or the default --dump-origin-id option, it will be different. In compaction simulator, u code after apply compaction result, write like this

for file in files {
    let new_sst_id = storage.generate_sst_id();
    sst_ids.push(new_sst_id);
    storage.file_list.insert(new_sst_id, *file);
    storage.total_writes += 1;
}

It's not change the actual file in file_list. so when we use the compaction simulator on default option (dump origin id, it will be use file, it's not change when we apply result),

println!(
    "L{level} ({}): {:?}",
    files.len(),
    files.iter().map(|x| self.file_list[x]).collect::<Vec<_>>()
);

the output will be just like current example in docs

--- After Flush ---
L3 (1): [3]
L2 (1): [2]
L1 (1): [1]
--- Compaction Task ---
--- Compaction Task ---
compaction trigger by space amplification ratio: 200
L3 [3] L2 [2] L1 [1] -> [4, 5, 6]
--- After Compaction ---
L4 (3): [3, 2, 1]

and then, when we use the '--dump-real-id` option, it will use those new generated sst ids; at this time, tier id will observe the rule.

println!("L{level} ({}): {:?}", files.len(), files);

this output maybe more readable and meet our expectations.

--- After Flush ---
L3 (1): [3]
L2 (1): [2]
L1 (1): [1]
--- Compaction Task ---
--- Compaction Task ---
compaction triggered by space amplification ratio: 200
L3 [3] L2 [2] L1 [1] -> [4, 5, 6]
--- After Compaction ---
L4 (3): [4, 5, 6]

jackyYang6 avatar Jul 03 '24 01:07 jackyYang6

So my changes in the example aren't absolutely right. It simply changes the tier id to make it observe tier id equal to the first sst id in the tier.

jackyYang6 avatar Jul 03 '24 01:07 jackyYang6

Hey -- we have updated docs wrt tiered compaction last year, and I guess all the issues should be fixed :) Thank you for the patch anyways!

skyzh avatar May 31 '25 06:05 skyzh