tigerbeetle icon indicating copy to clipboard operation
tigerbeetle copied to clipboard

LSM/Compaction: Performance

Open sentientwaffle opened this issue 3 years ago • 1 comments

Background

Requirements

Current Problems

  • Compaction IO is not spread evenly across beats, causing write stalls at the last beat. (Prior beats are best-effort; the last beat must finish the compaction).
  • Compactions of every level (within a tree) append to the tree's manifest log (ManifestLog.append()) concurrently, violating storage determinism.
  • All compactions are concurrent, leading to extra SSD GC (SSD-internal write amplification) and larger FTL maps.

Solutions

  • [x] https://github.com/tigerbeetledb/tigerbeetle/issues/269
  • [x] https://github.com/tigerbeetledb/tigerbeetle/issues/271
  • [x] https://github.com/tigerbeetledb/tigerbeetle/issues/270
  • [ ] https://github.com/tigerbeetledb/tigerbeetle/issues/273
  • [x] https://github.com/tigerbeetledb/tigerbeetle/issues/272
  • [ ] Move data block: Don't implement this yet; pending benchmarks as we get farther along.
  • [x] https://github.com/tigerbeetledb/tigerbeetle/issues/255
    • This avoids mutations to the Account object tree, allowing it to take advantage of the move-table optimization.
  • [x] For compaction, KMergeTree could be replaced by a 2-merge-tree (since we don't need to merge arbitrary levels). ~Only if benchmarks indicate KMergeTree's heap is actually a problem, though.~ (Just use benchmarks as a sanity-check). (KWayMerge will still be used for range queries).

sentientwaffle avatar Nov 22 '22 18:11 sentientwaffle

Have you looked into deamortized compaction, which gives worst-case guarantees for insertions and deletions? It's quite a bit more complicated than the naive approach but seems doable (I plan to implement it for an in-memory LSM but haven't started): https://jeffe.cs.illinois.edu/teaching/datastructures/notes/01-statictodynamic.pdf

senderista avatar Dec 19 '22 21:12 senderista