tigerbeetle
tigerbeetle copied to clipboard
LSM/Compaction: Performance
Background
Requirements
- Compaction has consistent latency — no write stalls.
- LSM storage must be deterministic.
- When possible, design the IO workload for SSDs by grouping writes by time-of-death.
- Minimize write amplification (see https://github.com/tigerbeetledb/tigerbeetle/pull/244).
Current Problems
- Compaction IO is not spread evenly across beats, causing write stalls at the last beat. (Prior beats are best-effort; the last beat must finish the compaction).
- Compactions of every level (within a tree) append to the tree's manifest log (
ManifestLog.append()) concurrently, violating storage determinism. - All compactions are concurrent, leading to extra SSD GC (SSD-internal write amplification) and larger FTL maps.
Solutions
- [x] https://github.com/tigerbeetledb/tigerbeetle/issues/269
- [x] https://github.com/tigerbeetledb/tigerbeetle/issues/271
- [x] https://github.com/tigerbeetledb/tigerbeetle/issues/270
- [ ] https://github.com/tigerbeetledb/tigerbeetle/issues/273
- [x] https://github.com/tigerbeetledb/tigerbeetle/issues/272
- [ ] Move data block: Don't implement this yet; pending benchmarks as we get farther along.
- [x] https://github.com/tigerbeetledb/tigerbeetle/issues/255
- This avoids mutations to the
Accountobject tree, allowing it to take advantage of the move-table optimization.
- This avoids mutations to the
- [x] For compaction,
KMergeTreecould be replaced by a 2-merge-tree (since we don't need to merge arbitrary levels). ~Only if benchmarks indicateKMergeTree's heap is actually a problem, though.~ (Just use benchmarks as a sanity-check). (KWayMergewill still be used for range queries).
Have you looked into deamortized compaction, which gives worst-case guarantees for insertions and deletions? It's quite a bit more complicated than the naive approach but seems doable (I plan to implement it for an in-memory LSM but haven't started): https://jeffe.cs.illinois.edu/teaching/datastructures/notes/01-statictodynamic.pdf