horaedb
Tracking issue for compaction offload
Describe This Problem
We found in production that SST compaction cannot keep up with SST generation, which leads to poor query performance... However, we are unable to give more resources to compaction to solve the problem, because query and write are more important than compaction on the same node. It is really hard to make a trade-off in resource allocation among query, write, and compaction in the LSM model... We want to compact the generated small SSTs as fast as possible, but we can't tolerate the impact on query and write. I think offloading compaction to separate nodes may be the key to solving this.
Proposal
For supporting compaction offload, we need:
- Special node supporting the remote compaction service
  - [ ] Impl compaction service
- Horaedb node supports submitting the actual compaction work to a remote node
  - [x] Refactor the compaction process and define necessary traits
  - [ ] Impl remote mode compactor based on traits above
- Horaemeta supports managing the special compaction nodes
  - [ ] Impl the ability to manage the compaction nodes
  - [ ] Expose the API for the horaedb node to get the proper remote compaction node
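To make the shape of the proposal more concrete, here is a minimal sketch of how a compaction trait with a local and a remote implementation could fit together. Everything in it is an assumption made for illustration: `CompactionTask`, `CompactionResult`, `Compactor`, `NodePicker`, `StaticPicker`, `LocalCompactor`, and `RemoteCompactor` are hypothetical names, not the traits defined in horaedb, and the remote call is simulated in-process rather than sent over RPC.

```rust
/// Hypothetical description of one compaction job: the SSTs to merge.
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionTask {
    pub table: String,
    pub input_ssts: Vec<String>,
}

/// Hypothetical outcome of a compaction job.
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionResult {
    pub output_sst: String,
    /// Where the merge ran: "local" or the name of a compaction node.
    pub executed_on: String,
}

/// Assumed abstraction over "something that can run a compaction task".
pub trait Compactor {
    fn compact(&self, task: &CompactionTask) -> Result<CompactionResult, String>;
}

/// Assumed abstraction over the horaemeta API that returns a compaction node.
pub trait NodePicker {
    fn pick_node(&self) -> Option<String>;
}

/// Toy picker that always returns the first configured node, if any.
pub struct StaticPicker {
    pub nodes: Vec<String>,
}

impl NodePicker for StaticPicker {
    fn pick_node(&self) -> Option<String> {
        self.nodes.first().cloned()
    }
}

/// Runs the merge on the current node, competing with query/write for resources.
pub struct LocalCompactor;

impl Compactor for LocalCompactor {
    fn compact(&self, task: &CompactionTask) -> Result<CompactionResult, String> {
        merge(task, "local")
    }
}

/// Hands the task to a node chosen by the picker.
pub struct RemoteCompactor<P: NodePicker> {
    pub picker: P,
}

impl<P: NodePicker> Compactor for RemoteCompactor<P> {
    fn compact(&self, task: &CompactionTask) -> Result<CompactionResult, String> {
        let node = self
            .picker
            .pick_node()
            .ok_or_else(|| "no compaction node available".to_string())?;
        // A real implementation would send the task over the network here;
        // this sketch only records which node would have executed it.
        merge(task, &node)
    }
}

/// Stand-in for the actual SST merge; it only derives an output file name.
fn merge(task: &CompactionTask, executed_on: &str) -> Result<CompactionResult, String> {
    if task.input_ssts.is_empty() {
        return Err("nothing to compact".to_string());
    }
    Ok(CompactionResult {
        output_sst: format!("{}/merged-{}.sst", task.table, task.input_ssts.len()),
        executed_on: executed_on.to_string(),
    })
}
```

With a split like this, the code that schedules compaction only depends on `Compactor`, so switching a horaedb node between local and offloaded compaction would be a matter of which implementation is constructed at startup. Whether the real traits follow this shape is not stated in this issue.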
Additional Context
No response