Tracking issue for compaction offload

Open Rachelint opened this issue 1 year ago • 0 comments

Describe This Problem

We found in production that SST compaction cannot keep up with SST generation, which leads to poor query performance. However, we are unable to give more resources to compaction, because query and write are more important than compaction on the same node. It is really hard to trade off resource allocation among query, write, and compaction in the LSM model. We want to compact the generated small SSTs as fast as possible, but we can't tolerate the impact on query and write. In the end, I think offloading compaction to separate nodes may be the key.

Proposal

For supporting compaction offload, we need:

  • Special node supporting remote compaction service

    • [ ] Impl compaction service
  • Horaedb node supports submitting the real compaction task to a remote node

    • [x] Refactor the compaction process and define necessary traits
    • [ ] Impl remote mode compactor based on traits above
  • Horaemeta supports managing the special compaction nodes

    • [ ] Impl the ability to manage the compaction nodes
    • [ ] Expose the API for the horaedb node to get the proper remote compaction node
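
To make the split above more concrete, here is a minimal sketch of how the pieces could fit together. It is not the actual HoraeDB code: every name in it (`CompactionTask`, `CompactionResult`, `Compactor`, `CompactionNodePicker`, `StaticPicker`, `RemoteCompactor`) is hypothetical, the node lookup stands in for whatever API horaemeta ends up exposing, and the remote call is simulated rather than sent over RPC.

```rust
/// Hypothetical description of one compaction job: the small SSTs to merge.
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionTask {
    pub table_id: u64,
    pub input_ssts: Vec<String>,
}

/// Hypothetical outcome of a compaction job.
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionResult {
    pub output_sst: String,
    /// Where the work ran: "local" or the address of a compaction node.
    pub executed_on: String,
}

/// The abstraction the scheduler depends on, so that local and remote
/// execution are interchangeable (assumed shape, not the real trait).
pub trait Compactor {
    fn compact(&self, task: &CompactionTask) -> Result<CompactionResult, String>;
}

/// Runs the compaction inside the horaedb node itself.
pub struct LocalCompactor;

impl Compactor for LocalCompactor {
    fn compact(&self, task: &CompactionTask) -> Result<CompactionResult, String> {
        if task.input_ssts.is_empty() {
            return Err("no input ssts".to_string());
        }
        Ok(CompactionResult {
            output_sst: format!("table-{}-merged.sst", task.table_id),
            executed_on: "local".to_string(),
        })
    }
}

/// Stand-in for the horaemeta API that returns a proper compaction node.
pub trait CompactionNodePicker {
    fn pick_node(&self) -> Option<String>;
}

/// Trivial picker over a fixed node list, for illustration only.
pub struct StaticPicker {
    pub nodes: Vec<String>,
}

impl CompactionNodePicker for StaticPicker {
    fn pick_node(&self) -> Option<String> {
        self.nodes.first().cloned()
    }
}

/// Submits the task to a remote compaction node chosen by the picker.
pub struct RemoteCompactor<P: CompactionNodePicker> {
    pub picker: P,
}

impl<P: CompactionNodePicker> Compactor for RemoteCompactor<P> {
    fn compact(&self, task: &CompactionTask) -> Result<CompactionResult, String> {
        let node = self
            .picker
            .pick_node()
            .ok_or_else(|| "no compaction node available".to_string())?;
        if task.input_ssts.is_empty() {
            return Err("no input ssts".to_string());
        }
        // A real implementation would send the task to `node` over RPC and
        // wait for the response; this sketch only simulates the result.
        Ok(CompactionResult {
            output_sst: format!("table-{}-merged.sst", task.table_id),
            executed_on: node,
        })
    }
}
```

With a shape like this, the scheduler would only hold something implementing `Compactor`, so switching a node between local and remote mode would not need to touch the scheduling logic.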

Additional Context

No response

Rachelint, Feb 02 '24 06:02