Add a backdoor for running compaction method on demand

Open bosmatt opened this issue 1 year ago • 7 comments

Provide a backdoor for running the compaction method. An external application can then use it to schedule compaction periodically.

bosmatt avatar Oct 19 '23 12:10 bosmatt

Can we add kafka-streams label?

mjsax avatar Oct 19 '23 15:10 mjsax

Added. However, it has other uses outside of Kafka Streams as well.

Guyme avatar Oct 19 '23 15:10 Guyme

@bosmatt - Could you elaborate on the motivation for this feature? There is manual compaction (now both blocking and non-blocking). Is it not sufficient? Why?

udi-speedb avatar Oct 20 '23 04:10 udi-speedb

One of our Kafka Streams users would like to schedule periodic compaction based on calendar dates and times. For example, they want to schedule compactions on weekends and/or during night hours, when their system is under less load. With manual compaction, we could theoretically also implement these calendar-based compaction triggers in Kafka Streams, I guess. However, as @Guyme pointed out, it might also be interesting for other users outside of Kafka Streams.
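To make the calendar-based trigger concrete, here is a minimal Java sketch of what an application-side implementation could look like. Everything in it is an assumption for illustration: the class name `OffPeakCompaction`, the 22:00 to 06:00 night window, the hourly check interval, and the `compactionTask` callback, which stands in for whatever manual compaction call the application would actually make.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Speedb, RocksDB, or Kafka Streams.
final class OffPeakCompaction {
    // Assumed off-peak window: every night from 22:00 to 06:00, plus the whole weekend.
    static final LocalTime NIGHT_START = LocalTime.of(22, 0);
    static final LocalTime NIGHT_END = LocalTime.of(6, 0);

    // Returns true if the given local time falls inside the assumed off-peak window.
    static boolean isOffPeak(LocalDateTime now) {
        DayOfWeek day = now.getDayOfWeek();
        if (day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY) {
            return true;
        }
        LocalTime time = now.toLocalTime();
        // The night window wraps around midnight, so two comparisons are needed.
        return !time.isBefore(NIGHT_START) || time.isBefore(NIGHT_END);
    }

    // Checks once per hour and runs the task whenever the check lands in the window.
    // compactionTask is a placeholder for the application's manual compaction call.
    static ScheduledExecutorService start(Runnable compactionTask) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(() -> {
            try {
                if (isOffPeak(LocalDateTime.now())) {
                    compactionTask.run();
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("Compaction task failed: " + e);
            }
        }, 0, 1, TimeUnit.HOURS);
        return executor;
    }
}
```

Note that this sketch runs the task on every hourly check inside the window; a real implementation would probably want to skip a run if a compaction finished recently or is still in progress.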

cadonna avatar Oct 20 '23 08:10 cadonna

Ah, wait. Reading the title of the issue again, I am not sure we are on the same page. I assumed this feature is about RocksDB/Speedb triggering the compaction based on a calendar. However, the title suggests adding a method for running compactions. If it is about the latter, then -- like @udi-speedb -- I am also wondering whether manual compaction might be sufficient.

cadonna avatar Oct 20 '23 08:10 cadonna

Once there is a backdoor for running compaction, you can run it periodically using external tools/scripts/code. @cadonna This should indeed address your request and help others as well. Please share your thoughts on this solution.

bosmatt avatar Oct 31 '23 13:10 bosmatt

@bosmatt Do I understand you correctly that this backdoor would allow -- for example -- an operating system cron job to trigger the compaction? That would be different from running manual compaction by calling compactRange() from within Kafka Streams.

Originally, I envisioned this feature as a config in Speedb/RocksDB that accepts a schedule (for example, in cron format) and uses it to trigger compactions at the specified times. WDYT?
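As a rough illustration of the cron-format idea, the Java sketch below parses a reduced five-field cron expression and checks whether a given time matches it. It is only a sketch under stated assumptions: the `CronSchedule` class is hypothetical, it supports only `*`, single numbers, and comma-separated lists, it uses 0 to 6 for Sunday to Saturday, and it requires all five fields to match (standard cron treats day-of-month and day-of-week differently when both are restricted).

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not an existing Speedb or RocksDB option.
final class CronSchedule {
    // One entry per field: minute, hour, day of month, month, day of week.
    // An empty set stands for the wildcard "*".
    private final List<Set<Integer>> fields = new ArrayList<>();

    CronSchedule(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length != 5) {
            throw new IllegalArgumentException("Expected 5 fields: " + expression);
        }
        for (String part : parts) {
            Set<Integer> allowed = new HashSet<>();
            if (!part.equals("*")) {
                for (String value : part.split(",")) {
                    allowed.add(Integer.parseInt(value));
                }
            }
            fields.add(allowed);
        }
    }

    // Returns true if every restricted field contains the corresponding value of the given time.
    boolean matches(LocalDateTime time) {
        int[] actual = {
            time.getMinute(),
            time.getHour(),
            time.getDayOfMonth(),
            time.getMonthValue(),
            time.getDayOfWeek().getValue() % 7 // Java: Mon=1..Sun=7; cron: Sun=0..Sat=6
        };
        for (int i = 0; i < actual.length; i++) {
            Set<Integer> allowed = fields.get(i);
            if (!allowed.isEmpty() && !allowed.contains(actual[i])) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, `new CronSchedule("0 2 * * 0,6")` would describe 02:00 on Saturdays and Sundays; a scheduler inside the store could poll `matches(...)` once per minute and start a compaction on a hit.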

cadonna avatar Nov 03 '23 10:11 cadonna