BP-67: Support skipping compaction at busy times
Proposal PR for BP-67. Master Issue: https://github.com/apache/bookkeeper/issues/4382 Implementation PR: https://github.com/apache/bookkeeper/pull/4385
PTAL, thanks. @shoothzj @eolivelli @dlg99 @horizonzy
For example, the cpu.idle will decrease significantly while doing compaction.
I am rather indifferent about the change. There was an attempt in the to implement something similar: #851
It was abandoned, as I remember, because the company where the author worked had chosen to use REST API and cron to run enable/disable compaction.
So you already can disable the compaction on schedule just not via bookie configuration.
I'd support a mechanism that would dynamically pause/resume major/minor compaction if the load on bookie is high (e.g. as tracked by pending requests or request queue latency) if the disk space is enough.
To support dynamically pause/resume major/minor compaction, we need to introduce two interface:
LoadStatCollectorcollect the load stat for other modules.CompactionStrategydecide whether do compaction based on load and any other stats, conf. WDYT @dlg99 @eolivelli @shoothzj
I am rather indifferent about the change. There was an attempt in the to implement something similar: #851 It was abandoned, as I remember, because the company where the author worked had chosen to use REST API and cron to run enable/disable compaction. So you already can disable the compaction on schedule just not via bookie configuration. I'd support a mechanism that would dynamically pause/resume major/minor compaction if the load on bookie is high (e.g. as tracked by pending requests or request queue latency) if the disk space is enough.
To support dynamically pause/resume major/minor compaction, we need to introduce two interface:
LoadStatCollectorcollect the load stat for other modules.CompactionStrategydecide whether do compaction based on load and any other stats, conf. WDYT @dlg99 @eolivelli @shoothzj
@thetumbled suspend and resume compaction based on load will make the system more complex and hard to configure a reasonable threshold. Moreover, if we debug the removed ledger not cleaned up, it's hard to explain why the compaction service was suspended. To reduce the compaction impact in bookie service, we can:
- Introduce accurate throttle for compaction service
- Avoid or suspend the compaction service in high-traffic time
There is one point need to be discussed: we should check it at the beginning of the compaction, but should we check it each time before we try to compact a new entry log? If we stop the compaction at the running time, should we resume this major/minor compaction or just start a new compaction? For example, if the major compaction departs at 0:00 every day and the skip time range is set to 1-3 o'clock, the major will be suspended as soon as it departs, and the major compaction cannot be completed. For users who want to do a complete major compaction, it will become one problem. @hangc0276 @eolivelli @shoothzj @horizonzy