
Auto-Compaction parent tasks not started if any are still running

Open gl opened this issue 1 year ago • 1 comments

Affected Version

Druid 27

Description

We run a very large Druid cluster with 8 auto-compaction tasks ("compact" type) running in parallel to consolidate a Kafka-based datasource. Although the compaction tasks do not all take the same time, we have noticed that they always stay in sync: no new compact task is started while at least one is still running. This is technically OK, but it is a massively inefficient use of CPU resources, because task slots that finish early sit idle until the slowest task completes.

For clarity, each compaction task runs for roughly 4-5 hours, and druid.coordinator.period.indexingPeriod=PT10M.
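
To give a rough sense of the cost, here is a small back-of-the-envelope sketch. The durations are made-up values in the 4-5 hour range, not measurements from our cluster, and `idle_fraction` is a hypothetical helper, not anything from Druid. It only models the behaviour we observe (the next batch waits for the slowest task) and ignores the 10-minute indexing period.

```python
# Hypothetical durations (minutes) for one batch of 8 "compact" tasks.
# Illustrative values only, chosen to fall in the 4-5 hour range.
durations = [240, 250, 255, 260, 270, 280, 290, 300]


def idle_fraction(task_durations):
    """Fraction of slot time left idle when the next batch of tasks
    only starts after the slowest task in the current batch finishes."""
    batch_length = max(task_durations)  # batch ends with the slowest task
    total_slot_time = batch_length * len(task_durations)
    busy_time = sum(task_durations)
    return (total_slot_time - busy_time) / total_slot_time


print(f"idle slot time per batch: {idle_fraction(durations):.1%}")
```

The wider the spread between the fastest and slowest task in a batch, the larger the idle share becomes.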

There does not appear to be any way to tune this, and as far as I can tell the behaviour is undocumented. Since it is not what I would expect, I'm filing it as an issue.


gl avatar Sep 13 '23 09:09 gl