Add a global limit for MSQ controller tasks
Improvements
Implemented a way to limit the number of query controller tasks running at the same time. The limit specifies what percentage or how many of the task slots can be allocated to query controllers. If the limit is reached, tasks wait for resources instead of potentially blocking the execution of other tasks (and failing after a timeout).
Rationale
Druid currently has no mechanism to prevent the cluster from being overloaded with controller tasks. Such an overload can cause a significant slowdown in processing and may lead to temporary deadlock situations.
Introduced new configuration options
- `druid.indexer.queue.controllerTaskSlotRatio` - optional value that defines the proportion of available task slots that can be allocated to MSQ controller tasks. This is a floating-point value between 0 and 1. Defaults to null.
- `druid.indexer.queue.maxControllerTaskSlots` - optional value that specifies the maximum number of task slots that can be allocated to MSQ controller tasks. This is an integer value that provides a hard limit on the number of task slots available for MSQ controller tasks. Defaults to null.
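For illustration only, a minimal sketch of how the two options could combine into a single effective limit. This is an assumption about the semantics described above, not the actual patch; the class and method names are hypothetical.

```java
// Minimal sketch (hypothetical names): derive one effective cap on
// concurrently running MSQ controller tasks from the two optional settings.
public class ControllerTaskLimits
{
  /**
   * @param totalTaskSlots      total task slots available in the cluster
   * @param controllerSlotRatio value of druid.indexer.queue.controllerTaskSlotRatio, or null
   * @param maxControllerSlots  value of druid.indexer.queue.maxControllerTaskSlots, or null
   * @return effective cap on running controller tasks, or Integer.MAX_VALUE when neither is set
   */
  public static int effectiveLimit(
      int totalTaskSlots,
      Double controllerSlotRatio,
      Integer maxControllerSlots
  )
  {
    int limit = Integer.MAX_VALUE;
    if (controllerSlotRatio != null) {
      // Keep at least one slot so controller tasks are never starved entirely.
      limit = Math.max(1, (int) (totalTaskSlots * controllerSlotRatio));
    }
    if (maxControllerSlots != null) {
      limit = Math.min(limit, maxControllerSlots);
    }
    return limit;
  }

  public static void main(String[] args)
  {
    // 40 slots with a 0.25 ratio would allow 10 controller slots; the hard cap of 8 wins.
    System.out.println(effectiveLimit(40, 0.25, 8)); // prints 8
  }
}
```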
This PR has:
- [x] been self-reviewed.
- [x] added documentation for new or modified features or behaviors.
- [x] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
Remove useless tail
Done, thanks
Thanks @kfaraz, @cryptoe for your comments
The most trivial deadlock scenario occurs when we queue a group of controller tasks but have no task slots available for the actual workers. This results in tasks hanging and eventually timing out. Thanks for highlighting the WorkerTaskRunnerConfig; it seems like a great place for this configuration. I've updated the behavior and merged it with parallelIndexTaskSlotRatio, which makes it more flexible, and I've also added logging to inform the user why a task is Pending.
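To make the idea being discussed concrete, here is a rough sketch of that kind of ratio-based gate with logging. Only `parallelIndexTaskSlotRatio` is referenced from the existing config for analogy; every name below is hypothetical and this is not the code in the patch.

```java
// Hypothetical sketch: cap the share of capacity that MSQ controller tasks
// may occupy, analogous to how parallelIndexTaskSlotRatio caps
// parallel-index subtask slots. Not taken from the actual patch.
public class ControllerSlotGate
{
  private final double controllerTaskSlotRatio;

  public ControllerSlotGate(double controllerTaskSlotRatio)
  {
    this.controllerTaskSlotRatio = controllerTaskSlotRatio;
  }

  /**
   * @return true if another controller task may be assigned given the current
   *         capacity, false if it should remain Pending.
   */
  public boolean canAssignController(int totalCapacity, int controllersRunning)
  {
    int allowed = Math.max(1, (int) (totalCapacity * controllerTaskSlotRatio));
    if (controllersRunning >= allowed) {
      // Mirrors the intent of the added logging: explain why the task stays
      // Pending instead of letting it wait and time out silently.
      System.out.printf(
          "Controller task kept Pending: %d of %d controller slots in use%n",
          controllersRunning, allowed
      );
      return false;
    }
    return true;
  }
}
```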
@cryptoe , do you have any further comments on this PR?
@nozjkoitop, sorry, I had missed these null checks in the earlier review. Could you please include these? I am unable to commit the suggestions myself.
Done, thanks
@nozjkoitop, @cryptoe has expressed some concerns around having a per-worker config.
He makes some good points here: https://apachedruidworkspace.slack.com/archives/C030CMF6B70/p1727836829640959
I'm afraid I don't have access to this Slack workspace
You can try this link https://druid.apache.org/community/join-slack from the Druid docs.
Hey @cryptoe, what do you think about this solution? I've used SelectWorkerStrategies to utilize dynamic configuration of global limits, and it seems like a good option to me.
Apologies, I have been meaning to get to this PR. Will try to finish it by EOW.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
Hi @cryptoe, will you have some time to review these changes?
@nozjkoitop Could you please rebase this PR? Will take a look again. Overall LGTM. cc @kfaraz
@cryptoe Done
Thanks for your work on this, @nozjkoitop ! I will try to finish the review of this PR later this week.
Thanks for the changes. Let's wait for @kfaraz's review and then we can get this merged.