Feature Request - Batch Deployment for Historical Nodes
Problem Statement
In many Apache Druid deployments, segment replicas are distributed across Availability Zones (AZs), with each AZ containing its own historical tier. In large Druid clusters, each tier contains many historicals, so every upgrade via a rolling restart takes hours to complete. Because replicas exist in other tiers, more than one historical can potentially be taken down and rolled out at the same time within a single historical tier.
Solution Overview
We have implemented a custom batch deployment feature for the Druid Operator that lets users specify how many historicals can be taken down simultaneously during rolling updates, significantly reducing deployment time while maintaining data availability.
Key Features
- Configurable Batch Sizes
  - Specify the percentage of pods to delete in parallel per historical tier
  - Percentage-based calculation scales automatically with cluster size
- Health-Aware Operations
  - Validates that other historical tiers are healthy before proceeding, preventing datasource unavailability
  - Optional health check bypass for urgent deployments
- Persistent State Management
  - Tracks operations across reconciliation cycles
  - Prevents conflicting batch operations on the same StatefulSet
- Safe Deletion Strategy
  - Deletes highest-ordinal pods first (StatefulSet best practice)
  - Waits for pod recreation and readiness before continuing
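To make the batch-size and deletion-order features concrete, here is a minimal Go sketch. It is not the actual operator code: the function names `batchSize` and `nextBatch`, the round-up behavior, and the StatefulSet name `druid-historicals` are assumptions for illustration. The only thing taken from Kubernetes itself is the StatefulSet pod naming convention `<statefulset-name>-<ordinal>`. Health checks and readiness waits are not modeled here.

```go
package main

import "fmt"

// batchSize converts a percentage of a tier's replicas into a pod count.
// Assumption: round up and clamp to [1, replicas] so every batch makes progress.
func batchSize(replicas, percent int) int {
	if replicas <= 0 {
		return 0
	}
	n := (replicas*percent + 99) / 100 // integer ceiling of replicas*percent/100
	if n < 1 {
		n = 1
	}
	if n > replicas {
		n = replicas
	}
	return n
}

// nextBatch returns the pod names to delete next, highest ordinal first.
// `remaining` is the number of pods not yet rolled (ordinals 0..remaining-1).
func nextBatch(stsName string, remaining, size int) []string {
	if size <= 0 || remaining <= 0 {
		return nil
	}
	if size > remaining {
		size = remaining
	}
	names := make([]string, 0, size)
	for ord := remaining - 1; ord >= remaining-size; ord-- {
		names = append(names, fmt.Sprintf("%s-%d", stsName, ord))
	}
	return names
}

func main() {
	replicas, percent := 20, 25 // hypothetical tier size and batch percentage
	size := batchSize(replicas, percent)
	for remaining := replicas; remaining > 0; remaining -= size {
		// A real operator would first check the other tiers' health, then
		// wait for the deleted pods to be recreated and become ready.
		fmt.Println(nextBatch("druid-historicals", remaining, size))
	}
}
```

With these hypothetical values, each round deletes five pods, starting from ordinal 19 and working down to ordinal 0.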
If this is something the community is interested in, we can start the conversation with a PR.
I had previously solved a similar issue by adding tiers to historicals. As of now, the operator is not aware of tiers; it is only aware of nodeTypes. If we add support for tiers, users can roll out individual tiers. Do you have something similar in mind?
@aruraghuwanshi Submitting a PR to discuss the details would be beneficial. Any changes can be ported over when the new repo is ported as part of https://github.com/apache/druid/issues/18582.
> I had previously solved a similar issue by adding tiers to historicals. As of now, the operator is not aware of tiers; it is only aware of nodeTypes. If we add support for tiers, users can roll out individual tiers. Do you have something similar in mind?
Yeah, we also solved the tier-by-tier rollout in our version of the operator. What this feature request proposes is a little different from the per-tier rollout.
Even if we have multiple tiers, let's say tier1, tier2, and tier3, each holding one segment replica, if each tier contains a large number of historicals, the total deployment time increases significantly when the rollout has to proceed one pod at a time. What the batch deploy logic essentially does is push the update to a batch of historicals at a time instead of waiting for Kubernetes to roll out one pod at a time.
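To put rough numbers on this, here is a small Go sketch that counts how many sequential restart rounds a tier needs for a given batch size. The helper `rolloutRounds` and the tier size of 100 are hypothetical, and the sketch assumes each round takes about the same time whether it restarts one pod or a whole batch.

```go
package main

import "fmt"

// rolloutRounds returns how many sequential restart rounds are needed to
// roll `pods` historicals when `batch` pods are restarted per round.
func rolloutRounds(pods, batch int) int {
	if pods <= 0 {
		return 0
	}
	if batch < 1 {
		batch = 1 // one pod at a time, like a default rolling update
	}
	return (pods + batch - 1) / batch // integer ceiling of pods/batch
}

func main() {
	pods := 100 // hypothetical number of historicals in one tier
	for _, batch := range []int{1, 10, 25} {
		fmt.Printf("batch=%d rounds=%d\n", batch, rolloutRounds(pods, batch))
	}
}
```

For 100 historicals, this gives 100 rounds at one pod per round, 10 rounds at a batch of 10, and 4 rounds at a batch of 25.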
Although, I agree that the logic for tier-based rollout should probably be merged first, before this PR is opened. That could be a separate PR that comes before this one. Let me know your thoughts on this.