sentry
Reintroduce split queues to reduce load on a single Celery queue
https://github.com/getsentry/sentry/pull/76410 introduces a way to deliver messages to a set of split queues rather than relying on a single queue for post process.
This was used to address an incident in which we were saturating RabbitMQ resources for a single queue (a queue is single threaded). Splitting messages across multiple queues solves that problem.
This PR reintroduces split queue support for tasks where we set the queue in the task definition.
It introduces a router, `SplitQueueTaskRouter`. This router is used directly by Celery and maps a task name to a queue. The routing policy is the same as the one employed by `SplitQueueRouter`. The list of split queues is defined in the `CELERY_SPLIT_QUEUE_TASK_ROUTES` setting.
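To illustrate the idea of a router that maps a task name to a queue, here is a minimal sketch. It is not the `SplitQueueTaskRouter` from this PR: the class name, the shape of the routes mapping, the task and queue names, and the round-robin policy are all assumptions made for the example. The only Celery-specific convention it relies on is that a router object exposes `route_for_task` and returns either a dict with a `queue` key or `None`.

```python
import random
from itertools import cycle
from typing import Any, Callable, Dict, Mapping, Optional


class SketchSplitQueueTaskRouter:
    """Illustrative stand-in only; names and config shape are hypothetical."""

    def __init__(
        self,
        routes: Mapping[str, Mapping[str, Any]],
        rollout_rates: Mapping[str, float],
        rng: Callable[[], float] = random.random,
    ) -> None:
        # Hypothetical config shape: task name -> default queue + split queues.
        self._default = {task: cfg["default_queue"] for task, cfg in routes.items()}
        # Assumed policy: round robin across the split queues of each task.
        self._cycles = {
            task: cycle(cfg["split_queues"])
            for task, cfg in routes.items()
            if cfg.get("split_queues")
        }
        self._rollout_rates = rollout_rates
        self._rng = rng

    def route_for_task(
        self, task: str, args: Any = None, kwargs: Any = None
    ) -> Optional[Dict[str, str]]:
        if task not in self._default:
            # Unknown task: return None so other routing rules apply.
            return None
        rate = self._rollout_rates.get(task, 0.0)
        queues = self._cycles.get(task)
        if queues is None or self._rng() >= rate:
            # Empty config or not rolled out: stay on the default queue.
            return {"queue": self._default[task]}
        return {"queue": next(queues)}


# Hypothetical task and queue names, for illustration only.
routes = {
    "example.save_event_transactions": {
        "default_queue": "events",
        "split_queues": ["events_1", "events_2"],
    }
}
router = SketchSplitQueueTaskRouter(routes, {"example.save_event_transactions": 1.0})
```

With a rollout rate of 0.0 (or an empty list of split queues), this sketch always returns the default queue, which matches the "noop until rolled out" behavior described in the rollout procedure.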
Rollout procedure.
Phase 1: Verify `SplitQueueTaskRouter` does not break anything.
- Set up the router in S4S via the prod configuration
- If everything is fine turn it on everywhere. There will be no configuration so this should be a noop.
Phase 2 (s4s first): Set up split queues for save_event_transactions in prod.
- In one PR: remove the queue name from the task and add the task to `CELERY_SPLIT_QUEUE_TASK_ROUTES` with an empty config. This cannot be feature flagged because the removal of the queue from the task is a code change. The router would be mostly a noop, as the rollout of the split queue is done via an option.
- Create the queues for the task in `CELERY_SPLIT_QUEUE_TASK_ROUTES` in the affected environments.
- Change the config of the workers in prod to listen to all queues.
- Roll out the task via the `celery_split_queue_task_rollout` option. Only at this point does the producer start routing messages to the split queues.
Rollout procedure for a new split queue:
- Sentry PR: Declare the split queues, add the queues in the setting together with the default queue (non split), remove the queue name from the task definition. This is a noop change.
- Change to the production configuration to make the workers consume the split queues as well.
- Option automator PR: roll out the router on the split queue.
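The ordering of these steps matters because workers must already consume the split queues before the producer sends anything to them. The small simulation below shows why the first two steps move no traffic and only the final step does. It is a sketch under stated assumptions: the function, the queue names, and the "rollout rate as a probability" model are hypothetical and are not taken from the Sentry code.

```python
import random
from collections import Counter
from itertools import cycle
from typing import Sequence


def simulate(
    default_queue: str,
    split_queues: Sequence[str],
    rollout_rate: float,
    messages: int = 1000,
    seed: int = 0,
) -> Counter:
    """Count how many messages land on each queue at a given rollout rate."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    next_split = cycle(split_queues)  # assumed round-robin policy
    counts: Counter = Counter()
    for _ in range(messages):
        if split_queues and rng.random() < rollout_rate:
            counts[next(next_split)] += 1
        else:
            counts[default_queue] += 1
    return counts


# Steps 1 and 2: queues are declared and consumed, rollout is still 0.0,
# so every message stays on the default queue.
before = simulate("events", ["events_1", "events_2"], rollout_rate=0.0)

# Final step: the option raises the rollout to 1.0, so the producer
# spreads every message across the split queues.
after = simulate("events", ["events_1", "events_2"], rollout_rate=1.0)
```

Intermediate rates send roughly that fraction of messages to the split queues, which is what allows a gradual rollout and a quick rollback by lowering the option.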