
Add number of messages in queues in system status page

Open chaws opened this issue 3 years ago • 0 comments

This will be useful for monitoring squad's health.

Here are the steps for doing so:

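  1. RabbitMQ's host:
    • Open ports 15672 and 15671 (the management plugin's default HTTP and HTTPS ports) only for squad-frontend
    • Enable the management plugin: sudo rabbitmq-plugins enable rabbitmq_management
  2. squad-frontend:
    • GET the queue from RabbitMQ's host: c = requests.get('http://<private-ip>:15672/api/queues/%2f/ci_fetch/', auth=HTTPBasicAuth('guest', 'guest'))
    • Get the number of messages (check that the keys exist, they might not if the queue is empty): c.json()['message_stats']['publish']. Note that message_stats.publish is a running count of messages published; the current queue size is in c.json()['messages'].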
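The frontend step above could be sketched roughly as follows. This is only a sketch, not SQUAD code: the helper names (queue_url, queue_depth, published_total, fetch_queue) are made up for illustration, and the guest/guest credentials are carried over from the example above. The `messages` and `message_stats` keys come from the RabbitMQ management API.

```python
from urllib.parse import quote


def queue_url(host: str, vhost: str, queue: str, port: int = 15672) -> str:
    """Build the management API URL for one queue.

    The default vhost "/" has to be percent-encoded (%2F) in the path.
    """
    return f"http://{host}:{port}/api/queues/{quote(vhost, safe='')}/{quote(queue, safe='')}"


def queue_depth(queue_info: dict) -> int:
    """Messages currently sitting in the queue; 0 if the key is missing."""
    return int(queue_info.get("messages", 0))


def published_total(queue_info: dict) -> int:
    """Running count of published messages; message_stats may be absent."""
    return int(queue_info.get("message_stats", {}).get("publish", 0))


def fetch_queue(host: str, queue: str, vhost: str = "/",
                user: str = "guest", password: str = "guest") -> dict:
    """Fetch one queue's JSON description (hypothetical helper).

    Assumption: the credentials are placeholders. RabbitMQ restricts the
    guest user to localhost by default, so a dedicated user is likely needed.
    """
    import requests  # third-party; imported here so the helpers above work without it

    response = requests.get(queue_url(host, vhost, queue),
                            auth=(user, password), timeout=10)
    response.raise_for_status()
    return response.json()
```

Using .get() with defaults covers the "keys might not exist" case without raising KeyError.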

The management API can also list all the queues (GET /api/queues, or /api/queues/%2f for the default vhost), so we don't need to hard-code queue names.
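A rough sketch of reading every queue at once, assuming the list endpoint returns a JSON array of queue objects with `name` and `messages` keys (which is what the management API documents). The function names are hypothetical and the credentials are placeholders.

```python
def depths_by_queue(queues: list) -> dict:
    """Map each queue name to its current message count.

    `queues` is the parsed JSON list from GET /api/queues/<vhost>.
    A missing "messages" key is treated as 0.
    """
    return {q["name"]: int(q.get("messages", 0)) for q in queues}


def fetch_all_queues(host: str, user: str = "guest", password: str = "guest") -> list:
    """Fetch all queues of the default vhost (hypothetical helper)."""
    import requests  # third-party; imported lazily so depths_by_queue has no dependency

    response = requests.get(f"http://{host}:15672/api/queues/%2f",
                            auth=(user, password), timeout=10)
    response.raise_for_status()
    return response.json()
```

For example, depths_by_queue(fetch_all_queues('<private-ip>')) would give one dict that the status page can render directly.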

Then add a Grafana dashboard that reads that endpoint, and create an alert if a queue grows too much (probably due to a worker outage):

  • ci_fetch shouldn't have more than 10k tasks
  • celery shouldn't have more than 300k tasks
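The limits above could also be checked on the SQUAD side before anything reaches Grafana. A minimal sketch, assuming a dict of queue depths is already available; QUEUE_LIMITS and queues_over_limit are illustrative names, not existing SQUAD code.

```python
# Limits taken from the list above; queues without an entry are not checked.
QUEUE_LIMITS = {
    "ci_fetch": 10_000,
    "celery": 300_000,
}


def queues_over_limit(depths: dict, limits: dict = QUEUE_LIMITS) -> dict:
    """Return {queue_name: depth} for every queue holding more messages than its limit."""
    return {
        name: depth
        for name, depth in depths.items()
        if name in limits and depth > limits[name]
    }
```

An empty result means every monitored queue is within its limit; anything else is a candidate for an alert.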

chaws avatar Jun 09 '21 18:06 chaws