Add RabbitMQ Monitoring Support to Uptime Kuma
📑 I have found these related issues/pull requests
I could not find any open feature requests for this.
🏷️ Feature Request Type
API / automation options, New monitor
🔖 Feature description
Feature Description:
Add native support for RabbitMQ monitoring in Uptime Kuma. This feature will enable users to monitor the health and performance of RabbitMQ message brokers by integrating with RabbitMQ’s HTTP API. Key functionalities should include:
- Health Checks: Verify operational status and connectivity of RabbitMQ nodes.
- Metrics Collection: Track essential metrics such as queue lengths, message rates, consumer counts, and resource usage.
- Alert Configuration: Allow users to set and customize thresholds for alerts based on collected metrics.
This addition will help users maintain the health of their RabbitMQ services, detect potential issues proactively, and customize monitoring to fit their specific needs.
✔️ Solution
Implement a RabbitMQ monitoring module in Uptime Kuma that integrates with RabbitMQ’s HTTP API. The module should include:
- Health Checks: Ensure RabbitMQ nodes are operational and responsive.
- Metrics Collection: Retrieve and display key metrics such as queue lengths, message rates, consumer counts, and resource utilization.
- Alert Configuration: Enable users to set and customize alert thresholds for the collected metrics.
This solution ought to provide users with better insights into their RabbitMQ instances and help them address issues before they impact system performance.
❓ Alternatives
I have previously written custom health checks in Go that we use to check cluster node health. My approach was an automated one, to reduce the configuration burden.
It consists of the following steps:
- Query the overview API endpoint (/api/overview).
  - The /api/overview endpoint returns a large object. I filter it down to the { "contexts": [{...}] } array at the root, grab the contexts[x].node string from each entry, and use it in the next step.
- For each node listed in the response JSON body:
  - Query "/api/healthchecks/node/" + contexts[x].node, which is essentially the [email protected] value.
  - If the response is 200 (i.e. the node responded), check the JSON body; otherwise the node is unhealthy.
  - If the response body is { "status": "ok" }, all is well; otherwise the node is unhealthy.
I then check what percentage of nodes is online, and from that my little dummy check sets the service to unhealthy, degraded, or healthy.
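The percentage step could look roughly like the sketch below. The issue does not state the cut-off values, so `degradedBelow` and `unhealthyBelow` are hypothetical parameters, as are the `Status` type and `Classify` function names. Treating an empty node list as unhealthy is also an assumption.

```go
package main

// Status is the overall verdict for the cluster.
type Status string

const (
	Healthy   Status = "healthy"
	Degraded  Status = "degraded"
	Unhealthy Status = "unhealthy"
)

// Classify maps the share of online nodes to a Status.
// degradedBelow and unhealthyBelow are hypothetical thresholds expressed as
// fractions in [0, 1]; the issue does not specify the actual values.
func Classify(online, total int, degradedBelow, unhealthyBelow float64) Status {
	if total <= 0 {
		// Assumption: no known nodes is treated as unhealthy.
		return Unhealthy
	}
	ratio := float64(online) / float64(total)
	switch {
	case ratio < unhealthyBelow:
		return Unhealthy
	case ratio < degradedBelow:
		return Degraded
	default:
		return Healthy
	}
}
```

With `degradedBelow` set to 1.0 and `unhealthyBelow` set to 0.5, a three-node cluster would be healthy with all nodes up, degraded with two, and unhealthy with one or none.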
A final alternative proposal
Alternatively, since these are just JSON API requests, perhaps add the ability to write some low-level JavaScript (or similar) to evaluate responses? For example, given a response, pick a property in the JSON body to iterate over, make another call for each item, and then evaluate that response either by checking a JSON property against a value or by checking the HTTP status.
📝 Additional Context
Some useful context: I'd say the important parts are the actual APIs.
With RabbitMQ, authenticating as a user via Basic Auth is sufficient. I think that's a good place to start.
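As a rough illustration of the Basic Auth part, here is a minimal Go sketch using the standard library. The helper name `newManagementRequest` and its parameters are hypothetical; the base URL and credentials would come from the monitor configuration.

```go
package main

import (
	"net/http"
	"strings"
)

// newManagementRequest builds a GET request against the RabbitMQ HTTP API
// with Basic Auth credentials attached.
// baseURL, user, and pass are placeholders supplied by the caller.
func newManagementRequest(baseURL, path, user, pass string) (*http.Request, error) {
	// Avoid a double slash when baseURL ends with "/".
	url := strings.TrimRight(baseURL, "/") + path
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	// Adds the "Authorization: Basic ..." header.
	req.SetBasicAuth(user, pass)
	return req, nil
}
```

The resulting request can be passed to any `http.Client`, for example `http.DefaultClient.Do(req)`.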
/api/overview
{
  "contexts": [
    {
      "ssl_opts": { "keyfile": "", "cacertfile": "", "certfile": "" },
      "node": "[email protected]",
      "description": "RabbitMQ Management",
      "path": "/",
      "cowboy_opts": "[{sendfile,false}]",
      "port": "15671",
      "ssl": "true"
    },
    {
      "ssl_opts": { "keyfile": "", "cacertfile": "", "certfile": "" },
      "node": "[email protected]",
      "description": "RabbitMQ Management",
      "path": "/",
      "cowboy_opts": "[{sendfile,false}]",
      "port": "15671",
      "ssl": "true"
    },
    {
      "ssl_opts": { "keyfile": "", "cacertfile": "", "certfile": "" },
      "node": "[email protected]",
      "description": "RabbitMQ Management",
      "path": "/",
      "cowboy_opts": "[{sendfile,false}]",
      "port": "15671",
      "ssl": "true"
    }
  ]
}
There are also other potentially useful stats in here, but the important part for the health check itself, to see whether nodes are up or down, is the contexts array.
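Pulling the node names out of that array might be sketched in Go as follows. The `overview` struct models only the `contexts[].node` field shown in the sample and ignores everything else. The `nodeNames` helper is hypothetical, and skipping empty and duplicate names is an assumption, not something the issue specifies.

```go
package main

import "encoding/json"

// overview models only the part of the /api/overview response used here.
type overview struct {
	Contexts []struct {
		Node string `json:"node"`
	} `json:"contexts"`
}

// nodeNames returns the distinct, non-empty node names found in an
// /api/overview body, in first-seen order.
// Deduplication is an assumption made for this sketch.
func nodeNames(body []byte) ([]string, error) {
	var ov overview
	if err := json.Unmarshal(body, &ov); err != nil {
		return nil, err
	}
	seen := make(map[string]bool)
	var names []string
	for _, c := range ov.Contexts {
		if c.Node == "" || seen[c.Node] {
			continue
		}
		seen[c.Node] = true
		names = append(names, c.Node)
	}
	return names, nil
}
```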
/api/healthchecks/node/[email protected]
{ "status": "ok" }
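The per-node rule described earlier (HTTP 200 and a body of `{ "status": "ok" }`, anything else unhealthy) can be expressed as a small pure function. This is only a sketch: `nodeHealthy` is a hypothetical name, and treating an unparseable body as unhealthy is an assumption.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// nodeHealthy applies the rule from the issue: a node counts as healthy only
// if the health check returned HTTP 200 and the JSON body has "status": "ok".
func nodeHealthy(statusCode int, body []byte) bool {
	if statusCode != http.StatusOK {
		return false
	}
	var resp struct {
		Status string `json:"status"`
	}
	// Assumption: a body that is not valid JSON is treated as unhealthy.
	if err := json.Unmarshal(body, &resp); err != nil {
		return false
	}
	return resp.Status == "ok"
}
```

Keeping the decision separate from the HTTP call makes it easy to test without a running broker.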