consul-alerts
Alerts on nodes (serf health check) outage
Is there an option to alert when one of the nodes (serf check) is down? This would be very useful for knowing when one of the masters is down: if I have 3 servers and one is down, the next server to fail will take my cluster down.
Are node failures (serf health check) part of the alerts?
Thanks D.
+1
I will also add that it would be nice to know which scenarios trigger alerts. I noticed that if I take one agent and stop the Consul service on it (as a test), no alert is triggered.
I am not seeing that in my test. I stop Consul and get an alert for serfHealth. If the check is not configured correctly in Consul, then consul-alerts will not trigger an alert.
@fusiondog I didn't manually configure any check like serfHealth, but when I log in to the Consul UI I can see the serfHealth check active for each instance. What else should I check?
You might check if somehow it is in maint mode: https://www.consul.io/docs/commands/maint.html
Are there any errors in the consul-alerts output?
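One way to confirm what Consul itself reports for serfHealth, independently of consul-alerts, is to query the agent's health endpoint. The following is only a rough sketch: it assumes an agent listening on the default `http://127.0.0.1:8500` without ACLs, and the helper names `fetch_checks` and `serf_health_by_node` are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_checks(base_url="http://127.0.0.1:8500", state="any"):
    """Fetch checks from Consul's /v1/health/state/<state> endpoint.

    base_url is an assumption (the default local agent address).
    """
    with urlopen(f"{base_url}/v1/health/state/{state}", timeout=5) as resp:
        return json.load(resp)


def serf_health_by_node(checks):
    """Map each node name to the status of its serfHealth check."""
    return {
        check["Node"]: check["Status"]
        for check in checks
        if check.get("CheckID") == "serfHealth"
    }


# Hand-written sample in the shape the endpoint returns (not real output).
sample = [
    {"Node": "server-1", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "server-2", "CheckID": "serfHealth", "Status": "critical"},
    {"Node": "server-2", "CheckID": "service:web", "Status": "critical"},
]
statuses = serf_health_by_node(sample)
failing = sorted(node for node, status in statuses.items() if status == "critical")
print(failing)
```

If a stopped agent's node never shows up here as `critical`, the problem is on the Consul side rather than in consul-alerts.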
I was just testing with maint mode. I would like consul-alerts to show that maintenance mode is enabled instead of reporting the check as failed. I have just started looking at it to see how much work that is.
@mcjffld did you reach a conclusion? It would be really clever not to alert when the service is in maintenance mode.
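For anyone who wants to experiment with this idea outside consul-alerts, here is a minimal sketch of filtering out checks that belong to something in maintenance mode. It is not how consul-alerts works. It assumes that Consul marks maintenance with a critical check whose ID is `_node_maintenance` (node level) or starts with `_service_maintenance:` (service level); please verify that against your Consul version. The function name `alertable_checks` is hypothetical.

```python
# Assumed check IDs that Consul registers when maintenance mode is enabled.
NODE_MAINT_CHECK_ID = "_node_maintenance"
SERVICE_MAINT_PREFIX = "_service_maintenance:"


def alertable_checks(checks):
    """Return critical checks that are not covered by maintenance mode."""
    nodes_in_maint = {
        c["Node"] for c in checks if c["CheckID"] == NODE_MAINT_CHECK_ID
    }
    services_in_maint = {
        (c["Node"], c["CheckID"][len(SERVICE_MAINT_PREFIX):])
        for c in checks
        if c["CheckID"].startswith(SERVICE_MAINT_PREFIX)
    }
    result = []
    for c in checks:
        if c["Status"] != "critical":
            continue
        # Skip the maintenance marker checks themselves.
        if c["CheckID"] == NODE_MAINT_CHECK_ID or c["CheckID"].startswith(
            SERVICE_MAINT_PREFIX
        ):
            continue
        # Skip everything on a node that is in maintenance.
        if c["Node"] in nodes_in_maint:
            continue
        # Skip checks of a service that is in maintenance on that node.
        if (c["Node"], c.get("ServiceID", "")) in services_in_maint:
            continue
        result.append(c)
    return result


# Hand-written sample data (not real Consul output).
sample = [
    {"Node": "a", "CheckID": "_node_maintenance", "Status": "critical", "ServiceID": ""},
    {"Node": "a", "CheckID": "serfHealth", "Status": "critical", "ServiceID": ""},
    {"Node": "b", "CheckID": "_service_maintenance:web", "Status": "critical", "ServiceID": ""},
    {"Node": "b", "CheckID": "service:web", "Status": "critical", "ServiceID": "web"},
    {"Node": "b", "CheckID": "service:db", "Status": "critical", "ServiceID": "db"},
    {"Node": "c", "CheckID": "serfHealth", "Status": "passing", "ServiceID": ""},
]
remaining = [(c["Node"], c["CheckID"]) for c in alertable_checks(sample)]
print(remaining)
```

In this sample only the `db` check on node `b` would still raise an alert, because node `a` and the `web` service on `b` are both in maintenance.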