consul-alerts

Alerts on nodes (serf health check) outage

Open panda87 opened this issue 8 years ago • 6 comments

Is there an option to alert when one of the nodes (serf check) is down? This would be very useful for knowing when one of the masters is down: if I have 3 servers and one is already down, the next server to fail will take my cluster down.

Is the node-level serf health check part of the alerts?

Thanks D.

panda87 avatar Mar 15 '16 07:03 panda87

+1

I will also add that it would be nice to know which scenarios trigger alerts. I noticed that if I take one agent and stop the Consul service on it (as a test), no alert is triggered.

nhproject avatar Mar 15 '16 07:03 nhproject

I am not seeing that in my test. I stop consul and get an alert for serfHealth. If the check is not configured correctly in consul, then consul-alerts will not trigger an alert.
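
To confirm what Consul itself reports for a stopped agent, independently of consul-alerts, the critical checks can be read from Consul's HTTP API. Below is a minimal sketch, assuming a local agent at `http://127.0.0.1:8500` with no ACL token; the function names `failed_serf_nodes` and `fetch_critical_checks` are made up for illustration and are not part of consul-alerts.

```python
import json
from urllib.request import urlopen


def failed_serf_nodes(checks):
    """Return the sorted names of nodes whose serfHealth check is critical.

    `checks` is a list of check objects shaped like the JSON returned by
    Consul's /v1/health/state/critical endpoint.
    """
    return sorted({
        c["Node"]
        for c in checks
        if c.get("CheckID") == "serfHealth" and c.get("Status") == "critical"
    })


def fetch_critical_checks(base_url="http://127.0.0.1:8500"):
    """Fetch all critical checks from a Consul agent (address is an assumption)."""
    with urlopen(base_url + "/v1/health/state/critical") as resp:
        return json.load(resp)
```

If `failed_serf_nodes(fetch_critical_checks())` lists the node you stopped, Consul has marked serfHealth as critical and the missing alert is more likely a consul-alerts configuration issue.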

fusiondog avatar Mar 17 '16 15:03 fusiondog

@fusiondog I didn't manually configure any check like serfHealth, but when I log in to the Consul UI I can see the serfHealth check active for each instance. What else should I check?

panda87 avatar Mar 17 '16 15:03 panda87

You might check if somehow it is in maint mode: https://www.consul.io/docs/commands/maint.html

Are there any errors in the consul-alerts output?

fusiondog avatar Mar 22 '16 00:03 fusiondog

I was just testing with maint mode. I would like it to show that maintenance mode is enabled instead of showing the check as failed. I just started looking at it to see how much work that is.
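
One way to tell the two cases apart: when maintenance mode is enabled, Consul registers a critical check with ID `_node_maintenance` (for a node) or `_service_maintenance:<service id>` (for a service). The sketch below separates those from real failures. It is only an illustration of the idea, not how consul-alerts handles it, and `split_by_maintenance` is a hypothetical helper name.

```python
NODE_MAINT = "_node_maintenance"
SERVICE_MAINT_PREFIX = "_service_maintenance:"


def is_maintenance_check(check):
    """True if the check is one Consul registers for maintenance mode."""
    check_id = check.get("CheckID", "")
    return check_id == NODE_MAINT or check_id.startswith(SERVICE_MAINT_PREFIX)


def split_by_maintenance(checks):
    """Split critical checks into (failures, maintenance).

    Any check on a node that is in node-level maintenance is treated as
    maintenance too, so it can be reported differently or suppressed.
    """
    maint_nodes = {c["Node"] for c in checks if c.get("CheckID") == NODE_MAINT}
    failures, maintenance = [], []
    for c in checks:
        if is_maintenance_check(c) or c["Node"] in maint_nodes:
            maintenance.append(c)
        else:
            failures.append(c)
    return failures, maintenance
```

The input is expected to have the same shape as the output of `/v1/health/state/critical`; only the `Node` and `CheckID` fields are used.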

mcjffld avatar Nov 12 '17 15:11 mcjffld

@mcjffld did you reach a conclusion? It would be really clever not to alert when the service is in maintenance mode.

andrecp avatar Jan 04 '18 17:01 andrecp