Add an alertmanager_notifications_success_total metric
What did you do?
I would like to create an availability SLO using (good events)/(total events), and currently we don't have a proper way to tell which notifications were sent successfully.
With that in mind, it would be nice to have an alertmanager_notifications_success_total metric, since Alertmanager already has alertmanager_notifications_total and alertmanager_notifications_failed_total.
What did you expect to see?
A way to identify notifications that were sent successfully.
What did you see instead? Under which circumstances?
N/A
Environment
- System information: N/A
- Alertmanager version: 0.24.0
- Prometheus version: N/A
- Alertmanager configuration file: N/A
- Prometheus configuration file: N/A
- Logs: N/A
Typically, we don't provide success metrics because you can compute them yourself.
You can compute successes by subtracting failures from the total, e.g. alertmanager_notifications_total - alertmanager_notifications_failed_total.
The query you're looking for is probably:
(alertmanager_notifications_total - alertmanager_notifications_failed_total) / alertmanager_notifications_total
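To make the arithmetic concrete, here is a rough Python sketch of the same ratio computed over a time window, which is usually what an SLO needs. The function names and the sample counter values are hypothetical and only for illustration; nothing here is part of Alertmanager, and the sketch assumes the counters did not reset during the window.

```python
from typing import Optional


def window_delta(start: float, end: float) -> float:
    """Increase of a counter between two samples, assuming no counter reset."""
    if end < start:
        raise ValueError("counter decreased; resets are not handled in this sketch")
    return end - start


def success_ratio(total: float, failed: float) -> Optional[float]:
    """Return (total - failed) / total, or None when nothing was sent."""
    if total < 0 or failed < 0 or failed > total:
        raise ValueError("expected 0 <= failed <= total")
    if total == 0:
        return None  # no notifications in the window, so the ratio is undefined
    return (total - failed) / total


if __name__ == "__main__":
    # Hypothetical samples of the two counters at the start and end of a window.
    total = window_delta(1000, 1200)  # notifications attempted in the window
    failed = window_delta(10, 14)     # notifications that failed in the window
    print(success_ratio(total, failed))
```

Returning None for an empty window is a design choice in this sketch: it keeps "no notifications sent" distinct from "all notifications failed", which would otherwise both look like a bad SLO reading.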
I think we can close this @roidelapluie and @simonpasquier.
OK! That makes a lot of sense, thanks :D