alertmanager
Be able to silence based on generatorURL
The primary purpose of generatorURL is to be able to silence on it. This is important when you have a broken or inaccessible Prometheus server that continues to send incorrect alerts. Silencing on generatorURL should be an additional option for silences, potentially hidden in an advanced section.
+1 I needed exactly this the other day.
Not objecting to being able to silence on it, but to the statement that that is the primary purpose of the generatorURL :) My intent when sending this field to the Alertmanager was to have a nice link back from alerts/notifications to the Prometheus server graphing interface, so that one can easily play with the alerting expression.
The silence is the primary purpose, and quite some time was spent arguing that it should be a separate field rather than a label or annotation due to the required semantics of the silence and de-duplication.
Technically it should be two fields. However, with the silence key and the link combined into one, it's much less likely that other Alertmanager client implementations would skip over it, not realising why it's so important.
Since generatorURL is outside of the labels (which are what is currently used for matching), do we want to expand this feature request to cover silencing based on any part of an alert?
I can see that receivers might be useful to silence against, and maybe the fingerprint(?), but to be honest I'm struggling to come up with a silence use case for any of the other fields.
Also, in my experience I have never hit a case where silencing by generatorURL would be useful. Usually I will have labels that convey enough of the same information that it's not needed.
I don't see a need to silence on anything else.
> Usually I will have labels that convey enough of the same information that it's not needed.
The whole point is for cases where labels aren't sufficient, such as an HA pair whose members produce identical alert labels while one of them is dodgy.
I have a similar requirement: I want to inhibit notifications from the second replica of Prometheus. Any update on this issue?
I recently had a case where silencing based on the generatorURL (or anything related to the peer) would have made a big difference. Is there still interest to implement it? What do you think of the approach taken here?
Was expecting a __source__ or something. The alert seems to carry its source; if you could write a regex against it, that would allow for a pretty useful silence.
Bump on this very old issue. Does this functionality exist in some other way? Being able to silence based on the source of the alert would be really useful.
> Does this functionality exist in some other way? Being able to silence based on the source of the alert would be really useful.
At least in many cases you should be able to silence based on the external identifying labels (global.external_labels in the Prometheus config file) of the sending Prometheus server. However, any replica labels that distinguish multiple HA replicas of the same type will already have been removed from the alert (via alert relabeling) before it is sent to the Alertmanager, since otherwise the alert can't be deduplicated. So that approach no longer works if only a specific replica of an HA pair is acting up.
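The effect of dropping the replica label can be sketched in a few lines of Go. This is a simplified model under stated assumptions: `dropLabel` stands in for the alert relabeling step, `labelKey` stands in for whatever identity Alertmanager derives from a label set, and the `cluster` and `replica` label names are examples rather than anything mandated by Prometheus.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dropLabel returns a copy of labels without the named label. It is a
// hypothetical stand-in for the alert relabeling step that removes
// replica labels before alerts are sent.
func dropLabel(labels map[string]string, name string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		if k != name {
			out[k] = v
		}
	}
	return out
}

// labelKey builds a deterministic string from a label set. It is a
// simplified stand-in for a label fingerprint, not the real algorithm.
func labelKey(labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for k := range labels {
		names = append(names, k)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, k := range names {
		parts = append(parts, k+"="+labels[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	a := map[string]string{"alertname": "HighErrorRate", "cluster": "eu1", "replica": "a"}
	b := map[string]string{"alertname": "HighErrorRate", "cluster": "eu1", "replica": "b"}

	// With the replica label the two alerts differ, so they would not deduplicate.
	fmt.Println(labelKey(a) == labelKey(b))

	// Without it they are identical: deduplication works, but a label-based
	// silence can no longer target just one replica.
	fmt.Println(labelKey(dropLabel(a, "replica")) == labelKey(dropLabel(b, "replica")))
}
```

The shared `cluster` label remains available for silencing the whole pair, which is the "many cases" situation described above.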
Other than that, I think the initial feature request here is still desired and just needs someone to design & implement it properly. Since the generator URL currently includes not only the sending Prometheus URL itself, but also a path element that leads to a graph page, maybe it would even make sense to break that generator information up semantically a bit more into separate JSON fields (of course keeping the old generatorURL for backwards compatibility).
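As a rough illustration of that split, the Go sketch below separates a generator URL into a sender part and a graph part using the standard `net/url` package. The `GeneratorInfo` type and its `sender` and `path` JSON field names are invented here; no such fields exist in the Alertmanager API, and a real design might divide the information differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// GeneratorInfo is a hypothetical shape for a semantically split generator.
// Only generatorURL is an existing field; sender and path are assumptions.
type GeneratorInfo struct {
	Sender       string `json:"sender"`       // scheme and host of the sending server
	Path         string `json:"path"`         // path and query leading to the graph page
	GeneratorURL string `json:"generatorURL"` // kept unchanged for backwards compatibility
}

// splitGenerator parses raw and fills in the hypothetical split fields.
func splitGenerator(raw string) (GeneratorInfo, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return GeneratorInfo{}, err
	}
	return GeneratorInfo{
		Sender:       u.Scheme + "://" + u.Host,
		Path:         u.RequestURI(),
		GeneratorURL: raw,
	}, nil
}

func main() {
	info, err := splitGenerator("http://prom-b.example.com:9090/graph?g0.expr=up&g0.tab=1")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	out, _ := json.MarshalIndent(info, "", "  ")
	fmt.Println(string(out))
}
```

With the sender isolated like this, a silence could match on it exactly instead of needing a regex over the full URL.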
Disclaimer: I'm not an Alertmanager maintainer or much involved in its development anymore at the moment.
I've thought about having separate alert streams per alertname for applying back pressure to noisy alerts, but this can also be implemented in a generic and configurable way, so that each alert sender gets its own stream and a stream of alerts can be dropped by Alertmanager either automatically or manually.
Alert streams could potentially be implemented in the existing API v2 without Prometheus integration, or in an API v3 with Prometheus integration, so that Prometheus would be aware and back off if its alert stream gets dropped.
Also note that Cloudflare currently has an internal proxy component called Bouncer which can drop alerts based on alertname and/or labels (while maintaining cardinality by tracking fingerprints). This might be open sourced at some point and could be an optional component to handle dropping alert streams without complicating the implementation on Prometheus and Alertmanager.