alertmanager

Feature request: Adding metadata to incoming alerts

Open FUSAKLA opened this issue 3 years ago • 3 comments

Motivation

Hi, we use AM heavily and have many teams, each managing multiple services of their own running in Kubernetes.

To make things easier, we often have some generic alerts across all the services (not up, pod is pending, image pull back off, high error rate). We tend to add a playbook link to the alerts, and it's also handy to have a link to the documentation of the affected service, its repository etc. Particularly for such generic alerts, the default playbook has no service-specific information, such as what the impact is.

Problem

Currently, we would have to generate the Go template somehow, with a handful of ifs, or define the alert separately for each service.

Suggestion

Allow AM to add metadata to incoming alerts based on static (or even dynamic) configuration. It would simply check whether the alert matches the selector and, if so, add the configured metadata to it. This metadata would mainly be annotations, but possibly also labels? If the evaluation happened before the alert enters the routing tree, the additional labels could also be used for routing.

Static

The first thing that comes to mind is a new type of "rules", e.g. metadata rules, similar to inhibition rules.

additional_metadata_rule:
  - matches:
       - foo =~ bar
    overwrite_existing: false  # How to handle collisions
    annotations:
       docs: http://foo.bar/docs
    labels: # Allow even adding labels too?
       team: bar
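
To make the intended semantics concrete, here is a minimal Python sketch of how such a rule could be evaluated. It is only an illustration of the proposal, not anything Alertmanager implements: `MetadataRule`, `rule_matches` and `enrich` are hypothetical names, matchers are modelled as fully anchored regexes on label values, and rules are matched against the alert's original labels only.

```python
import re
from dataclasses import dataclass, field


@dataclass
class MetadataRule:
    # Hypothetical mirror of one `additional_metadata_rule` entry from the proposal.
    matchers: dict  # label name -> regex, i.e. the `foo =~ bar` form
    annotations: dict = field(default_factory=dict)
    labels: dict = field(default_factory=dict)
    overwrite_existing: bool = False


def rule_matches(rule, alert_labels):
    # Every matcher must match the whole label value; a missing label
    # is treated as an empty string (an assumption of this sketch).
    return all(
        re.fullmatch(pattern, alert_labels.get(name, "")) is not None
        for name, pattern in rule.matchers.items()
    )


def enrich(alert, rules):
    """Return a copy of `alert` with metadata from all matching rules added."""
    original_labels = alert.get("labels", {})
    enriched = {
        "labels": dict(original_labels),
        "annotations": dict(alert.get("annotations", {})),
    }
    for rule in rules:
        # Match on the original labels so rules do not influence each other.
        if not rule_matches(rule, original_labels):
            continue
        for kind, extra in (("annotations", rule.annotations), ("labels", rule.labels)):
            for key, value in extra.items():
                # Collision handling: keep the existing value unless told otherwise.
                if rule.overwrite_existing or key not in enriched[kind]:
                    enriched[kind][key] = value
    return enriched


rules = [
    MetadataRule(
        matchers={"foo": "bar"},
        annotations={"docs": "http://foo.bar/docs"},
        labels={"team": "bar"},
    )
]
alert = {"labels": {"alertname": "PodPending", "foo": "bar", "team": "core"}, "annotations": {}}
result = enrich(alert, rules)
```

With `overwrite_existing: false`, the `docs` annotation is added while the existing `team` label is left alone. Setting it to `true` would replace `team` with `bar`.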

Dynamic

Even more interesting would be the possibility to load such metadata rules from a remote catalog of applications/services. Unfortunately, I'm not aware of any standardized form of such a thing (possibly the Backstage Service Catalog), but it could be interesting for future development.

Concerns

  • How to deal with collisions (as suggested, whether to overwrite could be configurable)
  • If labels are added to the alert, a user could try to search for the alert in the ALERTS metric using those labels and fail to find it; it could be unclear where the labels came from.
  • Changing an added label while the alert is still active could lead to inconsistencies (I'm not sure how Alertmanager handles "grouping" of incoming alerts when a label is added or changed).
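
The last concern can be illustrated with a deliberately simplified model of grouping. This sketch assumes only that alerts are grouped by the values of the route's `group_by` labels; `group_key` is a hypothetical helper, not Alertmanager's actual implementation, and the label values are made up.

```python
def group_key(labels, group_by):
    # Simplified model: alerts with the same values for the group_by
    # labels end up in the same group. Missing labels count as "".
    return tuple((name, labels.get(name, "")) for name in sorted(group_by))


group_by = ["alertname", "team"]

# The same alert before and after a metadata rule adds a `team` label.
before = {"alertname": "HighErrorRate", "service": "checkout"}
after = {**before, "team": "payments"}

key_before = group_key(before, group_by)
key_after = group_key(after, group_by)

# If the added label is part of group_by, the alert changes group.
moved = key_before != key_after
```

In this model the alert only changes group when the added label is one of the `group_by` labels. Grouping by `alertname` alone would leave the key unchanged.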

Alternatives

  • Doing this in the Go template of the notification text
    • Generating the Go template would be really hard to maintain and read
  • Adding the metadata in the alert receiver (PagerDuty, OpsGenie, ...)
    • Would need to add this functionality to each one of the target integrations
    • In some cases the notification does not go through any other system (Email, Slack, API ...)
  • Doing this even earlier, in Prometheus, before the alert is sent out, the same way alert relabeling does.

FUSAKLA avatar Jun 05 '21 06:06 FUSAKLA

The canonical way to solve this is to use the external labels in Prometheus.

roidelapluie avatar Jun 05 '21 07:06 roidelapluie

To use external labels for adding metadata such as a link to the application documentation, I'd need a separate instance of Prometheus for each app. That does not sound viable since we have over a hundred microservices :/

FUSAKLA avatar Jun 05 '21 11:06 FUSAKLA

I have the same request; it is essentially a relabel feature at the Alertmanager level.

hanjm avatar May 22 '22 09:05 hanjm

I have the same request; it is essentially a relabel feature at the Alertmanager level.

zgfh avatar Nov 02 '22 11:11 zgfh