
Use the new kind AlertmanagerConfig to configure alertmanager

Open sereinity opened this issue 4 years ago • 11 comments

A lot of users ask how to configure the alertmanager; I think it would help if kube-prometheus generated this object, with some room on the user side to fill it in.

An advantage of using it is that it can automatically load secrets to integrate with alert routing providers.

  • kube-prometheus version: at least 0.7 as it requires prometheus-operator ≥ 0.43.0

Using AlertmanagerConfig requires adding some information to the Alertmanager resource; therefore, users can't use it without some adaptation in kube-prometheus.

sereinity avatar Dec 28 '20 16:12 sereinity

I was wondering about this too.

shinebayar-g avatar Jan 02 '21 04:01 shinebayar-g

Could you elaborate on what you would like to see? It's not entirely clear to me what someone who wanted to work on this would try to implement.

brancz avatar Feb 18 '21 13:02 brancz

After some testing and local POCing, I finally have an answer.

I suggest three changes (mostly the first two):

  1. We define an alertmanagerConfigSelector in the alertmanager and allow users to specify it (for example in _config+:.alertmanager+:.configSelector)
  2. We define a default value for the selector (example below)
alertmanagerConfigSelector:
  matchLabels:
    alertmanagerConfig: $alertmanager.name # or "main" or whatever
  3. When the operator allows a global AlertmanagerConfig, we should then be able to migrate the current route, receivers and inhibit_rules into an AlertmanagerConfig resource (mainly useful for users looking for how to use this resource kind).

sereinity avatar Feb 19 '21 13:02 sereinity

AlertmanagerConfig selector can be already included as:

alertmanager+: {
  alertmanager+: {
    spec+: {
      alertmanagerConfigSelector: {
        matchLabels: {
          alertmanagerConfig: $alertmanager.alertmanager.metadata.name
        },
      },
    },
  },
}
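
With that selector in place, the operator should only pick up AlertmanagerConfig objects that carry the matching label. A minimal sketch (assuming the Alertmanager object is named main; the receiver name here is just a placeholder):

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: example
  labels:
    alertmanagerConfig: main  # must match the selector's matchLabels above
spec:
  route:
    receiver: 'default'
  receivers:
  - name: 'default'
```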

paulfantom avatar Feb 22 '21 12:02 paulfantom

In my cluster, I have leveraged the AlertmanagerConfig custom resource (https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#alertmanagerconfig), which, as I understand it, lets us change the currently deployed Alertmanager config at will.

So, to configure my Sendgrid account for the desired email integration (i.e. Alertmanager sending e-mail alert notifications), I created a file named AlertManagerConfigmap.yaml with the following:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: example
spec:
  receivers:
  - name: 'email'
    email_configs:
    - to: '[email protected]'
      from: '[email protected]'
      smarthost: smtp.sendgrid.net:587
      auth_username: 'apikey'
      auth_password: 'XXXXXXXXX'
  route:
    group_by: ['alertname']
    group_wait: 10s
    group_interval: 10s
    repeat_interval: 10s
    receiver: 'email'

I was hoping this configuration would be merged into the existing one after applying it to the cluster.

Now, when I try to apply it, I get this error:

$ kubectl apply -f monitoring-alertmanager-configmap.yaml

error: error validating "monitoring-alertmanager-configmap.yaml": error validating data: [ValidationError(AlertmanagerConfig.metadata): unknown field "route" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta, ValidationError(AlertmanagerConfig.metadata): unknown field "spec" in io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta, ValidationError(AlertmanagerConfig): missing required field "spec" in com.coreos.monitoring.v1alpha1.AlertmanagerConfig]; if you choose to ignore these errors, turn validation off with --validate=false

Could someone guide me on this?

dnaranjor avatar Jun 24 '21 23:06 dnaranjor

@dnaranjor I've successfully created the AlertmanagerConfig CRD following the doc (https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/user-guides/alerting.md#alertmanagerconfig-resource).

This is my yaml file, FYI.

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  labels:
    release: kube-prometheus-stack
  name: mail-config
spec:
  route:
    groupBy: ['job']
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
    receiver: 'mail'
    routes:
    - match:
        alertname: Watchdog
      receiver: mail
  receivers:
  - name: mail
    emailConfigs:
    - to: <mail addr>
      from: <mail addr>
      smarthost: smtp.gmail.com:465
      authUsername: ntphrf
      authPassword:
        name: mail-password
        key: password
      requireTLS: false

---

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: mail-password
data:
  password: xxxjixjijxijixxxxxxxx
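
One detail worth noting about the Secret above: the data field must hold base64-encoded values, not the plain password (alternatively, Kubernetes accepts plain text under stringData). A sketch with a hypothetical password, equivalent to `echo -n 'my-smtp-password' | base64`:

```python
import base64

# Secret .data values must be base64-encoded; the password here is a
# placeholder. Encoding the raw bytes (no trailing newline, which would
# otherwise end up in the decoded password and break SMTP auth):
password = b"my-smtp-password"
encoded = base64.b64encode(password).decode()
print(encoded)  # → bXktc210cC1wYXNzd29yZA==
```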

orcahmlee avatar Jul 05 '21 10:07 orcahmlee

@orcahmlee thanks for sharing... My problem currently is that, even though I have configured the receivers section with my email server settings, those are not showing up in my Alertmanager config file, so my guess is they are not being enforced at all.

In your scenario, after applying this AlertmanagerConfig CRD you're sharing, is the new emailConfigs information shown in the Alertmanager GUI > Status > Config section?

dnaranjor avatar Jul 05 '21 14:07 dnaranjor

In your scenario, after applying this AlertmanagerConfig CRD you're sharing, is the new emailConfigs information shown in the Alertmanager GUI > Status > Config section?

Yes, I can see the change after applying.

@orcahmlee thanks for sharing... My problem currently is that, even though I have configured the receivers section with my email server settings, those are not showing up in my Alertmanager config file, so my guess is they are not being enforced at all.

@dnaranjor Have you tried checking the directives in the yaml file? I realized the directives are different between alertmanager.yml and the AlertmanagerConfig CRD, e.g.:

  • email_configs -> emailConfigs
  • auth_username -> authUsername

This was also my mistake when I applied the AlertmanagerConfig CRD for the first time. Hope this info is useful.
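
The renaming is mostly mechanical (snake_case to camelCase), though not universally: requireTLS in the example above capitalizes the whole acronym, so the CRD schema (e.g. via `kubectl explain alertmanagerconfig.spec`) remains the authoritative reference. As an illustration only, the mechanical part of the mapping:

```python
def snake_to_camel(key: str) -> str:
    """Convert an alertmanager.yml snake_case directive to camelCase as used
    by the AlertmanagerConfig CRD. Illustration only: fields such as
    require_tls -> requireTLS do not follow this simple rule."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)

def convert_keys(node):
    """Recursively rename keys in a parsed YAML structure (dicts and lists)."""
    if isinstance(node, dict):
        return {snake_to_camel(k): convert_keys(v) for k, v in node.items()}
    if isinstance(node, list):
        return [convert_keys(v) for v in node]
    return node
```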

orcahmlee avatar Jul 06 '21 02:07 orcahmlee

I found another very simple workaround that works almost perfectly - set the following in values.yaml:

prometheus:
    additionalAlertRelabelConfigs:
      - source_labels: [namespace] # adds missing namespace label
        target_label: namespace
        regex: (^$)
        replacement: monitoring # should match namespace where alertmanager deployed

which will be rendered into the following:

alerting:
  alert_relabel_configs:
  - source_labels: [namespace]
    separator: ;
    regex: (^$)
    target_label: namespace
    replacement: monitoring
    action: replace
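
To sketch what this rule does: before alerts are pushed to Alertmanager, any alert whose namespace label is empty or missing gets it rewritten to monitoring, so it can match the namespace matcher that the operator injects into AlertmanagerConfig routes. Roughly:

```python
import re

def apply_relabel(labels: dict) -> dict:
    """Emulate the alert_relabel_config above: the regex (^$) matches only
    an empty namespace label, in which case the label is replaced with the
    namespace where Alertmanager is deployed ("monitoring" here)."""
    if re.fullmatch(r"(^$)", labels.get("namespace", "")):
        return {**labels, "namespace": "monitoring"}
    return labels
```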

Vlad1mir-D avatar Dec 02 '21 02:12 Vlad1mir-D

Can we use multiple receivers in an AlertmanagerConfig?

sureshkachwa avatar Sep 14 '22 16:09 sureshkachwa

So I've been trying to use this, but it looks like it adds a namespace matcher by default, which I think makes it impossible to catch all alerts. For example, trying to catch the Watchdog:

---
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: am-config-watchdog
  labels:
    alertmanagerConfig: am-config
spec:
  route:
    matchers:
    - name: alertname
      value: Watchdog
      matchType: '='
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 1h
    receiver: 'dms'
    continue: false
  receivers:
  - name: 'dms'
    webhookConfigs:
    - url: https://...

This seems to resolve in the Alertmanager config as:

  - receiver: monitoring/am-config-watchdog/dms
    matchers:
    - alertname="Watchdog"
    - namespace="monitoring"
    continue: true
    group_wait: 30s
    group_interval: 5m
    repeat_interval: 1h

This namespace matcher is automatically added, and since the Watchdog alert does not have a namespace label set, it doesn't match and just continues on to the default null receiver. It's also overriding continue to force it to true, which is a bit frustrating. I've never used AlertmanagerConfig before, so I may be missing something.

This is all applied using argocd + kustomize. I'm not sure if there's a way to change the default AM config and change the default Watchdog route instead; that could be another way to go, but it looks like it's stored in a secret, so that doesn't look trivial.
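
If I remember correctly, more recent prometheus-operator releases expose a knob for exactly this: setting alertmanagerConfigMatcherStrategy on the Alertmanager resource to None stops the operator from injecting the namespace matcher into AlertmanagerConfig routes (check your operator version's API docs before relying on it). A sketch:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main
spec:
  # Don't enforce a namespace matcher on AlertmanagerConfig routes;
  # only available in newer prometheus-operator releases
  alertmanagerConfigMatcherStrategy:
    type: None
```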

Ulrar avatar Dec 06 '23 12:12 Ulrar