alertmanager-config.yaml file for pagerduty_configs.details does not respect EventApi v2 spec

Open NargiT opened this issue 4 years ago • 11 comments

What did you do? Generated a nested object for the pagerduty_configs.details field in alertmanager-config.yaml.

What did you expect to see? amtool should not fail, and Alertmanager should allow nested objects, since they are compatible with the PagerDuty Events API v2.

What did you see instead? Under which circumstances?

Error when running amtool and alertmanager

Environment

  • System information:

    Linux 5.3.0-46-generic x86_64

  • Alertmanager version:

    alertmanager, version 0.21.0 (branch: HEAD, revision: 4c6c03ebfe21009c546e4d1e9b92c371d67c021d) build user: root@dee35927357f build date: 20200617-08:54:02 go version: go1.14.4

  • Prometheus version:

    insert output of prometheus --version here (repeat for each prometheus version in your cluster, if relevant to the issue)

  • Alertmanager configuration file:

global:
  resolve_timeout: "10m"

templates:
  - '/etc/alertmanager/secrets/pagerduty.integration.key.tmpl'

route:
  group_by: [ '...' ]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'black-hole'

  routes:
    - match_re:
        severity: "warning|critical"
      receiver: "all"

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: [ 'alertname' ]

receivers:
  - name: "all"
    pagerduty_configs:
      - routing_key: '{{ template "pagerduty.integration.key" . }}'
        details:
          firing:       '{{ template "pagerduty.default.instances" .Alerts.Firing }}'
          resolved:     '{{ template "pagerduty.default.instances" .Alerts.Resolved }}'
          num_firing:   '{{ .Alerts.Firing | len }}'
          num_resolved: '{{ .Alerts.Resolved | len }}'
          toto: [ "hello", "hoho" ] #this line will fail
  • Logs:

level=error ts=2021-02-05T10:57:07.101Z caller=coordinator.go:124 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="yaml: unmarshal errors:\n  line 91: cannot unmarshal !!seq into string"

This is of course expected, as Alertmanager is not able to handle the toto list we added. But if I craft the same event with curl and send it directly to events.pagerduty.com/v2/enqueue, I get what I wanted.

curl -X POST -d '{"routing_key": "XXXX", "event_action": "trigger", "payload": {"summary": "one alert", "source": "curl", "severity": "critical", "custom_details": { "toto": ["hello", "hoho"]}}}' https://events.pagerduty.com/v2/enqueue
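For readers who prefer to inspect the request body before sending it, here is a minimal Python sketch that builds the same JSON as the curl command above. The function name `build_trigger_event` is hypothetical, and nothing is sent over the network; the sketch only shows that a list inside custom_details serializes as ordinary JSON.

```python
import json


def build_trigger_event(routing_key, summary, source, severity, custom_details):
    """Build an Events API v2 trigger event body as a JSON string.

    Hypothetical helper: it mirrors the fields used in the curl command
    and performs no validation and no HTTP request.
    """
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
            # Nested values (lists, objects) are plain JSON here.
            "custom_details": custom_details,
        },
    }
    return json.dumps(event)


body = build_trigger_event(
    "XXXX", "one alert", "curl", "critical", {"toto": ["hello", "hoho"]}
)
print(body)
```

The resulting string could be posted to https://events.pagerduty.com/v2/enqueue with any HTTP client, exactly as the curl command does.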

NargiT avatar Feb 05 '21 11:02 NargiT

Hmm, interesting. The official docs don't go into this level of detail... One thing to resolve for Alertmanager is how templating would work if we used a map[string]interface{} instead of a map[string]string for the details field.
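One possible answer to the templating question, sketched in Python rather than Go: walk the nested structure and render only the string leaves, leaving lists and objects in place. This is an illustration under assumptions, not Alertmanager's implementation. `render_details` is a hypothetical name, and Python's `string.Template` (with `$name` placeholders) stands in for Go's template engine.

```python
from string import Template


def render_details(value, context):
    """Recursively render string leaves of a nested details structure.

    Hypothetical sketch: string.Template is a stand-in for the real
    template engine, and `context` plays the role of the template data.
    """
    if isinstance(value, str):
        # Only strings are treated as templates.
        return Template(value).safe_substitute(context)
    if isinstance(value, dict):
        return {key: render_details(item, context) for key, item in value.items()}
    if isinstance(value, list):
        return [render_details(item, context) for item in value]
    # Numbers, booleans and None pass through unchanged.
    return value


details = {
    "num_firing": "$num_firing",
    "toto": ["hello", "hoho"],
    "nested": {"summary": "$num_firing firing"},
}
rendered = render_details(details, {"num_firing": 2})
print(rendered)
```

Note that a rendered leaf is still a string ("2", not 2), so this approach alone would not let a template produce a nested object. That would need something extra, such as parsing the rendered text as JSON.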

simonpasquier avatar Feb 05 '21 17:02 simonpasquier

Sorry, I don't know how I can help with this.

NargiT avatar Feb 08 '21 06:02 NargiT

This would really help

YGwen avatar Feb 09 '21 15:02 YGwen

This is a recurring nightmare for us too

drewboswell avatar Feb 09 '21 16:02 drewboswell

For folks who are affected by this issue, could you please describe your requirements? If the details field supported nested objects today, what would your config look like?

simonpasquier avatar Feb 22 '21 13:02 simonpasquier

Hi @simonpasquier, sorry for the delay.

PD-CEF states that the details field holds "free-form details from the event" (source: https://support.pagerduty.com/docs/pd-cef).

image

And if we look at the JSON spec for Object, a value can be more than a simple string (source: https://www.json.org/json-en.html).

image

My problem is that I cannot use nested objects in details, and firing is a good example that hits this limitation. Here is an example of the payload received by PagerDuty from Alertmanager using the Events API v2: image

Once the event reaches PagerDuty, extracting fields becomes a nightmare in the PagerDuty Customize Event Fields panel, and so does specific routing with global event rules.

I would like to be able to get a full JSON payload from labels and annotations, instead of the raw string with bullet points that the template produces because nested objects are not supported.

Sorry, my knowledge of Go is very poor, but maybe a new field firing-json or a built-in function could transform .Alerts.Firing into a JSON payload?

I added an example with a toto field for simple testing.

receivers:
  - name: "all"
    pagerduty_configs:
      - details:
          firing-json: '{{ template "pagerduty.default.instances" .Alerts.Firing }}'
          # or
          firing: '{{ template "pagerduty.default.instances" .Alerts.Firing | toJson }}'
          # moreover, the line below will fail
          toto: [ "hello", "hoho" ]

This could generate the following payload (for simplicity, only the relevant part is shown):

{
  "details": {
    "firing": {
      "labels": {
        "alertname": "KubeContainerWaiting",
        "container": "logs",
        "namespace": "test",
        "pod": "hello-world-5946fd6b75-4lhv8",
        "prometheus": "monitoring/k8s",
        "severity": "warning"
      },
      "annotations": {
        "message": "Pod test/hello-world-5946fd6b75-4lhv8 container logs has been in waiting state for longer than 1 hour.",
        "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting"
      },
      "source": "https://test.localhost.ch/prometheus/graph?g0.expr=sum+by%28namespace%2C+pod%2C+container%29+%28kube_pod_container_status_waiting_reason%7Bjob%3D%22kube-state-metrics%22%7D%29+%3E+0&g0.tab=1"
    }
  }
}
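To make the request concrete, here is a rough Python sketch of what such a conversion might do: turn each firing alert into a nested object and serialize the result. It is an illustration only. The function name `alerts_to_details` and the dict keys `labels`, `annotations` and `generator_url` are assumptions, not Alertmanager's API, and the source URL is abbreviated. Because several alerts can fire at once, the sketch emits firing as a list of objects rather than a single object.

```python
import json


def alerts_to_details(alerts):
    """Convert alert dicts into JSON-serializable nested objects.

    Hypothetical helper: the input keys are assumed for this sketch.
    """
    return [
        {
            "labels": dict(alert["labels"]),
            "annotations": dict(alert["annotations"]),
            "source": alert["generator_url"],
        }
        for alert in alerts
    ]


# Sample data modelled on the payload in this comment (URL abbreviated).
firing_alerts = [
    {
        "labels": {
            "alertname": "KubeContainerWaiting",
            "namespace": "test",
            "severity": "warning",
        },
        "annotations": {
            "message": "Pod test/hello-world-5946fd6b75-4lhv8 container logs "
                       "has been in waiting state for longer than 1 hour.",
        },
        "generator_url": "https://test.localhost.ch/prometheus/graph",
    }
]

details = {
    "firing": alerts_to_details(firing_alerts),
    "num_firing": len(firing_alerts),
}
custom_details_json = json.dumps(details, indent=2)
print(custom_details_json)
```

With a structure like this, PagerDuty event rules could address individual fields such as the alert name or severity directly instead of parsing a bullet-point string.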

NargiT avatar May 20 '21 10:05 NargiT

@simonpasquier do you need more information?

NargiT avatar Jun 09 '21 05:06 NargiT

This is also a very big headache for us. Recently we have been trying to parse the details string, and it's a nightmare.

christopherbox avatar Oct 08 '21 19:10 christopherbox

It feels so odd to use a raw string when you know that everything is JSON on the Alertmanager side.

NargiT avatar Oct 13 '21 08:10 NargiT

On the topic of PD-CEF and nested objects in custom_details, the specification is frankly unclear, but there's ample evidence in the PagerDuty documentation that nested objects are supported and useful. Here's an example.

EronWright avatar Sep 16 '22 18:09 EronWright

Any news regarding this?

NargiT avatar Nov 30 '23 07:11 NargiT