
feat(alertmanager): support loki alerts GeneratorURL in template functions

Open fgouteroux opened this issue 2 years ago • 8 comments

Hello,

We use Mimir (v2.10.0) and Loki (v2.9.1), and the Loki ruler is configured to send alerts to the Mimir Alertmanager.

Since upgrading Loki to 2.9.1, the Mimir Alertmanager logs show this error:

Oct 05 09:28:39 mimiralert01.example.com mimir[2867549]: {"caller":"dispatch.go:352","component":"dispatcher","err":"test_preprod_receiver/pagerduty[0]: notify retry canceled due to unrecoverable error after 1 attempts: \"grafana_link\": failed to template \"{{ grafanaExploreURL \\\"https://my-grafana.example.com\\\" \\\"prometheus\\\" \\\"now-1h\\\" \\\"now\\\" (queryFromGeneratorURL (index .Alerts 0).GeneratorURL) }}\": template: :1:105: executing \"\" at <queryFromGeneratorURL (index .Alerts 0).GeneratorURL>: error calling queryFromGeneratorURL: query not found in the generator URL","insight":"true","level":"error","msg":"Notify for alerts failed","num_alerts":1,"ts":"2023-10-05T09:28:39.164215296Z","user":"test"}

Before the PR https://github.com/grafana/loki/pull/8500, the Loki GeneratorURL started with /graph?g0.expr; it now starts with /explore?left=. As a result, templating fails for Loki alerts with the error above, which is raised at https://github.com/grafana/mimir/blob/main/pkg/alertmanager/alertmanager_template.go#L68
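
To make the format change concrete, here is a minimal Go sketch of a lookup that accepts both URL styles. It is not the Mimir implementation: the function name queryFromAnyGeneratorURL is hypothetical, and the JSON shape of the left parameter (a queries list whose entries carry an expr field) is an assumption that should be checked against what Loki actually emits.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
)

// exploreLeft models only the part of the "left" parameter this sketch needs.
// ASSUMPTION: the payload is JSON with a "queries" list whose items have "expr".
type exploreLeft struct {
	Queries []struct {
		Expr string `json:"expr"`
	} `json:"queries"`
}

// queryFromAnyGeneratorURL (hypothetical name) returns the query expression
// from either a Prometheus-style or an Explore-style generator URL.
func queryFromAnyGeneratorURL(generatorURL string) (string, error) {
	u, err := url.Parse(generatorURL)
	if err != nil {
		return "", fmt.Errorf("parse generator URL: %w", err)
	}
	params := u.Query()

	// Prometheus-style URL: /graph?g0.expr=<query>
	if q := params.Get("g0.expr"); q != "" {
		return q, nil
	}

	// Explore-style URL: /explore?left=<JSON> (shape assumed, see above).
	if left := params.Get("left"); left != "" {
		var l exploreLeft
		if err := json.Unmarshal([]byte(left), &l); err != nil {
			return "", fmt.Errorf("decode left parameter: %w", err)
		}
		if len(l.Queries) > 0 && l.Queries[0].Expr != "" {
			return l.Queries[0].Expr, nil
		}
	}

	return "", errors.New("query not found in the generator URL")
}
```

Note that returning only the expression drops everything else carried by the Explore URL, such as the datasource.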

I would like to extend the Mimir Alertmanager template functions grafanaExploreURL and queryFromGeneratorURL to support the generator URL from Loki. However, I'm not sure my implementation is the right approach, because it can make template usage like the following confusing:

grafana_link = "{{ grafanaExploreURL \"https://my-grafana.example.com\" \"prometheus\" \"now-1h\" \"now\" (queryFromGeneratorURL (index .Alerts 0).GeneratorURL) }}"

With this template, every Grafana link would use the prometheus datasource, except for alerts coming from Loki.

Any help is appreciated.

@FUSAKLA, as you did good work on these functions, maybe you could have a look?

Checklist

  • [x] Tests updated
  • [ ] Documentation added
  • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

fgouteroux avatar Oct 06 '23 12:10 fgouteroux

hello, any thoughts about this ?

fgouteroux avatar Oct 19 '23 06:10 fgouteroux

hello, no feedback on this ?

fgouteroux avatar Nov 06 '23 16:11 fgouteroux

Hi @fgouteroux - thanks for the PR. I'm taking a look now.

stevesg avatar Nov 09 '23 12:11 stevesg

If I understand correctly, the goal is this: if the system sending alerts (e.g. Loki) already provides a Grafana Explore URL, we want to pass it through unchanged.

I see that the existing functions aren't really suitable for this purpose, though. If we change queryFromGeneratorURL to pass through the URL, that's confusing given the name of the function. The expectation is that a query string is always returned, not a URL.

I'm not sure the existing functions can satisfy the need for passing through URLs, without breaking someone. So the options might well be:

  1. Extend the existing functions following the existing pattern, but accept that we lose any "richness" from the URL (the big issue here I see being the datasource UID)
  2. Add a new function that takes an arbitrary "GeneratorURL" and emits the "Explore URL". A descriptive but overly verbose name might be grafanaExploreURLFromGeneratorURL.
  3. Fix it in the template itself (e.g. do the if prefixed ...)
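
As a rough illustration of option 2, the following Go sketch rebases an Explore-style generator URL onto a given Grafana base URL and keeps its query string untouched. It is only a sketch under assumptions: the two-argument signature, the error on non-Explore URLs, and the check on the left parameter are all hypothetical choices, not an agreed design.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// grafanaExploreURLFromGeneratorURL is a hypothetical helper for option 2.
// It accepts only generator URLs that already point at Explore and returns
// the same Explore query rebased onto grafanaBaseURL.
func grafanaExploreURLFromGeneratorURL(grafanaBaseURL, generatorURL string) (string, error) {
	gen, err := url.Parse(generatorURL)
	if err != nil {
		return "", fmt.Errorf("parse generator URL: %w", err)
	}
	// ASSUMPTION: an Explore URL has a path ending in /explore and a "left" parameter.
	if !strings.HasSuffix(gen.Path, "/explore") || gen.Query().Get("left") == "" {
		return "", fmt.Errorf("generator URL %q is not a Grafana Explore URL", generatorURL)
	}

	base, err := url.Parse(grafanaBaseURL)
	if err != nil {
		return "", fmt.Errorf("parse Grafana base URL: %w", err)
	}
	// Keep the raw query as-is so nothing in it (e.g. the datasource) is lost.
	base.Path = strings.TrimSuffix(base.Path, "/") + "/explore"
	base.RawQuery = gen.RawQuery
	return base.String(), nil
}
```

A template could then call this function for Explore-style URLs and keep the existing grafanaExploreURL and queryFromGeneratorURL pair for Prometheus-style ones, which avoids changing what the existing functions return.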

I'll think on this a little more.

stevesg avatar Nov 09 '23 12:11 stevesg

hi, anything I can do to move forward on this PR ?

fgouteroux avatar Mar 04 '24 09:03 fgouteroux

Hi, is there any progress on deciding the right direction for this?

fgouteroux avatar Apr 09 '24 17:04 fgouteroux