allow user to request consistent ordering / sorting of grouped alerts

Open tzz opened this issue 8 years ago • 17 comments

Alertmanager currently groups alerts without a consistent ordering, which makes large groups hard to follow.

The user should be able to specify a sort order on some label or annotation, which would give consistent and locally meaningful sorting. This could be done with a Go sort function in the template, or perhaps with an Alertmanager configuration entry.

tzz avatar Jan 02 '18 22:01 tzz

I propose that, at a minimum, the Alerts in the template structure should come with some ordering on them, like we have done elsewhere.

brian-brazil avatar Jan 02 '18 22:01 brian-brazil

Are you referring to the /api/v1/alerts/groups endpoint? We're currently sorting by internal identifier in /api/v1/alerts (https://github.com/prometheus/alertmanager/blob/master/api/api.go#L405-L407).

stuartnelson3 avatar Jan 08 '18 09:01 stuartnelson3

This is about notification templates.

brian-brazil avatar Jan 08 '18 09:01 brian-brazil

I don't mind writing some code if it will help move this forward. For me it would be enough to have a function or Alertmanager option to simply sort alerts lexicographically by label or annotation.

tzz avatar Jan 31 '18 15:01 tzz

I think we should start with a consistent ordering on the Alerts we provide to notification templates. That way users get something okay without having to do extra work.

brian-brazil avatar Jan 31 '18 17:01 brian-brazil

For a default ordering, how about lexicographical by (alert.annotations.summary, alert.creation_time)? That's the simplest thing I can see that could be generally useful. Or maybe even look for a special sort_key annotation that users can define?

tzz avatar Jan 31 '18 18:01 tzz

You can't presume that any particular annotation exists, nor that creation times are stable. I'd suggest working entirely off alert labels. This should not be configurable at this level.

brian-brazil avatar Jan 31 '18 18:01 brian-brazil

OK, what alert labels would make for a sensible default sort key? (job, instance)?

tzz avatar Jan 31 '18 19:01 tzz

You'll need to use all of them, otherwise it won't be consistent. Moving job and instance to the front is probably wise.

brian-brazil avatar Jan 31 '18 19:01 brian-brazil
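
For illustration, the ordering discussed above (compare on all labels, with job and instance moved to the front) could look roughly like the following. This is only a sketch under stated assumptions: `LabelSet`, `sortKey`, `less` and `SortAlerts` are hypothetical names invented here, not Alertmanager's types or the code in any PR, and a missing label is treated as an empty value.

```go
package main

import "sort"

// LabelSet is a simplified stand-in for an alert's labels (assumption for this sketch).
type LabelSet map[string]string

// sortKey flattens a label set into a comparable sequence: the values of
// "job" and "instance" first, then every other label as name, value pairs
// in lexicographic order of the label name.
func sortKey(ls LabelSet) []string {
	names := make([]string, 0, len(ls))
	for n := range ls {
		if n != "job" && n != "instance" {
			names = append(names, n)
		}
	}
	sort.Strings(names)

	key := []string{ls["job"], ls["instance"]}
	for _, n := range names {
		key = append(key, n, ls[n])
	}
	return key
}

// less compares two label sets element by element on their sort keys.
// Because every label takes part, the result does not depend on map order.
func less(a, b LabelSet) bool {
	ka, kb := sortKey(a), sortKey(b)
	for i := 0; i < len(ka) && i < len(kb); i++ {
		if ka[i] != kb[i] {
			return ka[i] < kb[i]
		}
	}
	return len(ka) < len(kb)
}

// SortAlerts sorts the slice in place using the ordering above.
func SortAlerts(alerts []LabelSet) {
	sort.Slice(alerts, func(i, j int) bool { return less(alerts[i], alerts[j]) })
}
```

Using every label is what makes the result repeatable from one notification to the next; putting job and instance first only affects which labels dominate the order.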

Looking at dispatch/dispatch.go, there seem to be two ways to go: either modify aggrGroup to always have a sorted list of Alert structs instead of a map, or sort the alertsSlice when alerts are flushed. The alertsSlice sort seems easier to implement.

Does that make sense or am I misunderstanding your intention or the code?

tzz avatar Jan 31 '18 20:01 tzz

Yes, the slice would be the one to sort somewhere along that codepath.

brian-brazil avatar Jan 31 '18 21:01 brian-brazil

The proposed change is in https://github.com/prometheus/alertmanager/pull/1234 but if Alertmanager will allow arbitrary sorting from the user in the future, then the LabelSet.Before() method should probably be extended to take a list of label names, in which case the change becomes trivial. I didn't propose that API extension because I don't know if it's right, and wanted to keep the scope as small as possible.

tzz avatar Feb 07 '18 17:02 tzz
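
To make the idea of a comparison that takes a list of label names concrete, here is a rough sketch. It does not reproduce the real `LabelSet.Before()` method or its semantics; `LabelSet` and `BeforeBy` below are hypothetical stand-ins, and the fallback to "all label names in sorted order" when no names are given is an assumption made for the example.

```go
package main

import "sort"

// LabelSet is a simplified stand-in for a label set type (assumption).
type LabelSet map[string]string

// BeforeBy reports whether a sorts before b when only the labels in names
// are compared, in the given order. A missing label compares as "".
// With no names given, every label name found in either set is compared,
// in lexicographic order of the name.
func BeforeBy(a, b LabelSet, names ...string) bool {
	if len(names) == 0 {
		seen := map[string]bool{}
		for n := range a {
			seen[n] = true
		}
		for n := range b {
			seen[n] = true
		}
		for n := range seen {
			names = append(names, n)
		}
		sort.Strings(names)
	}
	for _, n := range names {
		if a[n] != b[n] {
			return a[n] < b[n]
		}
	}
	// Equal on every compared label: neither sorts before the other.
	return false
}
```

With a function of this shape, a user-requested order would amount to passing the configured label names through to the comparison.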

@brian-brazil the request was for the user to be able to request a specific sort order for alerts and I thought #1234 was just the first step. Does closing this request mean it won't be done or that it will be implemented elsewhere?

tzz avatar Mar 28 '18 20:03 tzz

Ah, I'd missed that. I'm personally hoping we can avoid having to implement that.

brian-brazil avatar Mar 28 '18 20:03 brian-brazil

I'm curious about the way the API returns the result. Is there any sorting logic?

aclowkey avatar Dec 20 '20 11:12 aclowkey

I have a suggestion: couldn't you just order in the sequence of the group_by list?

e.g. group_by: ['severity', 'alertname'] leads to first grouping by severity and then by alertname

group_by: ['alertname', 'severity'] first groups by alertname, then by severity

OlafKocanda avatar Mar 04 '21 16:03 OlafKocanda

We have a use case for this where we're generating metrics per domain, writing alerts to detect various kinds of traffic anomalies, then grouping notifications "per-anomaly" to produce a list of domains with each type of anomaly. This can be dozens of items and we would prefer to get this list of domains sorted when formatting it.

My proposal would be e.g. for our case,

    group_sort_key_labels: [domain]

and if none is set, skipping the sort to keep whatever the current order would be. Notably, we don't need dynamic sorting, multiple sorts, or to do any kind of sorting in the template, we just want the alerts slice sorted in a particular order for the template.

If there's no resistance to this approach I would like to prepare a PR for it.

jwreschnig-utiq avatar Sep 22 '25 11:09 jwreschnig-utiq
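
As a sketch of the behaviour proposed above (sort by the configured labels, and skip the sort entirely when none are set), something like the following might serve. `group_sort_key_labels` is only a proposed option, and `Alert` and `SortBySortKeyLabels` are hypothetical names for this example rather than existing Alertmanager code.

```go
package main

import "sort"

// Alert is a minimal stand-in for the alert data handed to a template (assumption).
type Alert struct {
	Labels map[string]string
}

// SortBySortKeyLabels orders alerts by the values of keyLabels, compared in
// the order given. If keyLabels is empty the slice is left exactly as it was,
// so existing behaviour is unchanged when the option is not set.
func SortBySortKeyLabels(alerts []Alert, keyLabels []string) {
	if len(keyLabels) == 0 {
		return
	}
	// A stable sort keeps the incoming relative order of alerts that are
	// equal on all key labels.
	sort.SliceStable(alerts, func(i, j int) bool {
		for _, name := range keyLabels {
			vi, vj := alerts[i].Labels[name], alerts[j].Labels[name]
			if vi != vj {
				return vi < vj
			}
		}
		return false
	})
}
```

For the per-domain case, calling `SortBySortKeyLabels(alerts, []string{"domain"})` before rendering would give the template a list of domains in lexicographic order.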