Alert limitation discussion: Maximum number of alerts to display in UI
We currently show every firing alert. There is a limit beyond which we can't reasonably load and display them all, but we're not sure what that limit is.
From a bandwidth perspective, quoting @beorn7:
And one other aspect: If I'm on-call on the road, I sometimes need to work with low-bandwidth (e.g. shared crappy WiFi) and/or very expensive connection to the internet (e.g. tethering my phone). Even if thousands of alerts load fine with a broadband connection, and the browser on my 16GiB laptop is fine handling that, it might be very inconvenient in a situation like the above, where the first thing I do might be to look at the AM landing page (perhaps even on the browser of my smallish mobile phone).
Although the links in an alert notification include filters that reduce the number of results shown, a mass outage could still affect many instances, which might greatly slow down page loads.
I personally think if you're on call you should probably stay somewhere that has access to fast, reliable internet, so I'm not proposing any solution to this at the moment, just documenting.
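To put rough numbers on the bandwidth concern, here is a minimal sketch that estimates how long a list of alerts takes to download as uncompressed JSON. The sample alert, its labels, and the 1 Mbit/s figure are all hypothetical; real payloads carry more fields, and HTTP compression would shrink them.

```python
import json


def estimate_transfer_seconds(alerts, bandwidth_bits_per_second):
    """Estimate the seconds needed to download `alerts` as uncompressed JSON."""
    payload_bytes = len(json.dumps(alerts).encode("utf-8"))
    return payload_bytes * 8 / bandwidth_bits_per_second


# Hypothetical alert, for illustration only (not taken from a real system).
sample_alert = {
    "labels": {
        "alertname": "InstanceDown",
        "instance": "host-0:9100",
        "severity": "page",
    },
    "annotations": {"summary": "Instance is down"},
    "startsAt": "2024-01-01T00:00:00Z",
}

if __name__ == "__main__":
    alerts = [sample_alert] * 10_000
    # Assumed link speed: 1 Mbit/s, e.g. a poor tethered connection.
    seconds = estimate_transfer_seconds(alerts, bandwidth_bits_per_second=1_000_000)
    print(f"{seconds:.1f} s at 1 Mbit/s (uncompressed)")
```

This ignores latency, protocol overhead, and browser rendering time, so treat it as a lower bound on what the user actually waits for.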
I think hitting an AM link with a filter already active is probably fine. If my own area throws that many alerts, I have enough trouble anyway and might really need bigger tubes.
However, if I just type http://alertmanager in my browser to get an overview of things (or simply use the convenient UI for creating a filter), I don't want to kill my network connection because thousands of alerts not relevant to my work will have to be loaded into my browser.
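For comparison, this is roughly what a pre-filtered request looks like when built by hand. It is a sketch that assumes the v2 API's `GET /api/v2/alerts` endpoint with its repeatable `filter` parameter and the `silenced`/`inhibited` flags; the `team="storage"` matcher and the `filtered_alerts_url` helper are made up for illustration.

```python
from urllib.parse import urlencode


def filtered_alerts_url(base_url, matchers, active_only=True):
    """Build an alerts URL restricted to the given label matchers.

    `matchers` is a list of strings such as 'team="storage"'; each one is
    sent as its own `filter` query parameter.
    """
    params = [("filter", matcher) for matcher in matchers]
    if active_only:
        # Assumption: silenced and inhibited alerts are not wanted.
        params += [("silenced", "false"), ("inhibited", "false")]
    return f"{base_url.rstrip('/')}/api/v2/alerts?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical matcher; substitute the labels your own rules attach.
    print(filtered_alerts_url("http://alertmanager", ['team="storage"']))
```

A filter like this only helps when the caller already knows which labels to match on, which is exactly what the unfiltered landing page doesn't give you.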
Have we thought about pagination?
Are there any plans to include some sort of pagination and/or limiting of the result set returned by the alerts/groups API? We want to build our own front-end to display alerts; however, our rules may generate tens of thousands of alerts, and fetching them through the API may take exceptionally long and may even time out client-side.
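To make the request concrete, here is one possible shape for limit/offset pagination, sketched over a plain in-memory list. This is not an existing Alertmanager API: the `paginate` function, its parameters, and the response fields are all hypothetical.

```python
def paginate(alerts, limit, offset=0):
    """Return one page of `alerts` plus what a client needs to fetch the next.

    Hypothetical response shape:
      alerts       the slice for this page
      total        number of alerts across all pages
      next_offset  offset of the next page, or None on the last page
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if offset < 0:
        raise ValueError("offset must not be negative")
    end = offset + limit
    return {
        "alerts": alerts[offset:end],
        "total": len(alerts),
        "next_offset": end if end < len(alerts) else None,
    }


if __name__ == "__main__":
    # Stand-in data; a real server would page over its current alert set.
    alerts = [{"labels": {"alertname": f"Alert{i}"}} for i in range(5)]
    offset = 0
    while offset is not None:
        page = paginate(alerts, limit=2, offset=offset)
        print(len(page["alerts"]), "of", page["total"])
        offset = page["next_offset"]
```

One caveat with offset-based paging is that the alert set changes between requests, so a client can see an alert twice or miss one. A cursor or token scheme avoids that at the cost of more server-side state.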
Given that the alertmanager is made for company-wide scale, it might make sense to have some kind of pagination or limit in the API. When things are bad you want to be able to reach the front page of AM.
Indeed, we've already had cases where the front page would take minutes to load, and even cause browsers to flag the tabs/windows as unresponsive.
My team at AWS has started working on this; we've received a lot of user feedback asking for pagination in the alert listing APIs. We will submit an API design proposal in the coming weeks.