ktnvaish22
Hi @rapphil, does Alertmanager retry only on 5xx errors? Is there any documentation on Alertmanager's retry mechanism?
Hey @grobinson-grafana!! After enabling debug logs, it took a few tries to reproduce the issue. The behaviour is quite inconsistent. One update: I have increased the Alertmanager replica count to 3, so I...
Thanks @grobinson-grafana for the helpful explanation of the root cause. Can we overcome this issue by configuring the evaluation interval (`interval`) in the alert rules as `5m` while setting `group_wait` in Alertmanager as...
Hey @grobinson-grafana!! I am analyzing the logs from a case where 2 pairs of duplicate notifications (fired and resolved) were sent. I have a few questions: 1. Does an Alertmanager instance create...
Hey @grobinson-grafana!! I went through a fresh set of logs (attached) to pinpoint the race condition. [Alertmanager_logs.zip](https://github.com/user-attachments/files/16962809/Alertmanager_logs.zip) I observed there's more than just a race between...
> Yes, missing "Received alerts" can cause issues, as alerts must be sent to all Alertmanagers in the cluster to make sure the state of each alert is consistent across...
Hey @grobinson-grafana!! We changed the VM Alert config and made sure that each Alertmanager replica receives alerts from upstream. We tried replicating the issue and captured the logs for analysis....
> I'm not sure. I thought it could be due to the issue mentioned in https://github.com/prometheus/alertmanager/pull/3419, but I don't see a received alert or reload just before the flush. Sure,...
Hey @grobinson-grafana! We have changed how we deploy our Alertmanagers. Instead of one deployment with multiple replicas, now we are using 3 separate deployments with 1 replica each. We have...
Hi there, just checking in to see if there are any updates, fixes or workarounds regarding this issue. :)