Anand Rajagopal
### What does this PR do? This PR is a fix for #3682. In some instances, `mem.Alerts.Subscribe()` and `store.gc()` can deadlock. 1. Lock acquisition in `store.Alerts.gc()`: - The method...
Ruler ha
**What this PR does**: **Which issue(s) this PR fixes**: Fixes # **Checklist** - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated - the order...
**What this PR does**: Proposal to introduce SyncNamespace API to enable on-demand synchronization of rule group namespace **Which issue(s) this PR fixes**: N/A **Checklist** - [ ] Tests updated -...
### Proposal Currently once a Prometheus instance loads a rule group, it evaluates it continuously. If the instance evaluating the rule group becomes unavailable, there is a high chance for...
This PR is for https://github.com/prometheus/prometheus/issues/13630
**Which issue(s) this PR fixes**: Fixes #5989 **Checklist** - [x] Tests updated - [ ] Documentation added - [x] `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`,...
**What this PR does**: **Which issue(s) this PR fixes**: Proposal for Ruler HA. I can add more details as necessary. **Checklist** - [ ] Tests updated - [ ] Documentation...
**What this PR does**: Currently once a Ruler instance loads a rule group, it evaluates it continuously. If the instance evaluating the rule group becomes unavailable, there is a high...
Currently, silence [metric collection](https://github.com/prometheus/alertmanager/blob/main/silence/silence.go#L234-L249) happens at scrape time. In scenarios where AlertManager is under heavy load, lock contention can occur and cause high scrape latency. One such scenario is...