
Add alerting rules and post alerts as GitHub comments on being triggered

Open sipian opened this issue 7 years ago • 3 comments

Since AlertManager doesn't natively support a GitHub receiver, one needs to be built on top of AlertManager's webhook receiver.

alertmanager-github-receiver creates a GitHub issue from alerts using the AlertManager webhook receiver. Maybe this tool can be modified to support GitHub comments as well.
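A minimal sketch of the comment-posting idea, not the way alertmanager-github-receiver actually works: it turns an AlertManager webhook payload into a Markdown comment and builds (without sending) a request to the GitHub REST endpoint for issue comments. The function names `format_comment` and `build_comment_request`, and the use of the `summary`/`description` annotations, are assumptions made for illustration.

```python
import json
import urllib.request


def format_comment(payload: dict) -> str:
    """Render an AlertManager webhook payload as a Markdown comment body."""
    lines = [f"**AlertManager: {payload.get('status', 'unknown')}**", ""]
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "(unnamed alert)")
        # Assumption: rules set a "summary" or "description" annotation.
        summary = annotations.get("summary") or annotations.get("description") or ""
        status = alert.get("status", "unknown")
        lines.append(f"- `{name}` [{status}] {summary}".rstrip())
    return "\n".join(lines)


def build_comment_request(owner, repo, issue_number, token, payload):
    """Build (but do not send) a POST request that adds an issue/PR comment."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/issues/{issue_number}/comments"
    )
    body = json.dumps({"body": format_comment(payload)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )
```

A webhook handler would call `build_comment_request(...)` with the decoded JSON body it receives from AlertManager and pass the result to `urllib.request.urlopen`. Pull requests share the issue comment endpoint, so the same request works for commenting on a PR.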

sipian avatar Jul 04 '18 17:07 sipian

That is very useful and it looks like exactly what we need. Thanks, will look into that!

krasi-georgiev avatar Jul 04 '18 22:07 krasi-georgiev

It makes sense to terminate benchmarking midway rather than wait 6 hours on a buggy release/PR. Benchmarking could be declared failed in scenarios such as the following:

  • Prometheus's resource usage crosses a certain threshold (similar to adding alerting rules for Prometheus servers).
  • The current release is compared against a known good Prometheus release, and the difference in metrics between the two is beyond a certain threshold. For example:
    • (no. of failed queries) - (original no. of failed queries) > threshold
    • (avg. memory usage) - (original avg. memory usage) > threshold
    • (used disk space) - (original used disk space) > threshold
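The comparison against a known good release can be sketched as below. This is only an illustration of the three differences listed above; the function name `exceeded_thresholds`, the metric keys, and the numbers in the usage example are hypothetical, and in practice the values would come from queries against the two Prometheus instances.

```python
def exceeded_thresholds(current, baseline, thresholds):
    """Return the metrics where (current - baseline) is above its threshold.

    All three arguments are dicts keyed by metric name. Only metrics that
    have a threshold are checked.
    """
    failures = []
    for metric, limit in thresholds.items():
        diff = current[metric] - baseline[metric]
        if diff > limit:
            failures.append(metric)
    return failures


# Hypothetical values for illustration only.
baseline = {"failed_queries": 2, "avg_memory_bytes": 4.0e9, "disk_bytes": 10.0e9}
current = {"failed_queries": 40, "avg_memory_bytes": 4.2e9, "disk_bytes": 25.0e9}
thresholds = {"failed_queries": 10, "avg_memory_bytes": 1.0e9, "disk_bytes": 5.0e9}

failed = exceeded_thresholds(current, baseline, thresholds)
if failed:
    print("benchmark failed on:", ", ".join(failed))
```

A non-empty result would be the signal to stop the benchmark run early instead of letting it continue for the full duration.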

krasi-georgiev avatar Jul 25 '18 14:07 krasi-georgiev

A bug that we should have an alerting rule for:

https://github.com/prometheus/prometheus/issues/3943

krasi-georgiev avatar Aug 14 '18 09:08 krasi-georgiev