pyrra
0s burnrate generated
When creating a tight SLO on a short window, a 0s burn rate rule gets generated, which causes Prometheus to fail when loading the rules.
pyrra rules:
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: email-service-calendar
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  target: "99.999"
  window: "12h"
  indicator:
    ratio:
      errors:
        metric: http_responses_total{host="email-service-mylu2.okd.liberty.edu", path="/microsoft/calendar/", response=~"5.."}
      total:
        metric: http_responses_total{host="email-service-mylu2.okd.liberty.edu", path="/microsoft/calendar/"}
Prometheus logs:
ts=2024-01-12T13:55:51.589Z caller=manager.go:1049 level=error component="rule manager" msg="loading groups failed" err="/etc/prometheus/pyrra/prometheus-http.yaml: 23:11: group "email-service-calendar", rule 1, "http_responses:burnrate0s": could not parse expression: 1:119: parse error: duration must be greater than 0"
pyrra version: 7.2
Interesting. I never anticipated people actually want SLO windows this small. Usually at least a couple of days.
Are you sure you want an SLO in your case for the alerting?
Even if you want such a small window, you would also have to scrape your metrics much faster, for example every second instead of the usual 15s or more.
It would be great to learn more about your use case.
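To make the scrape-interval point concrete, here is a rough sketch (not Pyrra code) that checks whether a burn rate window is long enough for a Prometheus `rate()` to have data to work with; `rate()` needs at least two samples inside the range. The function names and the lower-bound estimate are illustrative assumptions.

```python
def samples_in_window(window_seconds: float, scrape_interval_seconds: float) -> int:
    """Lower bound on how many samples fall inside one range window.

    Hypothetical helper: a window of length W scraped every I seconds
    always contains at least floor(W / I) samples.
    """
    return int(window_seconds // scrape_interval_seconds)


def rate_is_computable(window_seconds: float, scrape_interval_seconds: float) -> bool:
    """True when the window is guaranteed to hold the two samples rate() needs."""
    return samples_in_window(window_seconds, scrape_interval_seconds) >= 2


if __name__ == "__main__":
    # A window of a few seconds with the usual 15s scrape interval has no guaranteed samples.
    print(rate_is_computable(5, 15))
    # The same window with a 1s scrape interval has enough samples.
    print(rate_is_computable(5, 1))
    # A 5m window with a 15s scrape interval is comfortably fine.
    print(rate_is_computable(300, 15))
```

With a window of only a few seconds, the check only passes once the scrape interval drops to roughly a second.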
> Interesting. I never anticipated people actually want SLO windows this small. Usually at least a couple of days.
> Are you sure you want an SLO in your case for the alerting?
We were using new metrics that had just started being generated as a PoC. Long term, 2w or more was fine, but we were using the short window to test pyrra.
In that case, whether it's 2w or 12h shouldn't matter. Let's make sure we at least have 1s. Whether that's more helpful is debatable; it's definitely less broken.
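A minimal sketch of how a 0s window can arise and how a 1s floor avoids it. It assumes, without confirmation from this thread, that the short burn rate window is scaled from a 5m-per-28d ratio and rounded to the nearest minute; the constant and function names are hypothetical and this is not Pyrra's actual code.

```python
from datetime import timedelta

# Assumption: the short window is 5m for a 28d SLO window, i.e. window / (28 * 24 * 12).
SHORT_WINDOW_DIVISOR = 28 * 24 * 12


def round_to(duration: timedelta, unit: timedelta) -> timedelta:
    """Round a non-negative duration to the nearest multiple of unit (halves round up)."""
    return ((duration + unit / 2) // unit) * unit


def short_burnrate_window(slo_window: timedelta,
                          minimum: timedelta = timedelta(0)) -> timedelta:
    """Scale the SLO window down, round to minutes, then apply an optional floor."""
    scaled = slo_window / SHORT_WINDOW_DIVISOR
    return max(round_to(scaled, timedelta(minutes=1)), minimum)


def as_prometheus_duration(duration: timedelta) -> str:
    """Format a whole-second duration the way it appears in a rule name, e.g. '0s'."""
    return f"{int(duration.total_seconds())}s"


if __name__ == "__main__":
    twelve_hours = timedelta(hours=12)
    # Without a floor, about 5.4s rounds down to zero minutes.
    print(as_prometheus_duration(short_burnrate_window(twelve_hours)))
    # With a 1s floor, the generated duration is always greater than 0.
    print(as_prometheus_duration(
        short_burnrate_window(twelve_hours, minimum=timedelta(seconds=1))))
```

Under these assumptions a 12h window scales to roughly 5.4s, which rounds to zero minutes and would produce the `burnrate0s` rule Prometheus rejects; the floor keeps the duration valid, though whether a 1s window is useful still depends on the scrape interval.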