pyrra
Gauge metrics support by Pyrra
At the moment, Pyrra supports only counter metrics. It would be great to have support for gauge metrics as well. For example, the blackbox exporter's "probe_success" metric returns 1 (success) or 0 (failure). Is it possible to add formulas to Pyrra to create an SLO graph based on such data? Perhaps this can be achieved using the "count_over_time" function for the total number of attempts, and "sum_over_time" for the number of successful ones?
Thank you)
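The idea in the question can be sketched in PromQL. For a gauge that is always 0 or 1, sum_over_time counts the successful samples in the window and count_over_time counts all samples, so their ratio is the fraction of successful probes. The job="probe" selector and the 5m window are illustrative assumptions, and this is not necessarily how Pyrra implements it.

```promql
# Fraction of successful probes over the last 5 minutes, one result per series.
  sum_over_time(probe_success{job="probe"}[5m])
/
  count_over_time(probe_success{job="probe"}[5m])

# Alternative expression for the same ratio: the average of a 0/1 gauge
# over the window is its success fraction.
avg_over_time(probe_success{job="probe"}[5m])
```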
Generally speaking we could and should totally support these time based SLOs as well!
I'm not 100% sure what the alerting with multi error burn rates would look like. If we can figure out the PromQL behind these then I'd be totally down for adding this to Pyrra for sure!
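One possible shape for such an alert expression, as a rough sketch only: it assumes a 99.9% objective (an error budget of 0.001) over a 30-day window and borrows the 14.4x burn-rate factor for the 1h/5m window pair from the Google SRE Workbook. The job="probe" selector is hypothetical, and the expressions Pyrra actually generates may differ.

```promql
# Fast-burn condition: the error ratio exceeds 14.4x the error budget
# in both the long (1h) and the short (5m) window.
# Assumed: objective 99.9% => error budget 0.001.
  (1 - avg_over_time(probe_success{job="probe"}[1h])) > (14.4 * 0.001)
and
  (1 - avg_over_time(probe_success{job="probe"}[5m])) > (14.4 * 0.001)
```

The short window makes the alert stop firing soon after the probe recovers, while the long window filters out brief blips.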
I agree with @rkostyantyn. It would be very useful to support gauge metrics. I'm not able to track our metrics generated by the blackbox exporter either.
I've tried implementing an SLO (not using Pyrra) with blackbox-exporter using sum_over_time(probe_success{job="probe"}[5m]) / count_over_time(up{job="probe"}[5m]), but the experience still wasn't great (for my case at least).
My case:
I have multiple VMs for which I need to guarantee network connectivity. One blackbox-exporter is installed per VM, probing a single endpoint.
sum_over_time(probe_success{job="probe"}[5m]) / count_over_time(up{job="probe"}[5m]) looks good for a single VM, but when rolling out to multiple VMs it doesn't work anymore. The reason is that sum_over_time and count_over_time don't aggregate across series, so this query returns multiple time series.
Then I tried using sum(probe_success{job="probe"} == 1) / count(up{job="probe"}) and now the query does look good on a graph, but it is tricky to build the recording rules to get Error Budget burns with this one 🤔
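One way to keep the windowed ratio while still getting a single series across all VMs is to aggregate the numerator and the denominator separately before dividing. The sketch below is an assumption about how this could be written, not the approach Pyrra ended up with; it reuses the job="probe" selector from the queries above and counts samples of probe_success itself in the denominator.

```promql
# Availability across all probed VMs over 5 minutes: sum the per-series
# success counts and sample counts first, then divide once.
  sum(sum_over_time(probe_success{job="probe"}[5m]))
/
  sum(count_over_time(probe_success{job="probe"}[5m]))

# Error ratio for the same window, which is the quantity a burn-rate
# rule would compare against the error budget.
1 - (
    sum(sum_over_time(probe_success{job="probe"}[5m]))
  /
    sum(count_over_time(probe_success{job="probe"}[5m]))
)
```

Because every sample is weighted equally, a VM that is scraped more often contributes more to the ratio. The same expression with a longer range (for example 1h or 6h) could serve as the basis for recording rules at each burn-rate window.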
Implemented and closed by @roidelapluie in #598
@metalmatze, will this be included in the next release? v0.5.6 is missing this support.