Long-running SLOs and SLO versioning
From time to time we need to adjust the queries or labels of an SLO. As of now, changing the labels breaks the SLO calculation for the changed SLO.
I propose adding a label (?) slo_version to the rule calculations, and adding that label to the calculations of:
- slo:current_burn_rate:ratio
- slo:period_burn_rate:ratio
- slo:period_error_budget_remaining:ratio
- alerts
That would be enough to create different versions of an SLO that can be tracked independently.
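As a rough illustration of the proposal, a recording rule carrying the extra label might look like the sketch below. This is hand-written and not actual generator output: the group name, the slo:sli_error:ratio_rate5m input series, the selector, and the value "2" are all assumptions made for the example. Only the slo_version label name and the slo:current_burn_rate:ratio rule name come from the proposal itself.

```yaml
# Hypothetical Prometheus rule file; a sketch, not generated output.
groups:
  - name: slo-meta-recordings-http-availability   # assumed group name
    rules:
      - record: slo:current_burn_rate:ratio
        # Illustrative expression: current error ratio divided by the
        # error budget of a 99.5% objective (1 - 0.995 = 0.005).
        # The input series name and its slo_version selector are assumptions.
        expr: |
          slo:sli_error:ratio_rate5m{slo_version="2"} / 0.005
        labels:
          # Proposed label: each version produces its own series,
          # so older versions stay queryable after a change.
          slo_version: "2"
```

With a label like this, a change to the queries or labels would start a new series under a new slo_version value instead of altering the existing one.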
I worked around this limitation by simply having the revision in the SLO name.
For example:
slos:
  - name: http-availability-rev-1
    description: 99.5% of the requests should be successful for the user-facing API
    ...
becomes:
slos:
  - name: http-availability-rev-2
    description: 99.5% of the requests should be successful for the user-facing API
    ...
which results in a new SLO, while being able to track the previous one - the same behavior you'd get with having the slo_version label.
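For context, a fuller version of the revised spec could look roughly like the sketch below, assuming the prometheus/v1 spec layout. The service name, the metric and label names in the queries, and the alert name are hypothetical placeholders. The SLO name and description are taken from the example above, and the 99.5 objective is taken from the description.

```yaml
# Sketch of a revised SLO spec; everything marked "hypothetical" is assumed.
version: "prometheus/v1"
service: "user-api"                     # hypothetical service name
slos:
  - name: http-availability-rev-2       # revision bumped in the name
    objective: 99.5
    description: 99.5% of the requests should be successful for the user-facing API
    sli:
      events:
        # Hypothetical queries; a change in the labels used here is the
        # kind of adjustment that motivates bumping the revision.
        error_query: sum(rate(http_requests_total{job="user-api",code=~"5.."}[{{.window}}]))
        total_query: sum(rate(http_requests_total{job="user-api"}[{{.window}}]))
    alerting:
      name: UserAPIHTTPAvailabilityRev2 # hypothetical alert name
      page_alert:
        disable: true
      ticket_alert:
        disable: true
```

Since the revision is part of the SLO name, any dashboards or alert routes that match on that name would need to be updated with each bump.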
At the end of the day we had the same idea, but we introduced a Version field in the spec, something like this: a.txt
I believe this work should be done differently, with the introduction of a prometheus/v2 kind, and because of this we are not sure whether it should go upstream.