Some queries not working with Thanos hosted data

Open poochwashere opened this issue 3 years ago • 2 comments

Thank you for developing this really useful project.

Our metrics migrate to Thanos for long-term storage after living in Prometheus for 24 hours. For some reason the queries for the remaining error budget are not calculated properly and return NaN. When I change the data source from Thanos to Prometheus, things work as expected.

I can get values from some of the Prometheus queries when I run them individually, but when I run the entire expression it chokes.

These work ad hoc against the Thanos data source and return the expected values:

slo:sli_error:ratio_rate1h{sloth_service="mfplaid-api",sloth_slo="requests-availability"}

slo:error_budget:ratio{sloth_service="mfplaid-api",sloth_slo="requests-availability"} *on() group_left() (24 * days_in_month())

But when I execute the entire expression it returns NaN.

1-(
  sum_over_time(
    (
       slo:sli_error:ratio_rate1h{sloth_service="mfplaid-api",sloth_slo="requests-availability"}
       * on() group_left() (
         month() == bool vector(8)
       )
    )[32d:1h]
  )
  / on(sloth_id)
  (
    slo:error_budget:ratio{sloth_service="mfplaid-api",sloth_slo="requests-availability"} *on() group_left() (24 * days_in_month())
  )
)
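
For readers following along, the arithmetic that the expression above performs can be sketched in Python. This is only an illustration under stated assumptions, not a diagnosis of the Thanos behaviour: the function name `remaining_error_budget`, the 99.9% objective, and the sample error ratios are all hypothetical. It shows that a single NaN sample inside the summed window is enough to turn the whole result into NaN, which is one thing worth checking when the individual queries look fine.

```python
import math


def remaining_error_budget(hourly_error_ratios, budget_ratio, days_in_month):
    """Mirror the expression: 1 - sum(hourly error ratios) / (budget * 24 * days)."""
    consumed = sum(hourly_error_ratios)           # plays the role of sum_over_time(...[32d:1h])
    allowed = budget_ratio * 24 * days_in_month   # error budget scaled to the hours in the month
    return 1 - consumed / allowed


# Hypothetical data: a 31-day month, a 99.9% objective (budget ratio 0.001),
# and a constant 0.0005 error ratio in every hourly sample.
hours = 24 * 31
healthy = remaining_error_budget([0.0005] * hours, 0.001, 31)

# The same series with one NaN sample somewhere in the window.
samples_with_nan = [0.0005] * (hours - 1) + [float("nan")]
with_nan = remaining_error_budget(samples_with_nan, 0.001, 31)

print(healthy)               # roughly half of the budget is left
print(math.isnan(with_nan))  # NaN propagates through the sum and the division
```

If the subquery returns any NaN sample from one data source but not the other, the final value would differ in exactly this way, so graphing the inner `slo:sli_error:ratio_rate1h` series over the full 32d window on both data sources may help narrow things down.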

Any clues that may help me?

Thanks Again!

poochwashere, Aug 04 '22 00:08

Hello @poochwashere! Have you been able to solve this problem?

neitrinoweb, May 11 '23 09:05

How are you defining the PrometheusServiceLevel? Do you have an example? Thanks

alexvaque, Feb 28 '24 21:02