
Querier can deduplicate select call

Open yeya24 opened this issue 4 years ago • 7 comments

Is your proposal related to a problem?

In a complex promql query, a metric might be used multiple times in an expr. For example, https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/alerts/resource_alerts.libsonnet#L43-L62.

              sum(namespace_memory:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s})
                /
              sum(kube_node_status_allocatable{resource="memory"})
                >
              ((count(kube_node_status_allocatable{resource="memory"}) > 1) - 1)
                /
              count(kube_node_status_allocatable{resource="memory"})

For this query, Thanos Query will perform one Select for {__name__="namespace_memory:kube_pod_container_resource_requests:sum"} and three Selects for {__name__="kube_node_status_allocatable", resource="memory"}.

These repeated Selects are completely unnecessary. It would be good to detect this case and perform the Select for {__name__="kube_node_status_allocatable", resource="memory"} only once.

Another common pattern is calculating an error rate. For example, one query would be:

some_metrics{type="error"} / some_metrics

In this case, Select is executed twice, but this is unnecessary because {__name__="some_metrics"} already covers some_metrics{type="error"}.

As a last case, a PromQL query might look like:

some_metrics{type="error"} / some_metrics{type="success"}

It is also possible to rewrite the series query as {__name__="some_metrics", type=~"error|success"} so that only one Select call is needed.
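
A minimal sketch of that idea, assuming the querier issues one shared Select with the merged matchers and each side of the expression then re-filters the shared result; the helper names below are hypothetical and this is not existing Thanos code:

    package main

    import (
        "fmt"

        "github.com/prometheus/prometheus/model/labels"
    )

    // mergedMatchers covers both some_metrics{type="error"} and
    // some_metrics{type="success"} with a single Select.
    func mergedMatchers() []*labels.Matcher {
        return []*labels.Matcher{
            labels.MustNewMatcher(labels.MatchEqual, labels.MetricName, "some_metrics"),
            labels.MustNewMatcher(labels.MatchRegexp, "type", "error|success"),
        }
    }

    // filterSeries keeps only the series from the shared result that satisfy
    // the extra matchers of one leaf expression (e.g. type="error").
    func filterSeries(series []labels.Labels, extra ...*labels.Matcher) []labels.Labels {
        var out []labels.Labels
        for _, lset := range series {
            keep := true
            for _, m := range extra {
                if !m.Matches(lset.Get(m.Name)) {
                    keep = false
                    break
                }
            }
            if keep {
                out = append(out, lset)
            }
        }
        return out
    }

    func main() {
        _ = mergedMatchers() // matchers for the single shared Select
        shared := []labels.Labels{
            labels.FromStrings(labels.MetricName, "some_metrics", "type", "error"),
            labels.FromStrings(labels.MetricName, "some_metrics", "type", "success"),
        }
        errOnly := labels.MustNewMatcher(labels.MatchEqual, "type", "error")
        fmt.Println(filterSeries(shared, errOnly)) // only the type="error" series remain
    }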

Describe the solution you'd like

I don't have a solution for this yet. Thanos depends on the Prometheus query engine, so it is not easy to do something on the Thanos side here. Prometheus has the same problem, but it is less critical there because it only queries the local TSDB.

Describe alternatives you've considered

Handle this case in Query Frontend? This might be a good idea and an easier change, but Query Frontend would need to be aware of PromQL.
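
Spotting the duplication itself is mostly an AST walk. A minimal sketch using the Prometheus PromQL parser, only to illustrate what a PromQL-aware Query Frontend or Querier would have to do; this is not an existing Thanos feature:

    package main

    import (
        "fmt"

        "github.com/prometheus/prometheus/promql/parser"
    )

    // countSelectors walks the query AST and counts identical vector selectors,
    // which is what a PromQL-aware component would need to do before
    // deduplicating Select calls.
    func countSelectors(query string) (map[string]int, error) {
        expr, err := parser.ParseExpr(query)
        if err != nil {
            return nil, err
        }
        counts := map[string]int{}
        parser.Inspect(expr, func(node parser.Node, _ []parser.Node) error {
            if vs, ok := node.(*parser.VectorSelector); ok {
                counts[vs.String()]++
            }
            return nil
        })
        return counts, nil
    }

    func main() {
        counts, err := countSelectors(
            `sum(kube_node_status_allocatable{resource="memory"}) /
             count(kube_node_status_allocatable{resource="memory"})`)
        if err != nil {
            panic(err)
        }
        fmt.Println(counts) // the same selector appears twice
    }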

Additional context

Why is this important for us?

We use a central Querier with multiple Sidecars in edge clusters. Each Select call fans out to all edge clusters, and network latency is a real concern. It would be good to reduce unnecessary calls, and I think the performance would be much better.

yeya24 avatar Jul 03 '21 23:07 yeya24

It can work the other way too: if someone is asking for a metric foo{}, then we can reuse the results of that request for all queries that use the same metric plus any kind of matchers, e.g. foo{bar="baz"}.

And this is very important not just for performance but also for reducing the operating costs of Thanos, because one query might potentially lead to hundreds of requests.
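
A minimal sketch of that reuse check, assuming a cached Select can be reused whenever the new request only adds matchers on top of the cached ones; the covers helper is hypothetical and not Thanos code:

    package main

    import (
        "fmt"

        "github.com/prometheus/prometheus/model/labels"
    )

    // covers reports whether a cached Select performed with cachedMatchers already
    // contains every series that reqMatchers would return. Simple rule used here:
    // reuse only when every cached matcher also appears verbatim in the request,
    // i.e. the request merely adds restrictions (foo{} covers foo{bar="baz"}).
    func covers(cachedMatchers, reqMatchers []*labels.Matcher) bool {
        for _, cm := range cachedMatchers {
            found := false
            for _, rm := range reqMatchers {
                if cm.Type == rm.Type && cm.Name == rm.Name && cm.Value == rm.Value {
                    found = true
                    break
                }
            }
            if !found {
                return false
            }
        }
        return true
    }

    func main() {
        cached := []*labels.Matcher{
            labels.MustNewMatcher(labels.MatchEqual, labels.MetricName, "foo"),
        }
        req := []*labels.Matcher{
            labels.MustNewMatcher(labels.MatchEqual, labels.MetricName, "foo"),
            labels.MustNewMatcher(labels.MatchEqual, "bar", "baz"),
        }
        // The cached foo{} result is a superset and can be reused, then filtered
        // down to the series that also match bar="baz".
        fmt.Println(covers(cached, req)) // true
    }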

GiedriusS avatar Jul 04 '21 08:07 GiedriusS

@GiedriusS I am trying to add a cache to each Querier instance. But the problem is that the Select calls run concurrently, so we cannot wait for the Select results to be cached and then reused.

yeya24 avatar Jul 04 '21 18:07 yeya24

During community hours we talked about this, and indeed there is an opportunity to either:

  • Cache
  • Singleflight (easier) between Series requests (even with different matchers)

It is also worth mentioning that query pushdown and singleflight on lower nodes (e.g. Store Gateway, Sidecar) is also a viable option.

NOTE: What about singleflight at the Store API gRPC level?
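
A minimal sketch of the singleflight option, assuming Select requests are keyed by their time range and matchers so identical concurrent requests collapse into one upstream call; the fetch function is a hypothetical stand-in for the real Store API fan-out, not Thanos code:

    package main

    import (
        "fmt"
        "sort"
        "strings"
        "sync"
        "sync/atomic"
        "time"

        "golang.org/x/sync/singleflight"
    )

    // selectKey builds a canonical key from the time range and matchers so that
    // identical concurrent Selects collapse into a single upstream call.
    func selectKey(mint, maxt int64, matchers []string) string {
        ms := append([]string(nil), matchers...)
        sort.Strings(ms)
        return fmt.Sprintf("%d:%d:%s", mint, maxt, strings.Join(ms, ","))
    }

    // dedupSelector wraps a fetch function (a stand-in for the real Store API
    // fan-out) with singleflight-based request collapsing.
    type dedupSelector struct {
        group singleflight.Group
        fetch func(mint, maxt int64, matchers []string) ([]string, error)
    }

    func (d *dedupSelector) Select(mint, maxt int64, matchers []string) ([]string, error) {
        v, err, _ := d.group.Do(selectKey(mint, maxt, matchers), func() (interface{}, error) {
            return d.fetch(mint, maxt, matchers)
        })
        if err != nil {
            return nil, err
        }
        return v.([]string), nil
    }

    func main() {
        var calls int32
        d := &dedupSelector{fetch: func(mint, maxt int64, matchers []string) ([]string, error) {
            atomic.AddInt32(&calls, 1)
            time.Sleep(50 * time.Millisecond) // simulate remote fan-out latency
            return []string{`kube_node_status_allocatable{resource="memory"}`}, nil
        }}

        var wg sync.WaitGroup
        for i := 0; i < 3; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                _, _ = d.Select(0, 1000, []string{`__name__="kube_node_status_allocatable"`, `resource="memory"`})
            }()
        }
        wg.Wait()
        // Typically prints 1: the three concurrent, identical Selects share one fetch.
        fmt.Println("upstream calls:", atomic.LoadInt32(&calls))
    }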

bwplotka avatar Jul 29 '21 17:07 bwplotka

(histogram_quantile(0.99, sum by(job, le) (rate(http_request_duration_seconds_bucket{handler="query",job=~".*thanos-query.*"}[5m]))) > 40 and sum by(job) (rate(http_request_duration_seconds_bucket{handler="query",job=~".*thanos-query.*"}[5m])) > 0)

Another example of a duplicated Select query.

yeya24 avatar Aug 31 '21 21:08 yeya24

https://github.com/thanos-io/thanos/pull/4290 has been running in prod here for the past couple of months, but I suppose it is worth investing time into whether it is possible to push this logic down to the gRPC level.

GiedriusS avatar Jan 03 '22 09:01 GiedriusS

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

stale[bot] avatar Apr 17 '22 05:04 stale[bot]

The new promql-engine does exactly this, so I'm closing the ticket.

GiedriusS avatar Jan 23 '24 10:01 GiedriusS