thanos
Querier can deduplicate Select calls
Is your proposal related to a problem?
In a complex PromQL query, a metric might be used multiple times in an expression. For example, https://github.com/kubernetes-monitoring/kubernetes-mixin/blob/master/alerts/resource_alerts.libsonnet#L43-L62.
sum(namespace_memory:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s})
/
sum(kube_node_status_allocatable{resource="memory"})
>
((count(kube_node_status_allocatable{resource="memory"}) > 1) - 1)
/
count(kube_node_status_allocatable{resource="memory"})
For this query, Thanos Query will perform the {__name__="namespace_memory:kube_pod_container_resource_requests:sum"} select once and the {__name__="kube_node_status_allocatable", resource="memory"} select three times.
The repeated selects are completely unnecessary. It would be good to detect this case and perform the select for {__name__="kube_node_status_allocatable", resource="memory"} only once.
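To make the duplication concrete, here is a minimal sketch (my own illustration, not Thanos code) that uses the Prometheus PromQL parser to count how often each vector selector appears in the expression:

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/promql/parser"
)

func main() {
	q := `sum(kube_node_status_allocatable{resource="memory"}) >
		((count(kube_node_status_allocatable{resource="memory"}) > 1) - 1) /
		count(kube_node_status_allocatable{resource="memory"})`

	expr, err := parser.ParseExpr(q)
	if err != nil {
		panic(err)
	}

	// Walk the expression tree and count identical vector selectors.
	counts := map[string]int{}
	parser.Inspect(expr, func(node parser.Node, _ []parser.Node) error {
		if vs, ok := node.(*parser.VectorSelector); ok {
			counts[vs.String()]++
		}
		return nil
	})

	for sel, n := range counts {
		fmt.Printf("%dx %s\n", n, sel)
	}
}
```

A deduplicating Querier could use exactly this kind of information to issue a single Select per distinct selector.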
Another common pattern is to calculate the error rate. For example, one query would be:
some_metrics{type="error"} / some_metrics
In this case, Select will be executed twice, but this is unnecessary since {__name__="some_metrics"} already covers the series matched by some_metrics{type="error"}.
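A rough sketch of how the broad result could be reused (the series slice stands in for whatever the first Select returned; this is just an illustration, not the Thanos storage API):

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/labels"
)

func main() {
	// Pretend this is what the broad select {__name__="some_metrics"} returned.
	series := []labels.Labels{
		labels.FromStrings("__name__", "some_metrics", "type", "error"),
		labels.FromStrings("__name__", "some_metrics", "type", "success"),
	}

	// The narrower selector some_metrics{type="error"} can be answered by
	// filtering that result locally instead of issuing a second Select.
	narrow := []*labels.Matcher{
		labels.MustNewMatcher(labels.MatchEqual, "__name__", "some_metrics"),
		labels.MustNewMatcher(labels.MatchEqual, "type", "error"),
	}

	for _, s := range series {
		matches := true
		for _, m := range narrow {
			if !m.Matches(s.Get(m.Name)) {
				matches = false
				break
			}
		}
		if matches {
			fmt.Println("reused series:", s)
		}
	}
}
```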
As a last case, a PromQL query might look like:
some_metrics{type="error"} / some_metrics{type="success"}
It is also possible to rewrite the series query as {__name__="some_metrics", type=~"error|success"} so that only one Select call is needed.
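A minimal sketch of that rewrite (mergeSelectors is a hypothetical helper I made up for illustration, not an existing Thanos or Prometheus function):

```go
package main

import (
	"fmt"
	"strings"

	"github.com/prometheus/prometheus/model/labels"
)

// mergeSelectors combines selectors on the same metric that differ only in one
// equality matcher into a single regex matcher, so both sides of the division
// can be served by a single Select call.
func mergeSelectors(metric, label string, values ...string) []*labels.Matcher {
	return []*labels.Matcher{
		labels.MustNewMatcher(labels.MatchEqual, "__name__", metric),
		labels.MustNewMatcher(labels.MatchRegexp, label, strings.Join(values, "|")),
	}
}

func main() {
	// Produces matchers equivalent to {__name__="some_metrics", type=~"error|success"}.
	for _, m := range mergeSelectors("some_metrics", "type", "error", "success") {
		fmt.Println(m)
	}
}
```

The Querier would then split the merged result back into the two operands during evaluation.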
Describe the solution you'd like
I don't have a solution for this yet. Thanos depends on the Prometheus query engine, so it is not easy to do something in the Thanos sidecar for this case. Prometheus has the same kind of problem, but it is not critical there because it only queries the local TSDB.
Describe alternatives you've considered
Deal with this case in the query frontend? This might be a good idea and easier to change, but the query frontend would need to be aware of PromQL.
Additional context
Why is this important for us?
We use a central Querier with multiple sidecars in edge clusters. Each Select call fans out to all edge clusters, and network latency is a real concern. It would be good to reduce unnecessary calls; I think performance would be much better.
It can work the other way too: if someone is asking for a metric foo{}, then we can reuse the results of that request for all queries that use the same metric plus any kind of matchers, e.g. foo{bar="baz"}.
And this is very important not just for performance but also for reducing the operating costs of Thanos, because one query might potentially lead to hundreds of requests.
@GiedriusS I am trying to add a cache to each Querier instance. But the problem is that Select calls now run concurrently, so we cannot wait for one Select's results to be cached and then reused.
During community hours we talked about this, and indeed there is an opportunity to either:
- Cache
- Singleflight (easier) between Series calls (even with different matchers)
It is also worth mentioning that query pushdown and singleflight on lower-level nodes (e.g. Store Gateway, sidecar) are viable options as well.
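For the singleflight option, a minimal sketch using golang.org/x/sync/singleflight (selectSeries is a stand-in for the real fan-out Select, and keying by the selector string is just my assumption):

```go
package main

import (
	"fmt"
	"sync"

	"golang.org/x/sync/singleflight"
)

// selectSeries stands in for the expensive fan-out Select against all stores.
func selectSeries(selector string) (interface{}, error) {
	fmt.Println("fan-out select for", selector)
	return []string{"series-a", "series-b"}, nil
}

func main() {
	var (
		g  singleflight.Group
		wg sync.WaitGroup
	)

	// Three concurrent evaluations of the same selector collapse into a single
	// underlying Select; the other callers share the result.
	sel := `{__name__="kube_node_status_allocatable", resource="memory"}`
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			v, err, shared := g.Do(sel, func() (interface{}, error) {
				return selectSeries(sel)
			})
			fmt.Println(v, err, "shared:", shared)
		}()
	}
	wg.Wait()
}
```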
NOTE: What about singleflight at the Store API gRPC level?
(histogram_quantile(0.99, sum by(job, le) (rate(http_request_duration_seconds_bucket{handler="query",job=~".*thanos-query.*"}[5m]))) > 40 and sum by(job) (rate(http_request_duration_seconds_bucket{handler="query",job=~".*thanos-query.*"}[5m])) > 0)
Another example of a duplicated select query.
https://github.com/thanos-io/thanos/pull/4290 has been running in prod here for the past couple of months, but I suppose it is worth investing time in looking into whether it is possible to push this logic down to the gRPC level.
Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.
The new promql-engine does exactly this, so I'm closing the ticket.