thanos
thanos copied to clipboard
Query Stats Returned with query including query bytes fetched
Is your proposal related to a problem?
Currently prometheus stats do not work
curl 'localhost:10902/api/v1/query?query=count(up)&stats=1'
{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1715842901.578,"172"]}],"stats":{"timings":{"evalTotalTime":0,"resultSortTime":0,"queryPreparationTime":0,"innerEvalTime":0,"execQueueTime":0,"execTotalTime":0},"samples":{"totalQueryableSamples":0,"peakSamples":0}},"analysis":{}}}
Furthermore, we want to include query bytes fetched as a way to measure how expensive a query is when we query.
Describe the solution you'd like
Currently we found
type SeriesStatsCounter struct {
lastSeriesHash uint64
Series int
Chunks int
Samples int
}
However, we found the stats counter is only used as metric, and i have not found a way on how it is returned with query. We plan to
- add # bytes into this struct
- find a way to export this stat as part of response or HTTP header in querier and query frontend
Describe alternatives you've considered
still exploring down the path
Additional context
cc @moadz @bwplotka @douglascamata who previously worked on query performance metrics (Write your answer here.)
Please see https://github.com/thanos-io/promql-engine/pull/325/files and others for a previous attempt at this. We either want to have an improved Select() interface or pass these hints through context.Context somehow.
Hi @GiedriusS thanks so much for the pointer, do you know why it is not merged into upstream?
There are still some missing things: https://github.com/thanos-io/promql-engine/pull/325/files#diff-5b6024dfabe0bc120c9efaad778c55c7283d643c0ad2882eb7e47effa7aac384R24. Also, that PR needs some tests.
There was also a suggestion to just pass Select() metadata through context.Context. Perhaps it could be a simple map[string]any underneath with some getters/setters. Maybe it would be simpler to implement compared to that PR?
Yep, you could consider passing a function into ctx so no need to disrupt the query engine itself, reference about how M3 does that:
https://github.com/databricks/m3/blob/db_main/src/query/api/v1/handler/prom/read.go#L270-L278
and
https://github.com/databricks/m3/blob/db_main/src/query/api/v1/handler/prometheus/handleroptions/headers.go#L121-L123
published a PR here: https://github.com/thanos-io/thanos/pull/7390