thanos icon indicating copy to clipboard operation
thanos copied to clipboard

Query Stats Returned with query including query bytes fetched

Open christopherzli opened this issue 1 year ago • 5 comments
trafficstars

Is your proposal related to a problem?

Currently prometheus stats do not work

curl 'localhost:10902/api/v1/query?query=count(up)&stats=1'
{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1715842901.578,"172"]}],"stats":{"timings":{"evalTotalTime":0,"resultSortTime":0,"queryPreparationTime":0,"innerEvalTime":0,"execQueueTime":0,"execTotalTime":0},"samples":{"totalQueryableSamples":0,"peakSamples":0}},"analysis":{}}}

Furthermore, we want to include query bytes fetched as a way to measure how expensive a query is when we query.

Describe the solution you'd like

Currently we found

  type SeriesStatsCounter struct {
	lastSeriesHash uint64

	Series  int
	Chunks  int
	Samples int
}

However, we found the stats counter is only used as metric, and i have not found a way on how it is returned with query. We plan to

  1. add # bytes into this struct
  2. find a way to export this stat as part of response or HTTP header in querier and query frontend

Describe alternatives you've considered

still exploring down the path

Additional context

cc @moadz @bwplotka @douglascamata who previously worked on query performance metrics (Write your answer here.)

christopherzli avatar May 16 '24 07:05 christopherzli

Please see https://github.com/thanos-io/promql-engine/pull/325/files and others for a previous attempt at this. We either want to have an improved Select() interface or pass these hints through context.Context somehow.

GiedriusS avatar May 16 '24 07:05 GiedriusS

Hi @GiedriusS thanks so much for the pointer, do you know why it is not merged into upstream?

christopherzli avatar May 16 '24 08:05 christopherzli

There are still some missing things: https://github.com/thanos-io/promql-engine/pull/325/files#diff-5b6024dfabe0bc120c9efaad778c55c7283d643c0ad2882eb7e47effa7aac384R24. Also, that PR needs some tests.

There was also a suggestion to just pass Select() metadata through context.Context. Perhaps it could be a simple map[string]any underneath with some getters/setters. Maybe it would be simpler to implement compared to that PR?

GiedriusS avatar May 16 '24 14:05 GiedriusS

Yep, you could consider passing a function into ctx so no need to disrupt the query engine itself, reference about how M3 does that:

https://github.com/databricks/m3/blob/db_main/src/query/api/v1/handler/prom/read.go#L270-L278

and

https://github.com/databricks/m3/blob/db_main/src/query/api/v1/handler/prometheus/handleroptions/headers.go#L121-L123

jnyi avatar May 16 '24 18:05 jnyi

published a PR here: https://github.com/thanos-io/thanos/pull/7390

christopherzli avatar May 24 '24 19:05 christopherzli