
quantile_over_time with multiple quantiles returns copied exemplars for all series

Open ruslan-mikhailov opened this issue 6 months ago • 3 comments

Describe the bug

In this loop: https://github.com/ruslan-mikhailov/tempo/blob/17e20a43af0881ef0841d2e33eaeb190c422e549/pkg/traceql/engine_metrics.go#L1477-L1480 each quantile receives its own copy of the full set of exemplars. This leads to bad UX. For example, each dot in the plot is an exemplar that is attached to every one of the quantile series.

Image

In particular, q50 has exemplars that are much higher than time series data:

Image

Because of this behaviour, the query also ignores the exemplars query parameter.
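To make the symptom concrete, here is a minimal sketch of the behaviour described above. It is not Tempo's code: the types `Exemplar` and `QuantileSeries` and the functions `AttachAllExemplars` and `CountAbove` are hypothetical names, and the durations are made-up values chosen only for illustration.

```go
package main

import "fmt"

// Exemplar and QuantileSeries are hypothetical, simplified types used for
// illustration only; they are not Tempo's real data structures.
type Exemplar struct {
	TraceID string
	Value   float64 // span duration in seconds
}

type QuantileSeries struct {
	Quantile  float64
	Value     float64 // computed quantile value in seconds
	Exemplars []Exemplar
}

// AttachAllExemplars mimics the reported behaviour: every quantile series
// receives its own copy of the full exemplar set.
func AttachAllExemplars(series []QuantileSeries, exemplars []Exemplar) []QuantileSeries {
	out := make([]QuantileSeries, len(series))
	for i, s := range series {
		s.Exemplars = append([]Exemplar(nil), exemplars...)
		out[i] = s
	}
	return out
}

// CountAbove reports how many exemplars exceed the series' quantile value.
func CountAbove(s QuantileSeries) int {
	n := 0
	for _, e := range s.Exemplars {
		if e.Value > s.Value {
			n++
		}
	}
	return n
}

func main() {
	// Made-up data: four exemplars and three quantile series.
	exemplars := []Exemplar{{"a", 0.2}, {"b", 0.9}, {"c", 1.8}, {"d", 3.5}}
	series := []QuantileSeries{
		{Quantile: 0.5, Value: 0.5},
		{Quantile: 0.9, Value: 2.0},
		{Quantile: 0.99, Value: 4.0},
	}
	for _, s := range AttachAllExemplars(series, exemplars) {
		fmt.Printf("q%v: %d exemplars, %d above the quantile value\n",
			s.Quantile, len(s.Exemplars), CountAbove(s))
	}
}
```

With this sample data every series ends up with all four exemplars, and the 0.5 series carries three exemplars that sit above its own value. The total number of returned exemplars also grows with the number of quantiles requested.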

To Reproduce

Steps to reproduce the behavior:

  1. Run Docker Compose from example/docker-compose/local
  2. Wait some time (5+ minutes) until k6 pushes enough traces
  3. Query {} | quantile_over_time(span:duration, 0.5, 0.9, 0.99)

Expected behavior

I would expect exemplar values from each series to be less than the quantile's value.

Environment

Can be reproduced in:

  • Infrastructure: docker compose
  • Deployment tool: [e.g., helm, jsonnet]

Additional Context

ruslan-mikhailov, May 26 '25 14:05

I think the ideal fix is to choose the exemplars from the internal bucket that the final quantile was located in, inside Log2Quantile. I.e. if the intermediate histogram buckets are le=0.5s, 1s, 2s, and the p95 was calculated to be 1.75s, then that series should use exemplars found in the le=2s bucket.
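A rough sketch of that idea, using the bucket bounds from the example above (le=0.5s, 1s, 2s). This is only an illustration under stated assumptions: `BucketIndex` and `ExemplarsForQuantile` are hypothetical helpers, the explicit bounds slice stands in for whatever intermediate histogram `Log2Quantile` actually works with, and the exemplar values are invented.

```go
package main

import (
	"fmt"
	"sort"
)

// Exemplar is a hypothetical, simplified type; not Tempo's real structure.
type Exemplar struct {
	TraceID string
	Value   float64 // span duration in seconds
}

// BucketIndex returns the index of the first upper bound that is >= v,
// or len(bounds) if v is above every bound. bounds must be sorted ascending.
func BucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

// ExemplarsForQuantile keeps only the exemplars that fall into the same
// bucket as the computed quantile value.
func ExemplarsForQuantile(bounds []float64, quantileValue float64, exemplars []Exemplar) []Exemplar {
	target := BucketIndex(bounds, quantileValue)
	var out []Exemplar
	for _, e := range exemplars {
		if BucketIndex(bounds, e.Value) == target {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	bounds := []float64{0.5, 1, 2} // le=0.5s, 1s, 2s
	exemplars := []Exemplar{{"a", 0.3}, {"b", 0.8}, {"c", 1.2}, {"d", 1.9}}

	// p95 calculated as 1.75s, so only exemplars from the le=2s bucket are kept.
	for _, e := range ExemplarsForQuantile(bounds, 1.75, exemplars) {
		fmt.Println(e.TraceID, e.Value)
	}
}
```

Note that an exemplar picked this way can still be slightly above the quantile value (1.9s against a p95 of 1.75s here), since it only has to share the bucket. Whether that is acceptable, or whether a stricter filter is wanted, is a design choice left open in this sketch.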

mdisibio, May 29 '25 11:05

I took a look at this last week. I started a branch but have not opened a PR yet. I was struggling to locate the exemplars that belong to the correct quantile, but after thinking about it over the weekend I'd like to give it another go.

zalegrala, Jun 11 '25 14:06

@ruslan-mikhailov feel free to reprioritize this as needed.

alexbikfalvi, Jun 11 '25 15:06

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity. Please apply keepalive label to exempt this Issue.

github-actions[bot], Aug 13 '25 00:08