Receive: compaction failure causes queries to return irrelevant time series

Open jnyi opened this issue 1 year ago • 1 comments

Hi Team,

We enabled out-of-order (OOO) ingestion a few months ago and have started to see odd behavior: when one or two pods have compaction failures, queries return irrelevant data. Any insights would be appreciated.

Screenshot 2024-06-24 at 1 19 18 PM

This caused false-positive alerts to fire in our setup with other baselines:

Screenshot 2024-06-24 at 1 21 08 PM

Thanos, Prometheus and Golang version used:

Thanos: v0.35.1
Go: go1.21.11

Object Storage Provider:

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Full logs to relevant components:

Anything else we need to know:

jnyi • Jun 24 '24 20:06

Where did the compaction failure happen? Do you mean the Compactor or the Receiver?

yeya24 • Jul 01 '24 05:07

It is the Receiver; see more discussion here: https://cloud-native.slack.com/archives/CK5RSSC10/p1719260674748149

jnyi • Jul 01 '24 17:07