
Provide detail on why the compactor failed to download a block

Open peter-edb opened this issue 9 months ago • 0 comments

Is your proposal related to a problem?

When the compactor fails to download a block, compact.go logs the following error:

{"caller":"compact.go:570","err":"compaction: group 0@10000000000000000: download block 123456789ABCDEFG: context canceled","level":"error","msg":"retriable error","ts":"2025-06-06T03:06:06.14541332Z"}

which doesn't give you much to go on: "context canceled" says nothing about what caused the download to be canceled.

Describe the solution you'd like

Include the underlying reason the block download failed in the error message. In this instance, the compactor ran out of disk space while downloading the blocks needed to perform the compaction.

Describe alternatives you've considered

Provision the compactor with a larger disk. The right size depends on the environment, and even with careful sizing, a spike in metrics could increase the size of the blocks that need to be compacted beyond what was planned for.

Additional context

This happens in version 0.38.0, specifically at this line: https://github.com/thanos-io/thanos/blob/v0.38.0/cmd/thanos/compact.go#L570

Here are the log lines:

{"caller":"compact.go:1162","group":"0@{prometheus=\"monitoring/kube-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-0\"}","groupKey":"0@10505255943550810638","level":"info","msg":"compaction available and planned","plan":"[01JW0CKEECA8EK0FBCKYGDYMX1 (min time: 1747872000068, max time: 1748044800000) 01JW5HCK1FJ791VX78SP9YKMYM (min time: 1748044800081, max time: 1748217600000) 01JWAP6KG305Q95EKQ60KWMQWB (min time: 1748217600027, max time: 1748390400000) 01JWFTZS0226PQ3HN6QZ37X42W (min time: 1748390400111, max time: 1748563200000) 01JWMZS5QYF25ADQDCYNPHRVXK (min time: 1748563200024, max time: 1748736000000) 01JWT4JTWHSD68AM6BRS525ZFV (min time: 1748736000001, max time: 1748908800000) 01JWZ9CAPAFT6BC4T3PA0KSS7V (min time: 1748908800027, max time: 1749081600000)]","ts":"2025-06-06T02:59:28.013281259Z"}
{"caller":"compact.go:1171","duration":"27.12µs","duration_ms":0,"group":"0@{prometheus=\"monitoring/kube-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-0\"}","groupKey":"0@10505255943550810638","level":"info","msg":"finished running pre compaction callback; downloading blocks","plan":"[01JW0CKEECA8EK0FBCKYGDYMX1 (min time: 1747872000068, max time: 1748044800000) 01JW5HCK1FJ791VX78SP9YKMYM (min time: 1748044800081, max time: 1748217600000) 01JWAP6KG305Q95EKQ60KWMQWB (min time: 1748217600027, max time: 1748390400000) 01JWFTZS0226PQ3HN6QZ37X42W (min time: 1748390400111, max time: 1748563200000) 01JWMZS5QYF25ADQDCYNPHRVXK (min time: 1748563200024, max time: 1748736000000) 01JWT4JTWHSD68AM6BRS525ZFV (min time: 1748736000001, max time: 1748908800000) 01JWZ9CAPAFT6BC4T3PA0KSS7V (min time: 1748908800027, max time: 1749081600000)]","ts":"2025-06-06T02:59:28.01333855Z"}
{"cached":71,"caller":"fetcher.go:627","component":"block.BaseFetcher","duration":"89.636515ms","duration_ms":89,"level":"info","msg":"successfully synchronized block metadata","partial":0,"returned":71,"ts":"2025-06-06T03:00:28.05622541Z"}
{"caller":"compact.go:570","err":"compaction: group 0@10505255943550810638: download block 01JWT4JTWHSD68AM6BRS525ZFV: context canceled","level":"error","msg":"retriable error","ts":"2025-06-06T03:01:03.628641943Z"}

peter-edb, Jun 06 '25 07:06