thanos
Provide detail on why the compactor failed to download a block
Is your proposal related to a problem?
When the compactor fails to download a block, it logs the following error:
{"caller":"compact.go:570","err":"compaction: group 0@10000000000000000: download block 123456789ABCDEFG: context canceled","level":"error","msg":"retriable error","ts":"2025-06-06T03:06:06.14541332Z"}
This gives little to go on: "context canceled" says that the download was aborted, but not what caused it.
Describe the solution you'd like
Include the underlying reason for the download failure in the error message. In this instance, the compactor ran out of disk space while downloading the blocks needed for compaction.
Describe alternatives you've considered
Provision the compactor with a larger disk. The required size depends on the environment, and even with careful sizing, a spike in metrics volume can increase the size of the blocks that need to be compacted.
Additional context
This happens in version 0.38.0. The error is logged at this line: https://github.com/thanos-io/thanos/blob/v0.38.0/cmd/thanos/compact.go#L570
Here are the log lines:
{"caller":"compact.go:1162","group":"0@{prometheus=\"monitoring/kube-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-0\"}","groupKey":"0@10505255943550810638","level":"info","msg":"compaction available and planned","plan":"[01JW0CKEECA8EK0FBCKYGDYMX1 (min time: 1747872000068, max time: 1748044800000) 01JW5HCK1FJ791VX78SP9YKMYM (min time: 1748044800081, max time: 1748217600000) 01JWAP6KG305Q95EKQ60KWMQWB (min time: 1748217600027, max time: 1748390400000) 01JWFTZS0226PQ3HN6QZ37X42W (min time: 1748390400111, max time: 1748563200000) 01JWMZS5QYF25ADQDCYNPHRVXK (min time: 1748563200024, max time: 1748736000000) 01JWT4JTWHSD68AM6BRS525ZFV (min time: 1748736000001, max time: 1748908800000) 01JWZ9CAPAFT6BC4T3PA0KSS7V (min time: 1748908800027, max time: 1749081600000)]","ts":"2025-06-06T02:59:28.013281259Z"}
{"caller":"compact.go:1171","duration":"27.12µs","duration_ms":0,"group":"0@{prometheus=\"monitoring/kube-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-0\"}","groupKey":"0@10505255943550810638","level":"info","msg":"finished running pre compaction callback; downloading blocks","plan":"[01JW0CKEECA8EK0FBCKYGDYMX1 (min time: 1747872000068, max time: 1748044800000) 01JW5HCK1FJ791VX78SP9YKMYM (min time: 1748044800081, max time: 1748217600000) 01JWAP6KG305Q95EKQ60KWMQWB (min time: 1748217600027, max time: 1748390400000) 01JWFTZS0226PQ3HN6QZ37X42W (min time: 1748390400111, max time: 1748563200000) 01JWMZS5QYF25ADQDCYNPHRVXK (min time: 1748563200024, max time: 1748736000000) 01JWT4JTWHSD68AM6BRS525ZFV (min time: 1748736000001, max time: 1748908800000) 01JWZ9CAPAFT6BC4T3PA0KSS7V (min time: 1748908800027, max time: 1749081600000)]","ts":"2025-06-06T02:59:28.01333855Z"}
{"cached":71,"caller":"fetcher.go:627","component":"block.BaseFetcher","duration":"89.636515ms","duration_ms":89,"level":"info","msg":"successfully synchronized block metadata","partial":0,"returned":71,"ts":"2025-06-06T03:00:28.05622541Z"}
{"caller":"compact.go:570","err":"compaction: group 0@10505255943550810638: download block 01JWT4JTWHSD68AM6BRS525ZFV: context canceled","level":"error","msg":"retriable error","ts":"2025-06-06T03:01:03.628641943Z"}