sidecar: Greatly increased memory usage from 0.32.2 to 0.32.3, still present in 0.35.0
Thanos, Prometheus and Golang version used:
thanos, version 0.32.3 (branch: HEAD, revision: 3d98d7ce7a254b893e4c8ee8122f7f6edd3174bd)
build user: root@0b3c549e9dae
build date: 20230920-07:27:32
go version: go1.20.8
platform: linux/amd64
tags: netgo
Object Storage Provider:
AWS S3
What happened:
After upgrading from 0.31.0 to 0.35.0 we saw greatly increased sidecar memory usage. We narrowed it down to a change between 0.32.2 and 0.32.3 (possibly the Prometheus update).
Memory usage spikes for certain queries. In our case these are most likely the recording rules evaluated by the ruler, which is why we observed constantly high usage.
What you expected to happen:
No significant change in memory usage.
How to reproduce it (as minimally and precisely as possible):
Run the query {job=~".+"} on Prometheus with some metrics, once for each version, and compare the sidecar memory usage.
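The reproduction step above can be sketched as a small script. This is only a rough illustration, not the exact procedure used for this report. The two base addresses are hypothetical placeholders, the selector is written with the regex matcher `=~` on the assumption that a match on every job is intended, and the sketch assumes the query is sent to an endpoint whose requests actually reach the sidecar (for example a querier in front of it). The heap profile is fetched from the standard Go pprof path on the sidecar's HTTP port.

```python
"""Sketch: run the reproduction query, then save a sidecar heap profile."""
import sys
import urllib.parse
import urllib.request

# Hypothetical addresses; adjust both to your deployment.
QUERY_API = "http://localhost:9090"      # serves the /api/v1/query HTTP API
SIDECAR_HTTP = "http://localhost:10902"  # sidecar HTTP address exposing pprof

# Selector from the report, assuming the regex matcher is intended.
REPRO_QUERY = '{job=~".+"}'


def query_url(base: str, query: str) -> str:
    """Build an instant-query URL with the query string URL-encoded."""
    params = urllib.parse.urlencode({"query": query})
    return base.rstrip("/") + "/api/v1/query?" + params


def heap_profile_url(base: str) -> str:
    """Build the URL of the Go pprof heap endpoint."""
    return base.rstrip("/") + "/debug/pprof/heap"


def fetch(url: str, timeout: float = 120.0) -> bytes:
    """GET a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def reproduce(version_label: str) -> str:
    """Run the query, then write the sidecar heap profile to a local file."""
    fetch(query_url(QUERY_API, REPRO_QUERY))
    profile = fetch(heap_profile_url(SIDECAR_HTTP))
    path = "heap-" + version_label + ".pb.gz"
    with open(path, "wb") as fh:
        fh.write(profile)
    return path


if __name__ == "__main__":
    # Example: python repro.py 0.32.2, then again after switching to 0.32.3.
    label = sys.argv[1] if len(sys.argv) > 1 else "unknown"
    print(reproduce(label))
```

Running it once per sidecar version gives two profile files that can be compared. Because the profile is taken after the query returns, the cumulative allocation sample types in the profile are likely more informative than the in-use ones.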
Full logs to relevant components:
Anything else we need to know:
Heap profiles for 0.32.2 and 0.32.3 with the same query on the same Prometheus node: