Pyroscope server crash-loops with OOM when using the S3 backend
Describe the bug
In production I run Pyroscope with 6 Java agents under heavy load. Without the S3 backend configured, everything runs perfectly. With the S3 backend enabled, the Pyroscope pod crashes within a minute; I can see the 2000Mi memory limit being hit.
Note that when I wipe the S3 bucket contents, the Pyroscope pod is happy and no crash happens. I am now watching to see how many days it takes for the OOM to come back.
Is it possible that Pyroscope gets stuck in an endless loop while loading/syncing S3 objects?
Environment
- Infrastructure: k8s AWS
- Deployment tool: Helm, with the latest image grafana/pyroscope:1.15.0
Hi @maxospiquante,
To rule out any unwanted behaviour introduced in the latest release, is this reproducible with previous versions of the Helm chart? Also, could you share your values file?
It might be that the blocks are too big. I'd try lowering the block duration to something around 10m.
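A minimal sketch of that suggestion, assuming the block duration is controlled by `max_block_duration` under `pyroscopedb` (the key that appears in the config shared later in this thread):

```yaml
pyroscopedb:
  # Cut blocks every 10 minutes so each block stays small.
  max_block_duration: 10m
```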
This is the last config I tried:
config.yaml: |
  compactor:
    compaction_concurrency: 1
    block_sync_concurrency: 4
    meta_sync_concurrency: 10
  pyroscopedb:
    max_block_duration: 10m
    row_group_target_size: 268435456
  limits:
    compactor_blocks_retention_period: 2d
    max_query_length: 12h
    max_local_series_per_tenant: 50000
    max_global_series_per_tenant: 5000
  storage:
    backend: s3
    s3:
      bucket_name: xxxx-pyroscope
      endpoint: s3.us-east-1.amazonaws.com
      signature_version: v4
  bucket_store:
    store_gateway:
      tenant_sync_concurrency: 5
      meta_sync_concurrency: 10
Note that on prod I have more than 60 clients using the Java agent.
My single pod, pyroscope-0, was crash-looping for days, restarting every 5 minutes, as you can see in the Prometheus graph here:
Just as a test: when I run Pyroscope without the compactor component, there is no crash and no memory issue.
Then I set compactor_blocks_retention_period: 1d, and memory usage is now very low. I then set it back to compactor_blocks_retention_period: 2d, and even 3d, and it is still quiet one week later:
This makes me think that reducing the retention period cleaned up some data that was putting Pyroscope in trouble. Is there deeper debugging output I can look at somewhere?
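For anyone hitting the same thing, the workaround described above can be sketched as the fragment below. The `limits` part comes straight from the config in this thread. The `server.log_level` part is an assumption: it relies on Pyroscope exposing the usual `log_level` setting under `server`, and it may or may not surface anything useful about the compactor.

```yaml
limits:
  # Temporarily lowered from 2d to 1d so older blocks get dropped;
  # it was later raised back to 2d and 3d without the OOM returning.
  compactor_blocks_retention_period: 1d
server:
  # Assumption: verbose logging is enabled this way in the deployed version.
  log_level: debug
```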