Possible memory leak in structured metadata
**Describe the bug**
I'm pushing log records into Loki via the JSON push API. If the log records contain structured metadata, Loki's memory usage increases over time with no limit. I played with some configs, like using the S3 backend or changing cache settings, with no success.

**To Reproduce**
- Started Loki with this config:
```yaml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600

common:
  path_prefix: /loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    resync_interval: 5s
    shared_store: s3
  aws:
    s3: s3://admin:supersecret@minio.:9000/loki
    s3forcepathstyle: true

query_scheduler:
  max_outstanding_requests_per_tenant: 32768

querier:
  max_concurrent: 16

limits_config:
  ingestion_rate_mb: 100
  ingestion_burst_size_mb: 100
  reject_old_samples: false
  allow_structured_metadata: true

ruler:
  alertmanager_url: http://localhost:9093

analytics:
  reporting_enabled: true
```
- Started to send log batches to the push API '...' (a sketch of such a request is shown below)
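For reference, a minimal sketch of the kind of request involved, assuming the standard `/loki/api/v1/push` endpoint on the port from the config above; the stream labels and metadata keys are made up for illustration:

```python
import json
import time

import requests  # any HTTP client works

# Hypothetical payload: the third element of each entry in "values"
# carries the structured metadata key/value pairs.
payload = {
    "streams": [
        {
            "stream": {"service": "demo"},  # stream labels (illustrative)
            "values": [
                [
                    str(time.time_ns()),     # timestamp in nanoseconds
                    "example log line",      # the log line itself
                    {"trace_id": "abc123"},  # structured metadata (illustrative)
                ]
            ],
        }
    ]
}

resp = requests.post(
    "http://localhost:3100/loki/api/v1/push",
    headers={"Content-Type": "application/json"},
    data=json.dumps(payload),
)
resp.raise_for_status()
```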
**Expected behavior**
Ingestion continues with stable memory consumption.
**Environment:**
- Infrastructure: docker container on debian host
- Deployment tool: docker compose
I have the same problem in Loki 2.9.1 using MinIO as the backend and TSDB, but nothing related to structured metadata.
I use almost the default configuration from the chart: https://gitlab.com/davinkevin.fr/home-server/-/blob/3834527c8db3b2e9c864419ea65362f508dbf06d/monitoring/overlays/k8s-server/loki/loki.values.yaml
> [!NOTE]
> Tested with Loki 2.9.2, same problem. It consumed 20 GB, and killing it doesn't solve it…
Do you see any errors in your logs? I'm wondering whether the ingesters are failing to flush data.
Chicken-and-egg problem: I had to move back to the boltdb database, so I have no logs anymore 😓.
You can't just "change" your database type like that. You should instead add a new schema config entry just for boltdb, with its start (`from`) date set to today; see the sketch below.
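For what it's worth, adding a new period rather than rewriting the existing one would look roughly like this; the `from` date of the second entry and its pairing of boltdb-shipper with schema v12 are assumptions for illustration, not taken from the reporter's actual config:

```yaml
schema_config:
  configs:
    - from: 2020-10-24        # existing tsdb period, left untouched
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h
    - from: 2023-11-20        # illustrative: the day of the switch
      store: boltdb-shipper
      object_store: s3
      schema: v12             # assuming boltdb-shipper pairs with a pre-v13 schema
      index:
        prefix: index_
        period: 24h
```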
I know, and I did… and the system started to consume all CPU and memory again 😅
Same problem with Loki 2.9.2. When I set up Promtail, the memory of the Loki write component started to soar, even though I had already set the schema version to v13 two days ago.
Is there any new info? I want to migrate to schema v13 and use structured metadata, and I would also like to migrate to TSDB. If there is a memory leak, it would be a problem for me.
No progress from me.
With boltdb being deprecated and the new Loki v3 out, I'll give it another try, but my setup hasn't changed.
We see a similar issue in Loki 3 too. I have attached a heap pprof SVG demonstrating the issue.
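For anyone wanting to capture the same kind of profile: assuming Loki's default pprof handlers are enabled on its HTTP port, something like `go tool pprof -svg http://localhost:3100/debug/pprof/heap` should produce a comparable SVG (graphviz is required for the SVG rendering).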
Same as https://github.com/grafana/loki/issues/13123, which is now fixed?