
possible memory leak in structured metadata

Open shimasima2323 opened this issue 2 years ago • 7 comments

Describe the bug I'm trying to push log records into Loki via the JSON push API. If the log records contain structured metadata, Loki's memory usage increases over time without limit. I tried a few config changes, such as using an S3 backend or changing cache settings, with no success.

To Reproduce

  1. Started Loki with this config:
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600

common:
  path_prefix: /loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    resync_interval: 5s
    shared_store: s3
  aws:
   s3: s3://admin:supersecret@minio.:9000/loki
   s3forcepathstyle: true

query_scheduler:
  max_outstanding_requests_per_tenant: 32768

querier:
  max_concurrent: 16

limits_config:
  ingestion_rate_mb: 100
  ingestion_burst_size_mb: 100
  reject_old_samples: false
  allow_structured_metadata: true

ruler:
  alertmanager_url: http://localhost:9093

analytics:
 reporting_enabled: true

  2. Started sending log batches to the push API '...'
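
For readers trying to reproduce this, here is a minimal sketch of what a push request with structured metadata might look like. It is not the reporter's actual client. The host `localhost`, the label set, and the metadata keys are assumptions made for illustration; the port comes from `http_listen_port` in the config above, and no tenant header is sent because `auth_enabled` is `false`. In the JSON push format, structured metadata is an optional third element of each entry in `values`, and its values must be strings.

```python
import json
import time
import urllib.request

# Assumed host; port 3100 matches http_listen_port in the config above.
PUSH_URL = "http://localhost:3100/loki/api/v1/push"


def build_push_payload(lines, labels, metadata, now_ns=None):
    """Build a JSON push body in which every entry carries structured metadata.

    Each entry is [timestamp_ns_as_string, log_line, metadata_dict].
    """
    if now_ns is None:
        now_ns = time.time_ns()
    values = [
        [str(now_ns + i), line, metadata]  # +i keeps timestamps distinct
        for i, line in enumerate(lines)
    ]
    return {"streams": [{"stream": labels, "values": values}]}


def push(payload, url=PUSH_URL, timeout=10.0):
    """POST the payload to Loki and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical labels and metadata, chosen only for this sketch.
    payload = build_push_payload(
        ["first line", "second line"],
        {"job": "leak-repro"},
        {"trace_id": "abc123"},
    )
    print(push(payload))
```

Running the `__main__` part in a loop while watching the container's memory would approximate the scenario described, under the assumptions above.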

Expected behavior Ingestion continues with reasonable memory consumption.

Environment:

  • Infrastructure: docker container on debian host
  • Deployment tool: docker compose

shimasima2323 avatar Oct 22 '23 12:10 shimasima2323

I have the same problem in Loki 2.9.1, using MinIO as the backend with TSDB, but nothing related to structured metadata.

I use almost the default configuration from the chart: https://gitlab.com/davinkevin.fr/home-server/-/blob/3834527c8db3b2e9c864419ea65362f508dbf06d/monitoring/overlays/k8s-server/loki/loki.values.yaml

image

[!NOTE] Tested with Loki 2.9.2, same problem. It consumed 20 GB, and killing it doesn't solve it…

davinkevin avatar Oct 28 '23 08:10 davinkevin

Do you see any errors in your logs? I'm wondering whether the ingesters are failing to flush data.

DylanGuedes avatar Oct 30 '23 21:10 DylanGuedes

Chicken-and-egg problem: I had to move back to the boltdb database, so I don't have any logs anymore 😓.

davinkevin avatar Oct 31 '23 03:10 davinkevin

you can't just "change" your database type like that. you should instead add a new config just for boltdb and set the start period as today

DylanGuedes avatar Oct 31 '23 09:10 DylanGuedes

> you can't just "change" your database type like that. you should instead add a new config just for boltdb and set the start period as today

I know and I did… and the system started to consume all CPU and Memory again 😅

davinkevin avatar Oct 31 '23 10:10 davinkevin

Same problem with Loki 2.9.2. When I set up Promtail, the memory of the Loki write component started to soar. In fact, I had already set the schema version to v13 two days earlier.

image

Godzillas avatar Nov 19 '23 14:11 Godzillas

Is there any new info? I want to migrate to schema v13 & use structured metadata. I would also like to migrate to TSDB. If there is a memory leak it would be a problem for me.

elcomtik avatar Feb 22 '24 12:02 elcomtik

No progress from me. With boltdb being deprecated and the new loki v3, I'll give it a new try but my setup hasn't changed.

davinkevin avatar Apr 13 '24 09:04 davinkevin

We see a similar issue in Loki 3 too. I have attached a heap pprof SVG demonstrating the issue.

leak
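
For anyone who wants to capture a comparable heap profile, the following is a rough sketch rather than the procedure used for the attachment above. It assumes that Loki serves the standard Go pprof handlers under `/debug/pprof/` on its HTTP port, and that the instance is reachable at `http://localhost:3100`; both are assumptions to verify against your deployment. The helper names are hypothetical.

```python
import urllib.request


def heap_profile_url(base_url):
    """Return the assumed heap profile endpoint for a Loki base URL."""
    return base_url.rstrip("/") + "/debug/pprof/heap"


def save_heap_profile(base_url, out_path, timeout=30.0):
    """Download the heap profile and write it to out_path.

    Returns the number of bytes written.
    """
    with urllib.request.urlopen(heap_profile_url(base_url), timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


if __name__ == "__main__":
    # Assumed address; adjust to wherever the Loki HTTP port is exposed.
    size = save_heap_profile("http://localhost:3100", "loki-heap.pb.gz")
    print(f"wrote {size} bytes")
```

The saved file can then be rendered with `go tool pprof -svg loki-heap.pb.gz > heap.svg`, which requires Go and Graphviz to be installed.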

JohanLindvall avatar Jun 03 '24 13:06 JohanLindvall

Same as https://github.com/grafana/loki/issues/13123, which is now fixed?

JohanLindvall avatar Jun 07 '24 07:06 JohanLindvall