nats-server
JetStream memory leak?
What version were you using?
2.10.2
What environment was the server running in?
Kubernetes, default configuration from the NATS helm chart, updated to use 2.10.2-alpine with cluster (3 replicas) and jetstream enabled.
GOMEMLIMIT is 900MiB, and pods request 1Gi of memory and are limited to 3Gi.
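For context on the numbers above: GOMEMLIMIT is a soft limit that steers the Go garbage collector; it is not a hard cap on process memory, so a Go process can still grow past it toward the container limit. The following is a minimal sketch, not part of nats-server, that shows how a Go program can read its own effective limit and how much room lies between 900MiB and the 3Gi pod limit. The helper names `effectiveMemoryLimit` and `headroom` are made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

const MiB = int64(1) << 20

// effectiveMemoryLimit returns the Go runtime's current soft memory limit in
// bytes. A negative argument to SetMemoryLimit only queries the value and
// does not change it.
func effectiveMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

// headroom returns how many bytes lie between the runtime's soft limit and
// the container's hard limit (hypothetical helper for this example).
func headroom(softLimit, containerLimit int64) int64 {
	return containerLimit - softLimit
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	// These two lines describe this example process, not the NATS server.
	fmt.Printf("soft limit (GOMEMLIMIT): %d bytes\n", effectiveMemoryLimit())
	fmt.Printf("memory obtained from OS: %d bytes\n", ms.Sys)

	// Values taken from the report: GOMEMLIMIT=900MiB, pod limit 3Gi.
	fmt.Printf("headroom before the pod limit: %d MiB\n",
		headroom(900*MiB, 3*1024*MiB)/MiB)
}
```

Running it with `GOMEMLIMIT=900MiB` set in the environment should report that value as the soft limit; without it, the runtime default (effectively unlimited) is reported.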
Is this defect reproducible?
Yes, happens again after restarting the NATS servers
Given the capability you are leveraging, describe your expectation?
Consistent low memory usage
Given the expectation, what is the defect you are observing?
Since the upgrade from 2.9.22 to 2.10.2 we have to regularly restart the NATS servers to avoid them running out of memory.
A single stream is used, with some topics shared between all consumers, and some dedicated to single consumers. Messages are normally between 20 bytes and 3 KB. Messages are consumed and immediately acked. Consumers are never slow, and there is rarely more than 100 KB of data in the queue. Nearly all messages go through JetStream (99.9999%). The number of consumers does not change. JetStream disk usage is < 10 MB.
Even nats-2 is affected by the increased memory usage, although it never had any consumers.
The memory consistently builds up over time.
Last restart was 13.5 hours ago, info below was captured now.
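To make "builds up over time" measurable, the server's HTTP monitoring endpoint can be sampled periodically. Below is a minimal sketch that reads `/varz` once and prints the reported memory. It assumes the monitoring port is enabled and reachable at `http://localhost:8222` (for example via a port-forward); the URL, the `varz` struct, and the helper names are assumptions for this example, and only three of the fields that `/varz` returns are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// varz holds the subset of the /varz response used here.
type varz struct {
	ServerName string `json:"server_name"`
	Mem        int64  `json:"mem"` // memory in bytes as reported by the server
	Uptime     string `json:"uptime"`
}

// parseVarz decodes a /varz JSON document.
func parseVarz(data []byte) (varz, error) {
	var v varz
	err := json.Unmarshal(data, &v)
	return v, err
}

// fetchVarz performs one GET against the monitoring endpoint.
func fetchVarz(client *http.Client, url string) (varz, error) {
	resp, err := client.Get(url)
	if err != nil {
		return varz{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return varz{}, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return varz{}, err
	}
	return parseVarz(body)
}

func main() {
	// Assumed address; pass a different URL as the first argument.
	url := "http://localhost:8222/varz"
	if len(os.Args) > 1 {
		url = os.Args[1]
	}
	client := &http.Client{Timeout: 5 * time.Second}
	v, err := fetchVarz(client, url)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read varz:", err)
		return
	}
	fmt.Printf("%s uptime=%s mem=%.1f MiB\n",
		v.ServerName, v.Uptime, float64(v.Mem)/(1<<20))
}
```

Running this from a cron job or a loop against each of the three pods gives a per-server memory series that can be compared with the restart times.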
Stream info:
Information for Stream X created 2023-10-10 01:10:57
  Subjects: A, B, C, D, E.*, F.*
  Replicas: 3
  Storage: File
Options:
  Retention: Interest
  Acknowledgments: true
  Discard Policy: Old
  Duplicate Window: 2m0s
  Allows Msg Delete: true
  Allows Purge: true
  Allows Rollups: false
Limits:
  Maximum Messages: unlimited
  Maximum Per Subject: unlimited
  Maximum Bytes: unlimited
  Maximum Age: 11m0s
  Maximum Message Size: unlimited
  Maximum Consumers: unlimited
Cluster Information:
  Name: nats
  Leader: nats-0
  Replica: nats-1, current, seen 14ms ago
  Replica: nats-2, current, seen 14ms ago
State:
  Messages: 0
  Bytes: 0 B
  First Sequence: 421,711,295
  Last Sequence: 421,711,294 @ 2023-10-11 09:51:02 UTC
  Active Consumers: 5
Consumer example (settings are the same for all, but topics vary):
Information for Consumer X > Z created 2023-10-10T01:11:01Z
Configuration:
  Durable Name: Z
  Pull Mode: true
  Filter Subjects: A, E.Z, F.Y, D
  Deliver Policy: New
  Ack Policy: Explicit
  Ack Wait: 30.00s
  Replay Policy: Instant
  Max Ack Pending: 1,000
  Max Waiting Pulls: 512
  Inactive Threshold: 10m0s
Cluster Information:
  Name: nats
  Leader: nats-0
  Replica: nats-1, current, seen 31ms ago
  Replica: nats-2, current, seen 33ms ago
State:
  Last Delivered Message: Consumer sequence: 11,319,431 Stream sequence: 423,811,923 Last delivery: 36ms ago
  Acknowledgment Floor: Consumer sequence: 11,319,431 Stream sequence: 423,811,923 Last Ack: 33ms ago
  Outstanding Acks: 0 out of maximum 1,000
  Redelivered Messages: 0
  Unprocessed Messages: 0
  Waiting Pulls: 7 of maximum 512
Connections:
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                             Top 6 Connections out of 6 by subs                                                             │
├─────┬──────────────────────────────┬────────┬─────────┬────────────────────┬─────────┬───────────┬─────────────┬─────────────┬──────────┬───────────┬──────┤
│ CID │ Name                         │ Server │ Cluster │ IP                 │ Account │ Uptime    │ In Msgs     │ Out Msgs    │ In Bytes │ Out Bytes │ Subs │
├─────┼──────────────────────────────┼────────┼─────────┼────────────────────┼─────────┼───────────┼─────────────┼─────────────┼──────────┼───────────┼──────┤
│ 69  │ NATS CLI Version development │ nats-0 │ nats    │ 10.244.1.102:42596 │         │ 0s        │ 1           │ 0           │ 254 B    │ 0 B       │ 1    │
│ 49  │                              │ nats-0 │ nats    │ 10.244.0.239:54620 │         │ 13h26m54s │ 85,960,707  │ 4,583,239   │ 33 GiB   │ 367 MiB   │ 3    │
│ 42  │                              │ nats-1 │ nats    │ 10.244.0.143:33546 │         │ 13h24m28s │ 86,083,043  │ 4,592,787   │ 33 GiB   │ 367 MiB   │ 3    │
│ 59  │                              │ nats-0 │ nats    │ 10.244.1.48:47886  │         │ 13h24m28s │ 567,858     │ 550,868     │ 17 MiB   │ 154 MiB   │ 4    │
│ 41  │                              │ nats-1 │ nats    │ 10.244.0.247:47996 │         │ 13h24m28s │ 566,761     │ 549,739     │ 17 MiB   │ 153 MiB   │ 4    │
│ 60  │                              │ nats-0 │ nats    │ 10.244.0.178:36418 │         │ 13h24m28s │ 178,160,113 │ 162,865,183 │ 1.8 GiB  │ 66 GiB    │ 5    │
├─────┼──────────────────────────────┼────────┼─────────┼────────────────────┼─────────┼───────────┼─────────────┼─────────────┼──────────┼───────────┼──────┤
│     │ TOTALS FOR 6 CONNECTIONS     │        │         │                    │         │           │ 351,338,483 │ 173,141,816 │ 68 GIB   │ 67 GIB    │ 20   │
╰─────┴──────────────────────────────┴────────┴─────────┴────────────────────┴─────────┴───────────┴─────────────┴─────────────┴──────────┴───────────┴──────╯
╭────────────────────────────────╮
│     Connections per server     │
├────────┬─────────┬─────────────┤
│ Server │ Cluster │ Connections │
├────────┼─────────┼─────────────┤
│ nats-1 │ nats    │ 2           │
│ nats-0 │ nats    │ 4           │
╰────────┴─────────┴─────────────╯
Memory: