Kubernetes memory leak with tail input plugin, HTTP and Elasticsearch output plugins
Description
Fluent Bit (4.0.7 via Helm chart 0.53.0) exhibits continuous RAM growth. Memory consumption never stabilizes and the pods eventually OOM. The environment is a high-load Kubernetes cluster with a large volume of logs.
Configuration
service:
  daemon: off
  flush: 1s
  log_level: info
  parsers_file: /fluent-bit/etc/parsers.conf
  storage.path: /var/log/fb-storage/
  storage.metrics: on
  storage.checksum: on
  storage.sync: normal
  storage.max_chunks_up: 64
  storage.backlog.mem_limit: 200M
  storage.delete_irrecoverable_chunks: on
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2020
  health_check: on
  refresh_interval: 5s

parsers:
  - name: custom-tag
    format: regex
    regex: '^(?<namespace_name>[^\.]+)\.(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)\.(?<container_name>.+)'
  - name: nginx-ingress
    format: regex
    regex: '^(?<clientip>[^ ]*) - (?<client_identity>[^ ]*) \[(?<timestamp>[^\]]*)\] "(?<verb>\S+)(?: +(?<request>[^\"]*?)(?: +(?<httpversion>\S+))?)?" (?<response>[^ ]*) (?<bytes_sent>[^ ]*) "(?<referrer>[^\"]*)" "(?<user_agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^\]]*)\] \[(?<proxy_alternative_upstream_name>[^\]]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<req_id>[^ ]*)$'
    time_key: timestamp
    time_format: "%d/%b/%Y:%H:%M:%S %z"

multiline_parsers:
  - name: multiline_json
    type: regex
    flush_timeout: 2000
    key_content: log
    rules:
      - state: start_state
        regex: '^\{.*$'
        next_state: cont
      - state: cont
        regex: '^\s+.*$'
        next_state: cont
      - state: cont
        regex: '^\}$'
        next_state: start_state

pipeline:
  inputs:
    - name: tail
      storage.type: filesystem
      path: /var/log/containers/*.log
      db: /var/log/fb-storage/flb.db
      multiline.parser: docker, cri
      tag_regex: '(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<container_id>[a-z0-9]{64})\.log$'
      tag: kube.<namespace_name>.<pod_name>.<container_name>
      key: log
      mem_buf_limit: 150M
      buffer_chunk_size: 50M
      buffer_max_size: 150M
      refresh_interval: 60
      ignore_older: 1d
      skip_empty_lines: on

  filters:
    - name: kubernetes
      regex_parser: custom-tag
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      kube_tag_prefix: kube.
      merge_log: off
      keep_log: on
      buffer_size: 1M
      k8s-logging.parser: on
      k8s-logging.exclude: off
    - name: grep
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      exclude: log ^$
    - name: multiline
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      multiline.parser: multiline_json
      emitter_storage.type: memory
      emitter_mem_buf_limit: 100M
    - name: modify
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      add: kubernetes_cluster rke2-dc8
      alias: add_cluster_label
    - name: nest
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      operation: lift
      nested_under: kubernetes
      add_prefix: kubernetes_
    - name: modify
      match_regex: '^kube.[a-z0-9-]+\.ingres[a-z0-9-]+\.[a-z-]+$'
      copy: log nginx_parsed_log
    - name: parser
      match_regex: '^kube.[a-z0-9-]+\.ingres[a-z0-9-]+\.[a-z-]+$'
      key_name: nginx_parsed_log
      parser: nginx-ingress
      reserve_data: on
    - name: rewrite_tag
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      rule:
        - $kubernetes_labels['fluentbit'] ^(.+)$ $TAG.elastic true
        - kubernetes_pod_name ingres-nginx-external $TAG.elastic true
      emitter_name: re_emitted
      emitter_storage.type: memory
      emitter_mem_buf_limit: 100M
    - name: grep
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.vault.+\.elastic'
      exclude: kubernetes_container_name vault.*
    - name: modify
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+\.elastic'
      rename: log message
    - name: lua
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+\.elastic'
      script: /fluent-bit/scripts/add_field.lua
      call: add_field

  outputs:
    - name: http
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      host: 10.20.0.54
      port: 9428
      compress: gzip
      uri: /insert/jsonline?_stream_fields=stream,kubernetes_pod_name,kubernetes_container_name,kubernetes_namespace_name&_msg_field=log&_time_field=date
      format: json_lines
      json_date_format: iso8601
      header:
        - AccountID 0
        - ProjectID 0
      retry_limit: 3
      storage.total_limit_size: 1GB
    - name: es
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+.elastic'
      host: ${FB_ELASTIC_HOST}
      port: 9200
      http_user: ${FB_ELASTIC_USER}
      http_passwd: ${FB_ELASTIC_PASSWORD}
      tls: on
      tls.verify: off
      type: _doc
      logstash_prefix: fb-logs
      logstash_format: on
      suppress_type_name: on
      replace_dots: on
      generate_id: on
      retry_limit: 2
      storage.total_limit_size: 1GB
      trace_error: on
Memory Usage
Steps to Reproduce
- Deploy Fluent Bit with the configuration above.
- Observe memory usage over 48 hours.
- Memory grows linearly without a plateau.
Expected Result
RAM usage stabilizes after initial allocation.
Actual Result
RAM usage increases continuously until OOM.
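To see which part of the pipeline accumulates the memory, the internal metrics that storage.metrics: on already collects (per-plugin memory and storage chunk counts) can also be scraped as Prometheus metrics. A minimal sketch using the built-in fluentbit_metrics input and prometheus_exporter output (port 2021 here is an arbitrary choice, not part of the original setup):

pipeline:
  inputs:
    - name: fluentbit_metrics
      tag: internal_metrics
      scrape_interval: 2
  outputs:
    - name: prometheus_exporter
      match: internal_metrics
      host: 0.0.0.0
      port: 2021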
Questions
- Is there a workaround to prevent the memory leak in the tail input? (See the sketch after this list.)
- Or am I doing something wrong?
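One workaround worth trying (a sketch only, assuming the growth comes from chunks held in memory rather than a true leak) is to let the tail input pause instead of buffering further once the in-memory chunk budget (storage.max_chunks_up: 64 above) is exhausted, and to shrink the per-file read buffers:

pipeline:
  inputs:
    - name: tail
      storage.type: filesystem
      path: /var/log/containers/*.log
      # Pause ingestion once storage.max_chunks_up chunks are loaded in memory,
      # instead of keeping more data buffered in RAM.
      storage.pause_on_chunks_overlimit: on
      # Smaller read buffers than 50M/150M reduce the per-file footprint.
      buffer_chunk_size: 1M
      buffer_max_size: 5M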
Is it reproducible on the latest versions of the 4.0 or 4.1 series?
@patrick-stephens, it is reproducible on versions 3.2.* and 4.0.7. I didn't try the latest versions because 4.0.7 is the latest one shipped in the official Fluent Bit Helm chart: https://artifacthub.io/packages/helm/fluent/fluent-bit/0.53.0
I've tried updating the app version to 4.1.0; the memory leak is still present.
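For reference, the chart does not have to stay on its default app version; overriding the image tag in the chart values (assuming the standard image.tag setting of the fluent/fluent-bit chart) is enough to test a newer build with chart 0.53.0:

image:
  # Run a newer Fluent Bit image while keeping the same chart version.
  tag: "4.1.0"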
I'm just wondering why multiline is applied twice in your config:
pipeline:
  inputs:
    - name: tail
      storage.type: filesystem
      path: /var/log/containers/*.log
      db: /var/log/fb-storage/flb.db
      multiline.parser: docker, cri
      tag_regex: '(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<container_id>[a-z0-9]{64})\.log$'
      tag: kube.<namespace_name>.<pod_name>.<container_name>
      key: log
      mem_buf_limit: 150M
      buffer_chunk_size: 50M
      buffer_max_size: 150M
      refresh_interval: 60
      ignore_older: 1d
      skip_empty_lines: on

  filters:
    - name: kubernetes
      regex_parser: custom-tag
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      kube_tag_prefix: kube.
      merge_log: off
      keep_log: on
      buffer_size: 1M
      k8s-logging.parser: on
      k8s-logging.exclude: off
    - name: grep
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      exclude: log ^$
    - name: multiline
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      multiline.parser: multiline_json
      emitter_storage.type: memory
      emitter_mem_buf_limit: 100M
This can cause intermediate buffer state to pile up, which can lead to high memory usage.
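If those emitters are what piles up, one thing to try (a sketch only; it changes where the emitter buffers live, it does not remove the double multiline pass) is to back both the multiline and the rewrite_tag emitters with filesystem storage under the existing storage.path instead of memory:

pipeline:
  filters:
    - name: multiline
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      multiline.parser: multiline_json
      # Buffer re-emitted records on disk rather than in RAM.
      emitter_storage.type: filesystem
    - name: rewrite_tag
      match_regex: '^kube.[a-z0-9-]+\.[a-z0-9-]+\.[a-z-]+$'
      emitter_name: re_emitted
      # Same change for the rewrite_tag emitter (rules omitted for brevity).
      emitter_storage.type: filesystem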
@cosmo0920 Hi there, thank you for the reply. The first multiline parser (docker, cri) removes the timestamp and the "stdout F" prefix at the front of each log line (e.g. 2025-10-21T10:16:08.195218531Z stdout F). I'll try to remove it in another way and come back with the result tomorrow.
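One possible way to do that (a sketch; the parser name cri-prefix is made up here, and dropping the built-in cri multiline parser means partial CRI lines flagged with the P logtag would no longer be reassembled) is a plain regex parser applied directly on the tail input:

parsers:
  - name: cri-prefix
    format: regex
    # Strips "<time> <stream> <logtag> " and keeps the remainder in the "log"
    # field, matching the field name the rest of the pipeline expects.
    regex: '^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$'
    time_key: time
    time_format: '%Y-%m-%dT%H:%M:%S.%L%z'

pipeline:
  inputs:
    - name: tail
      path: /var/log/containers/*.log
      # Replaces "multiline.parser: docker, cri" from the original config.
      parser: cri-prefix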
Try removing ignore_older.
Is the issue resolved? I hit the same memory leak in 4.1.1; the memory usage of the pod in an EKS cluster increases until OOM.
Same here.
I encountered the same issue in EKS clusters. aws-for-fluent-bit version 3.0.0, which is built on Fluent Bit v4.1.1, hits the same OOM (out-of-memory) issue.