Aggregator Not Sending Logs to Outputs After Running for a Few Hours
Issue Description:
Problem: After deploying the Fluent Bit Aggregator Helm Chart and running it for a few hours, it stops sending logs to Elasticsearch and Syslog, which are the intended destinations for log forwarding.
Expected Behavior: The Fluent Bit Aggregator should consistently and reliably forward logs to the specified Elasticsearch and Syslog destinations as configured in the Helm Chart.
Steps to Reproduce:
1. Deploy the Fluent Bit Aggregator using the provided Helm Chart.
2. Monitor the log forwarding functionality for a few hours.
3. Observe that log forwarding to Elasticsearch and Syslog ceases after a certain period.
Actual Results: After an initial period of successful log forwarding, the Fluent Bit Aggregator stops sending logs to Elasticsearch and Syslog without any apparent errors or warnings.
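For the monitoring step, one option is to compare two snapshots from the aggregator's built-in HTTP server, which the aggregator config in this report enables on port 2020. The sketch below is a rough illustration, not part of the chart. It assumes the `/api/v1/metrics` endpoint returns JSON with an `output` section keyed by plugin instance (for example `es.1`) containing `proc_records` and `retries` counters, which is my understanding of the v1 metrics format. The base URL and the function names `fetch_metrics` and `stalled_outputs` are hypothetical.

```python
import json
import urllib.request


def fetch_metrics(base_url="http://localhost:2020"):
    """Fetch one metrics snapshot from the Fluent Bit HTTP server.

    base_url is an assumption: it presumes the aggregator's port 2020
    has been made reachable locally (for example via a port-forward).
    """
    with urllib.request.urlopen(f"{base_url}/api/v1/metrics", timeout=5) as resp:
        return json.load(resp)


def stalled_outputs(prev, curr):
    """Return output instances that made no progress between two snapshots.

    An output counts as stalled when its proc_records counter did not
    grow while its retries counter did. Both field names are assumed
    from the v1 metrics JSON layout.
    """
    stalled = []
    prev_outputs = prev.get("output", {})
    for name, now in curr.get("output", {}).items():
        before = prev_outputs.get(name)
        if before is None:
            # Instance not present in the earlier snapshot; nothing to compare.
            continue
        no_progress = now.get("proc_records", 0) <= before.get("proc_records", 0)
        more_retries = now.get("retries", 0) > before.get("retries", 0)
        if no_progress and more_retries:
            stalled.append(name)
    return sorted(stalled)
```

Calling `fetch_metrics()` twice a few minutes apart and passing both results to `stalled_outputs` would list the outputs that only accumulated retries in between. Keeping the comparison separate from the HTTP call makes it possible to run the check against saved snapshots as well.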
Environment Details:
Kubernetes Cluster Version: 1.26
Fluent Bit Agents Version: 2.1.8
Fluent Bit Aggregator Version: 2.1.9
Elasticsearch Version: 8.9
aggregator config:
[SERVICE]
daemon false
http_Port 2020
http_listen 0.0.0.0
http_server true
log_level debug
parsers_file /fluent-bit/etc/parsers.conf
storage.metrics true
storage.path /fluent-bit/data
[INPUT]
name forward
listen 0.0.0.0
port 24224
[FILTER]
Name rewrite_tag
Match kube.*
Rule $syslog ^(true)$ syslog.* true
Emitter_Name re_emitted
[OUTPUT]
Name syslog
Match syslog.*
Host $HOST
Port 514
Retry_Limit false
Mode tcp
Syslog_Format rfc5424
Syslog_MaxSize 65536
Syslog_Hostname_Key hostname
Syslog_Appname_Key appname
Syslog_Procid_Key procid
Syslog_Msgid_Key msgid
Syslog_SD_Key uls@0
Syslog_Message_Key msg
[OUTPUT]
Name es
Match kube.*
HTTP_User $USER
HTTP_Passwd $PASS
tls Off
tls.verify Off
Host elastic-elasticsearch
Port 9200
Retry_Limit False
Trace_Error On
Trace_Output Off
Suppress_Type_Name On
Replace_Dots On
Buffer_Size False
Logstash_Prefix logstash
Logstash_Format On
Index logstash
Generate_ID On
Write_Operation upsert
[OUTPUT]
Name es
Match host.*
HTTP_User $USER
HTTP_Passwd $PASS
tls Off
tls.verify Off
Host elastic-elasticsearch
Port 9200
Retry_Limit False
Trace_Error On
Trace_Output Off
Suppress_Type_Name On
Replace_Dots On
Buffer_Size False
Logstash_Prefix logstash
Logstash_Format On
Index logstash
Write_Operation upsert
Generate_ID On
fluent-bit agents config:
custom_parsers.conf:
----
[PARSER]
Name docker_no_time
Format json
Time_Keep Off
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
[FILTER]
Name grep
Match *
Exclude log liveness
[FILTER]
Name grep
Match *
Exclude log readiness
[SERVICE]
Daemon Off
Flush 5
Log_Level debug
Parsers_File /fluent-bit/etc/parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
Health_Check On
[INPUT]
Name tail
Path /var/log/containers/*.log
Exclude_Path /var/log/containers/*_monitoring_*.log
multiline.parser docker, cri
Tag kube.*
Mem_Buf_Limit 50MB
Buffer_Max_Size 1MB
Skip_Long_Lines Off
[INPUT]
Name systemd
Tag host.*
Systemd_Filter _SYSTEMD_UNIT=kubelet.service
Read_From_Tail On
[FILTER]
Name kubernetes
Match kube.*
Merge_Log On
Keep_Log Off
K8S-Logging.Parser On
K8S-Logging.Exclude On
[OUTPUT]
Name forward
Match *
Host fluent-bit-aggregator
Port 24224
Fluent Bit aggregator StatefulSet logs while no logs are being sent to Elasticsearch:
[2023/09/17 12:07:58] [debug] [out flush] cb_destroy coro_id=7942
[2023/09/17 12:07:58] [debug] [retry] re-using retry for task_id=1959 attempts=19
[2023/09/17 12:07:58] [ warn] [engine] failed to flush chunk '1-1694939682.183824748.flb', retry in 1069 seconds: task_id=1959, input=forward.0 > output=es.1 (out_id=1)
[2023/09/17 12:07:59] [debug] [output:es:es.1] task_id=1354 assigned to thread #1
[2023/09/17 12:07:59] [debug] [output:es:es.1] task_id=1642 assigned to thread #0
[2023/09/17 12:07:59] [debug] [output:es:es.1] task_id=685 assigned to thread #1
[2023/09/17 12:07:59] [debug] [upstream] KA connection #96 to elastic-elasticsearch:9200 has been assigned (recycled)
[2023/09/17 12:07:59] [debug] [upstream] KA connection #91 to elastic-elasticsearch:9200 has been assigned (recycled)
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [http_client] not using http_proxy for header
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [http_client] not using http_proxy for header
[2023/09/17 12:07:59] [debug] [upstream] KA connection #89 to elastic-elasticsearch:9200 has been assigned (recycled)
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [out_es] converted_size is 0
[2023/09/17 12:07:59] [debug] [http_client] not using http_proxy for header
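To see which outputs are retrying and how long the backoff has grown, the `failed to flush chunk` warnings in a log excerpt like the one above can be summarized with a small script. This is a minimal sketch: the regular expression is written against the single warning line shown in this report, so other Fluent Bit versions may format the message differently, and the names `RETRY_RE`, `parse_retries`, and `retries_per_output` are hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the warning line in this report; it is an
# assumption that other retry warnings follow the same layout.
RETRY_RE = re.compile(
    r"\[engine\] failed to flush chunk '(?P<chunk>[^']+)', "
    r"retry in (?P<seconds>\d+) seconds: task_id=(?P<task_id>\d+), "
    r"input=(?P<input>\S+) > output=(?P<output>\S+)"
)

# The warning line copied from the aggregator log excerpt.
SAMPLE = (
    "[2023/09/17 12:07:58] [ warn] [engine] failed to flush chunk "
    "'1-1694939682.183824748.flb', retry in 1069 seconds: task_id=1959, "
    "input=forward.0 > output=es.1 (out_id=1)"
)


def parse_retries(lines):
    """Extract one dict per 'failed to flush chunk' warning."""
    events = []
    for line in lines:
        match = RETRY_RE.search(line)
        if match is None:
            continue
        event = match.groupdict()
        event["seconds"] = int(event["seconds"])
        event["task_id"] = int(event["task_id"])
        events.append(event)
    return events


def retries_per_output(events):
    """Count retry warnings per output instance (for example 'es.1')."""
    return Counter(event["output"] for event in events)
```

Feeding the full pod log into `parse_retries` (for example, the lines of a file saved from `kubectl logs`) and then into `retries_per_output` shows whether the retries are concentrated on one output instance, such as `es.1` in the line above.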
Kibana view of the logs
For reference, this issue was also raised in @stevehipwell's Helm charts repository: https://github.com/stevehipwell/helm-charts/issues/789