fluent-bit
log missing rate is about 60% when setting buffer chunk size to 32k
Bug Report
Describe the bug
Fluent Bit is used to upload logs to external servers. With Fluent Bit 2.0.8, the log missing rate is about 60% with the settings below:
buffer_chunk_size 32k
buffer_max_size 32k
Solution #1: The log missing rate is about 0% with the settings below:
buffer_chunk_size 1M
buffer_max_size 1M
Solution #2: With Fluent Bit 1.9.5/1.9.6, the log missing rate is about 0% without changing the parameters:
buffer_chunk_size 32k
buffer_max_size 32k
The documentation at https://docs.fluentbit.io/manual/pipeline/inputs/tail does not make clear how to set parameters such as Buffer_Chunk_Size and Buffer_Max_Size. Is it correct to apply Solution #1 (raising buffer_chunk_size and buffer_max_size)? If so, could you explain the technical details behind it?
To Reproduce
- Example configuration of fluentbit when the issue happened:
[INPUT]
name tail
tag event.kafka.ingress
alias kafka.ingress
buffer_chunk_size 32k
buffer_max_size 32k
read_from_head true
refresh_interval 5
rotate_wait 10
skip_empty_lines off
skip_long_lines true
key message
db /var/log/logshipper/kafka.ingress.db
db.sync normal
db.locking true
db.journal_mode off
path /var/log/aaa/ingress/*/*/*/*,/var/log/aaa/ingress/*/*/*/*/*,/var/log/aaa/ingress/*/*/*/*/*/*
exclude_path /var/log/logshipper/logshipper.log,/var/log/aaa/ingress/*.gz,/var/log/aaa/ingress/*.tgz
mem_buf_limit 20MB
parser json
ignore_older 11m
- Steps to reproduce the problem:
- Generate a large volume of logs.
- Let Fluent Bit upload the logs to the external log server.
- Check for missing logs by counting the log records.
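The counting step can be sketched as a small calculation: given the number of records sent and the number that arrived, the missing rate is (sent - received) / sent. The function name and the counts below are hypothetical and only illustrate the arithmetic; they are not measurements from this issue.

```python
def missing_rate(sent: int, received: int) -> float:
    """Return the fraction of records that never arrived: (sent - received) / sent."""
    if sent <= 0:
        raise ValueError("sent must be a positive record count")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return (sent - received) / sent


# Hypothetical counts, chosen only to show what a 60% missing rate looks like.
print(f"{missing_rate(10_000, 4_000):.0%}")
```

Validating the counts first guards against a negative or above-100% rate when the two counters were taken over different time windows.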
Expected behavior
No missing logs, or a missing rate close to 0.
Your Environment
- Version used: 2.0.8, 1.9.7, 1.9.9, etc
- Configuration: above
- Environment name and version (e.g. Kubernetes? What version?): Kubernetes
Hello @hsingli20, Is this reproducible with 2.1.10? How are you measuring the loss? What's in the Fluent Bit log file? What's the load? What's the record size?
Thanks a lot, @lecaros.
Is this reproducible with 2.1.10?
Only 2.0.8 was tested; 2.1.10 has not been verified.
How are you measuring the loss?
The loss can be measured precisely as (a-b)/a, where a is the number of requests sent by JMeter (from jmeter.log) and b is the number of requests sent to the backend simulator.
What's the load?
About 9000 logs per second in total.
What's the record size?
4~5k bytes per log.
What's in the Fluent Bit log file?
No error logs found.
{"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.631+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.tel] [static files] processed 0b"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.631+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [task] destroy task=0x7fb71f6875b0 (task_id=0)"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.636+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=33780"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.637+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] [static files] processed 30.9K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.641+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=31705"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.642+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 29.8K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.643+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size 
diff=79304"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.644+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=94775"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.645+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.tel] [static files] processed 169.5K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.650+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=33798"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.651+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] [static files] processed 31.0K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.656+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=32949"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.657+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 30.9K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.657+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.tel] inode=6815791 file=/var/log/xxxxxx/tel/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-telserver/2023092809/group2/n170ggg_TrafficEventLog_xxxxxx_zzzngmsender_233_20230928093356472_113098.xml promote to TAIL_EVENT"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.657+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [ info] [input:tail:http.tel] 
inotify_fs_add(): inode=6815791 watch_fd=23 name=/var/log/xxxxxx/tel/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-telserver/2023092809/group2/n170ggg_TrafficEventLog_xxxxxx_zzzngmsender_233_20230928093356472_113098.xml"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.657+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.tel] inode=6815794 file=/var/log/xxxxxx/tel/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-telserver/2023092809/group2/n170ggg_TrafficEventLog_xxxxxx_zzzngmsender_233_20230928093359479_113099.xml promote to TAIL_EVENT"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.657+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [ info] [input:tail:http.tel] inotify_fs_add(): inode=6815794 watch_fd=24 name=/var/log/xxxxxx/tel/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-telserver/2023092809/group2/n170ggg_TrafficEventLog_xxxxxx_zzzngmsender_233_20230928093359479_113099.xml"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.657+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.tel] [static files] processed 0b, done"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.661+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=31991"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.662+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 30.1K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.667+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=33780"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.669+00:00", 
"severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] [static files] processed 30.9K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.673+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=32408"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.675+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 30.6K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.680+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=33816"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.681+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] [static files] processed 31.0K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.686+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=32639"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.687+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 30.5K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.691+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=30861"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.694+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] 
[static files] processed 28.3K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.695+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input chunk] update output instances with new chunk size diff=6354"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 6.1K"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] inode=6815796 file=/var/log/xxxxxx/ingress/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-zzz-patm/2023092809/PushNotification/n170ggg_pushapplicationtrafficmgmt.log_202309280900 promote to TAIL_EVENT"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [ info] [input:tail:http.ingress] inotify_fs_add(): inode=6815796 watch_fd=8 name=/var/log/xxxxxx/ingress/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-zzz-patm/2023092809/PushNotification/n170ggg_pushapplicationtrafficmgmt.log_202309280900"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.ingress] [static files] processed 0b, done"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] inode=6815797 file=/var/log/xxxxxx/egress/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-zzz-ngmsender/2023092809/n170ggg_zzzngmsender.log_202309280900 promote to TAIL_EVENT"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": 
"[2023/09/28 09:34:02] [ info] [input:tail:http.egress] inotify_fs_add(): inode=6815797 watch_fd=4 name=/var/log/xxxxxx/egress/tenant_xxxxxx-T/ggg-ppsf-xxxxxx-zzz-ngmsender/2023092809/n170ggg_zzzngmsender.log_202309280900"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:02.696+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:02] [debug] [input:tail:http.egress] [static files] processed 0b, done"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.044+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815795 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.046+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=9436"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.047+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815795 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.231+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815795 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.233+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=10492"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.234+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815795 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.234+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": 
"[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815796 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.239+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=33807"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.243+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=19095"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.360+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.egress] inode=6815797 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.362+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=17582"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.364+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.egress] inode=6815797 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.368+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=32977"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.372+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=12879"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.574+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [task] created task=0x7fb71f687540 id=0 OK"} {"version": "1.0.0", "timestamp": 
"2023-09-28T09:34:03.574+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [output:http:http.ingress] task_id=0 assigned to thread #1"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.574+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [task] created task=0x7fb71f6875b0 id=1 OK"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.574+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [output:http:http.egress] task_id=1 assigned to thread #0"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.574+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [task] created task=0x7fb71f687620 id=2 OK"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.574+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [output:http:http.tel] task_id=2 assigned to thread #1"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.580+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [http_client] not using http_proxy for header"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.587+00:00", "severity": "warn", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.587+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [ info] [output:http:http.tel] ggg-test-helm-xxxxxx-backend-simulator.stc-n170-ggg:2222, HTTP status=200"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.587+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [out flush] 
cb_destroy coro_id=0"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.587+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [task] destroy task=0x7fb71f687620 (task_id=2)"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.588+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [http_client] not using http_proxy for header"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.589+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [http_client] not using http_proxy for header"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.596+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.egress] inode=6815797 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.597+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=8636"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.601+00:00", "severity": "warn", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.601+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [ info] [output:http:http.egress] ggg-test-helm-xxxxxx-backend-simulator.stc-n170-ggg:2222, HTTP status=200"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.601+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [out flush] cb_destroy coro_id=2"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.602+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", 
"message": "[2023/09/28 09:34:03] [debug] [input:tail:http.egress] inode=6815797 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.607+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=32032"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.608+00:00", "severity": "warn", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [ warn] [http_client] cannot increase buffer: current=4096 requested=36864 max=4096"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.608+00:00", "severity": "info", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [ info] [output:http:http.ingress] ggg-test-helm-xxxxxx-backend-simulator.stc-n170-ggg:2222, HTTP status=200"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.608+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [out flush] cb_destroy coro_id=2"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.608+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [task] destroy task=0x7fb71f6875b0 (task_id=1)"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.610+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=11908"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.611+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [task] destroy task=0x7fb71f687540 (task_id=0)"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.671+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815796 events: IN_MODIFY "} {"version": "1.0.0", 
"timestamp": "2023-09-28T09:34:03.673+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=8829"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.675+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815796 events: IN_MODIFY "} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.680+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=33789"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.684+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input chunk] update output instances with new chunk size diff=19104"} {"version": "1.0.0", "timestamp": "2023-09-28T09:34:03.913+00:00", "severity": "debug", "service_id": "xxxxxx-logtransformer", "message": "[2023/09/28 09:34:03] [debug] [input:tail:http.ingress] inode=6815796 events: IN_MODIFY "}
Can anyone help explain the usage of buffer_chunk_size and buffer_max_size? Why does increasing buffer_chunk_size and buffer_max_size work when a single log line is quite large (about 4k or 5k)?
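To put the numbers from this thread side by side, here is a rough sizing sketch. It only does arithmetic on the figures quoted above (4~5k bytes per record, about 9000 logs per second, 32k versus 1M buffers) and does not claim to explain the loss. The helper names are hypothetical, and reading `32k` as 32,000 bytes and `1M` as 1,000,000 bytes is an assumption about the unit multiplier.

```python
# Figures quoted in this thread; helper names are hypothetical.
RECORD_BYTES = (4_000, 5_000)   # "4~5k bytes in each log"
LOGS_PER_SECOND = 9_000         # "about 9000 logs per second" in total
# Assumption: k and M are treated as decimal multipliers.
BUFFER_SIZES = {"32k": 32_000, "1M": 1_000_000}


def whole_records_per_buffer(buffer_bytes: int, record_bytes: int) -> int:
    """How many complete records of a given size fit into one read buffer."""
    return buffer_bytes // record_bytes


def bytes_per_second(logs_per_second: int, record_bytes: int) -> int:
    """Raw log volume produced per second at a given record size."""
    return logs_per_second * record_bytes


for label, size in BUFFER_SIZES.items():
    fits = [whole_records_per_buffer(size, r) for r in RECORD_BYTES]
    print(f"{label}: between {min(fits)} and {max(fits)} whole records per buffer")

low, high = (bytes_per_second(LOGS_PER_SECOND, r) for r in RECORD_BYTES)
print(f"volume: {low / 1e6:.0f} to {high / 1e6:.0f} MB/s across all inputs")
```

With records this large, a 32k buffer holds only a handful of complete records per read, while a 1M buffer holds a couple of hundred. Whether that difference is what drives the missing rate is exactly the open question in this issue.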
We face the same issue. Can anyone explain which values of the properties below would help us?
We can upload the following:
- Fluent Bit's log file with trace mode enabled
- Backup of the buffer files (*.flb) for the given tail input
- Files from the 'Output File' plugin
More details about my environment:
- Fluent Bit version: v1.9.10
- How are you measuring the loss? We use Loki; since log_path is unique, this is easy.
- What's the load? Average load for the last 24h: 40
- What's the record size? Files have up to 5000 lines, and the longest line in a file is up to 3000 characters.
- Common file size is between 2 and 3.5MB
Help: https://docs.fluentbit.io/manual/pipeline/inputs/tail
Current values we use:
[INPUT]
Name tail
Path /u02/app/*_DBA_HIST_*.out
Tag dbperf_stats
Alias tail.dbperf_stats
Path_Key fa.log_path
DB /var/lib/storage/dbperf_stats.db
DB.sync normal
DB.locking true
DB.journal_mode WAL
storage.type filesystem
Skip_Long_Lines On
Refresh_Interval 30
Rotate_Wait 30
Ignore_Older 15m
Read_from_Head False
Inotify_Watcher false
Mem_Buf_Limit 8MB
Buffer_Max_Size 2MB
Buffer_Chunk_Size 32K
Someone advised simply increasing Mem_Buf_Limit to 64MB:
[INPUT]
Name tail
Path /u02/app/*_DBA_HIST_*.out
Tag dbperf_stats
Alias tail.dbperf_stats
Path_Key fa.log_path
DB /var/lib/storage/dbperf_stats.db
DB.sync normal
DB.locking true
DB.journal_mode WAL
storage.type filesystem
Skip_Long_Lines On
Refresh_Interval 30
Rotate_Wait 30
Ignore_Older 15m
Read_from_Head False
Inotify_Watcher false
Mem_Buf_Limit 64MB
Buffer_Max_Size 2MB
Buffer_Chunk_Size 32K
The filer of this issue recommends setting Buffer_Chunk_Size and Buffer_Max_Size to 1MB. So does it make sense to set both of these properties to 2MB?
[INPUT]
Name tail
Path /u02/app/*_DBA_HIST_*.out
Tag dbperf_stats
Alias tail.dbperf_stats
Path_Key fa.log_path
DB /var/lib/storage/dbperf_stats.db
DB.sync normal
DB.locking true
DB.journal_mode WAL
storage.type filesystem
Skip_Long_Lines On
Refresh_Interval 30
Rotate_Wait 30
Ignore_Older 15m
Read_from_Head False
Inotify_Watcher false
Mem_Buf_Limit 8MB
Buffer_Max_Size 2MB
Buffer_Chunk_Size 2MB
It would be great if anyone could explain how the 3 Tail plugin properties varied above (Mem_Buf_Limit, Buffer_Max_Size, Buffer_Chunk_Size) work and how to debug their behavior in the Fluent Bit log.
Is there any general recommendation for avoiding data loss?
Hello,
Is this reproducible on a currently supported version (either 2.1.x or 2.2.x)?
In general, you shouldn't modify buffer_max_size or buffer_chunk_size. Why do you need to modify them?
If you are still able to reproduce, provide steps to do it so we can take a look at it.
Hello @lecaros,
While I prepare everything you need to reproduce the issue, could you please compare the version we use (1.9.10) with 2.1.x or 2.2.x from a memory/buffer/chunks point of view?
I changed buffer_chunk_size following the filer of this bug (who set both buffer_max_size and buffer_chunk_size to 1MB; see the top of this issue). So could you please explain why it is not a good idea to change them? After increasing buffer_chunk_size from 32KB (the default) to 2MB, I have seen much less data loss than before (1 file in 10).
Could you also explain the note below from the docs: https://docs.fluentbit.io/manual/administration/backpressure#storage.max_chunks_up
storage.max_chunks_up :
Please note that when storage.type filesystem is set, the Mem_Buf_Limit setting no longer has any effect,
instead, the [SERVICE] level storage.max_chunks_up setting controls the size of the memory buffer.
In my case we have set 'Mem_Buf_Limit 8MB' and 'storage.type filesystem' for the Tail plugin,
and 'storage.max_chunks_up 32' at the [SERVICE] level.
Does this mean there is no point in setting Mem_Buf_Limit at all when 'storage.type filesystem' and storage.max_chunks_up are set?
Thank you also for pointing out that our version, 1.9.10, is no longer supported. I will try the latest version. Ref. https://github.com/fluent/fluent-bit/security
@lecaros could you also update https://github.com/fluent/fluent-bit/discussions/5719 ?
Hello, were you able to test the latest version? These questions are still relevant if we want to troubleshoot this. Is this reproducible with 2.2.2? How are you measuring the loss? What's in the Fluent Bit log file? What's the load? What's the record size?
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This issue was closed because it has been stalled for 5 days with no activity.