Fluentd cannot flush old buffer files to Elasticsearch
Describe the bug
I have been running Fluentd for a long time, with a forward input plugin receiving events and an elasticsearch output plugin.
Recently I found many residual old buffer files that cannot be flushed to Elasticsearch, even after restarting Fluentd.
To Reproduce
Run Fluentd for a long time; the buffer output rate does not exceed the Elasticsearch cluster's write rate.
Expected behavior
Old buffer files created in 2024 should all be flushed to Elasticsearch normally.
Your Environment
- Fluentd version: 1.16.8
- Package version: docker
- Operating system: Debian GNU/Linux 12 (bookworm)
- Kernel version: 5.4.61-050461-generic
Your Configuration
<label @FLUENT_LOG>
  <match fluent.*>
    @type null
  </match>
</label>

<source>
  @type forward
  port 24224
  skip_invalid_event true
</source>

<match **>
  @type elasticsearch_dynamic
  @id elasticsearch
  host "elasticsearch"
  ssl_verify false
  validate_client_version true
  reconnect_on_error true
  index_name "${tag_parts[-1]}"
  reload_connections false
  reload_on_failure true
  time_key "timestamp"
  time_key_exclude_timestamp true
  utc_index false
  slow_flush_log_threshold 120.0
  request_timeout 120s
  bulk_message_request_threshold -1
  suppress_type_name true
  default_elasticsearch_version 7
  <buffer>
    @type "file_single"
    path "/log/buffer/elasticsearch"
    chunk_format text
    total_limit_size 9G
    chunk_limit_size 15M
    retry_type periodic
    retry_wait 60s
    flush_mode interval
    flush_interval 15s
    flush_thread_count 4
    retry_forever true
    overflow_action block
  </buffer>
</match>

<source>
  @type prometheus
  port 24231
  metrics_path "/prometheus"
  aggregated_metrics_path "/metrics"
</source>

<source>
  @type prometheus_output_monitor
</source>

<system>
  root_dir "/tmp/fluentd-buffers/"
  rpc_endpoint "0.0.0.0:24444"
  suppress_repeated_stacktrace true
  ignore_same_log_interval 60s
  ignore_repeated_log_interval 60s
  emit_error_log_interval 60s
</system>
Your Error Log
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c71e00ab834f19da9d5db50bd634b.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c7ea812c3695eb6cc161bea831269.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c7eabb2f376bbd62ff6ccfef64390.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c7ef249400acd82b6d598e606feb2.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c7ef91a6bad2fe4a11961bcd606c3.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c7efecd58a8b9a0bf75c55a257448.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c80a0ef5503ff376690732b2d5ad4.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c80a51a8796a8df795a173129407f.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c80a9184ca99991197671ec212dcd.buf
2025-05-12 09:26:19 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c84ec8fbed06daa461ba22a545b60.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c84f087dd2ab57d0208527f64555a.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c84f593d35209eabdb9e6bed40514.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c8567e96570f0671e9e54aae2b0fe.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c8572a762f5b167529536fec2b847.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c85805a39aca7d469618b57d2427a.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c85e586f14c778b36452245446937.buf
2025-05-12 09:26:20 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c85e9e2cd0faca0688dfabc35a55e.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c85f052485f8ce9c1cc8fd19ce395.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c85f302bfe8668e18663093e96f9b.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c86384b198fe7cb900a53f766ca27.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c863c535025520f5d165fedbde326.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c863f8d8e30abe7506a652f2c7791.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c87e9d8833fcaf9278a14a5ed8f92.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c87eac0bdb01332586cf6eed10e5f.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c87eafb484e72f728622dc5c04ce3.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c8807801edb18ca4855922029530f.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c880b6279b4ab31274efa7ce5b163.buf
2025-05-12 09:26:21 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c8811ca7f52996ad570bb2b3846d2.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c881c5ff3eff403c83c5c96fe8771.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c89a80af066cc9cb9f2210140543d.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c89aaa881e45ef11b2b84647678cf.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.pod.openstack.b622c89ada4fe632312b3a241ef5eb634.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.syslog.system.b634ece54e21704344a7bc57047a62b55.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] restoring buffer file: path = /log/buffer/elasticsearch/worker2/fsb.volume.audit.b634ece57b786e69735c67b06a6a0a6c6.buf
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] buffer started instance=2520 stage_size=46423238 queue_size=0
2025-05-12 09:26:22 +0000 [debug]: #2 fluent/log.rb:341:debug: listening prometheus http server on http:://0.0.0.0:24233//prometheus for worker2
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] flush_thread actually running
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] flush_thread actually running
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] flush_thread actually running
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] flush_thread actually running
2025-05-12 09:26:22 +0000 [debug]: #2 fluent/log.rb:341:debug: Start async HTTP server listening http://0.0.0.0:24233
2025-05-12 09:26:22 +0000 [debug]: #2 [elasticsearch] enqueue_thread actually running
2025-05-12 09:26:22 +0000 [debug]: #2 fluent/log.rb:341:debug: 0.0s: Async::IO::Socket
| Binding to #<Addrinfo: 0.0.0.0:24233 TCP>
2025-05-12 09:26:22 +0000 [info]: #2 fluent/log.rb:362:info: listening port port=24224 bind="0.0.0.0"
2025-05-12 09:26:22 +0000 [info]: #2 fluent/log.rb:362:info: fluentd worker is now running worker=2
Additional context
-rw-r--r-- 1 root root 15140776 Sep 23  2024 fsb.pod.openstack.b622c89a80af066cc9cb9f2210140543d.buf
-rw-r--r-- 1 root root 15503561 Sep 23  2024 fsb.pod.openstack.b622c89aaa881e45ef11b2b84647678cf.buf
-rw-r--r-- 1 root root 15679860 Sep 23  2024 fsb.pod.log.b622c8a645e8e4e8279b5f6c96b6fd8d2.buf
-rw-r--r-- 1 root root 15097368 Sep 23  2024 fsb.pod.log.b622c8a6afe0c4a68f1421cbd725d23da.buf
-rw-r--r-- 1 root root 15140143 Sep 23  2024 fsb.pod.log.b622c8a75e18ff5a710060ac859220fc7.buf
-rw-r--r-- 1 root root 15043892 Sep 23  2024 fsb.pod.log.b622c8c27709c06be4b61c188b51203ff.buf
-rw-r--r-- 1 root root 15142960 Sep 23  2024 fsb.pod.log.b622c8c2aba3e3b0156fe869a3ca1b34f.buf
-rw-r--r-- 1 root root 15001516 Sep 23  2024 fsb.pod.log.b622c8c2e83a15ae83de4e8265b20d8ea.buf
-rw-r--r-- 1 root root 14945271 Sep 23  2024 fsb.pod.log.b622c8c7827dd8b1aad3dd9d13034334e.buf
-rw-r--r-- 1 root root 14993627 Sep 23  2024 fsb.pod.log.b622c8c7924e717059032fe5f604328d5.buf
-rw-r--r-- 1 root root 14961563 Sep 23  2024 fsb.pod.log.b622c8c7b1c9e14ebc5b1b4e781ee9e48.buf
-rw-r--r-- 1 root root 15029793 Sep 23  2024 fsb.pod.log.b622c8c7c4a55ef03876af1a7011b8eae.buf
-rw-r--r-- 1 root root 15027225 Sep 23  2024 fsb.pod.log.b622c8c7e52a7a0fb03731cc72a94d2cc.buf
-rw-r--r-- 1 root root 15400896 Sep 23  2024 fsb.pod.log.b622c8d01b999f82a2fc6b4f1415b4821.buf
-rw-r--r-- 1 root root 15427766 Sep 23  2024 fsb.pod.log.b622c8d07d331a17c8b7866b4440aeeb2.buf
-rw-r--r-- 1 root root 15319755 Sep 23  2024 fsb.pod.log.b622c8d0b05891029d86404a103e503f6.buf
-rw-r--r-- 1 root root 15384163 Sep 23  2024 fsb.pod.log.b622c8d0e1e937b7eddc4a3609069dbeb.buf
-rw-r--r-- 1 root root 15377167 Sep 23  2024 fsb.pod.log.b622c8d1205ad018018c335779963dbec.buf
-rw-r--r-- 1 root root     3552 May 12 09:34 fsb.syslog.system.b634ed06629e693e14cdb8272eddde615.buf
-rw-r--r-- 1 root root   525598 May 12 09:34 fsb.pod.log.q634ed05d8a6727806be186b60af1e2fe.buf
-rw-r--r-- 1 root root     2062 May 12 09:34 fsb.kubelet.kubernetes.b634ed0641914ab0d55a354cda45eef16.buf
@luckyzzr Thanks for your report.
The cause is the existence of duplicate staged chunk files with the same key.
When using the file_single buffer, the key is included in the chunk filename.
For example, the key of the following chunk file is pod.openstack.
fsb.pod.openstack.b622c89a80af066cc9cb9f2210140543d.buf
The keys of staged chunks must be unique. New events are added to the chunk for the corresponding key, or, if no chunk exists for that key, a new chunk is created. That chunk is enqueued after 15 seconds, flushed, and then removed, according to the following settings:
flush_mode interval
flush_interval 15s
If duplicate keys exist, Fluentd does not recognize the extra chunks, and they are never processed. This is why the old files remain even after a restart.
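For reference, here is a minimal sketch (not an official Fluentd tool; it assumes the file_single naming scheme fsb.<key>.<chunk_id>.buf and the buffer path from the configuration above) that lists any buffer key with more than one staged chunk file in the same worker directory, which is the condition described here:

#!/usr/bin/env python3
# Minimal sketch: report file_single buffer keys that have more than one
# staged chunk file in the same worker directory.
import collections
import pathlib
import re

# Buffer path taken from the configuration in this report; adjust as needed.
BUFFER_DIR = pathlib.Path("/log/buffer/elasticsearch")

# file_single chunk files are named "fsb.<key>.<chunk_id>.buf"; staged chunk
# IDs start with "b", enqueued ones with "q".
STAGED = re.compile(r"^fsb\.(?P<key>.+)\.b[0-9a-f]+\.buf$")

staged = collections.defaultdict(list)
for path in BUFFER_DIR.rglob("fsb.*.buf"):
    m = STAGED.match(path.name)
    if m:
        # Group per worker directory, since keys only need to be unique there.
        staged[(path.parent, m.group("key"))].append(path.name)

for (directory, key), files in sorted(staged.items()):
    if len(files) > 1:
        print(f"{directory}: {len(files)} staged chunks share key {key!r}")
        for name in sorted(files):
            print("  " + name)

If the diagnosis above is right, keys such as pod.openstack and pod.log from the listing in the report should show up with many staged files each.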
It's hard to imagine this happening normally. Have you moved any files manually?
You can flush these chunks as follows.
- Add flush_at_shutdown to the buffer setting (a full example follows below this list):
  <buffer>
    @type "file_single"
    ...
    flush_at_shutdown
  </buffer>
- Repeat restarts until all chunk files are gone.
  - Each time it starts and stops, Fluentd will flush one chunk file per key.
  - flush_at_shutdown is needed to make sure the remaining chunks are flushed when Fluentd stops.
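For clarity, this is roughly how the buffer section from the configuration above would look with that change applied (all other parameters unchanged; flush_at_shutdown defaults to false for persistent buffers such as file_single, so it has to be enabled explicitly):

  <buffer>
    @type "file_single"
    path "/log/buffer/elasticsearch"
    chunk_format text
    total_limit_size 9G
    chunk_limit_size 15M
    retry_type periodic
    retry_wait 60s
    flush_mode interval
    flush_interval 15s
    flush_thread_count 4
    retry_forever true
    overflow_action block
    flush_at_shutdown true
  </buffer>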
This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days
This issue was automatically closed because it remained stale for 7 days