
Fluent::Plugin::Buffer::BufferOverflowError

dduyon2 opened this issue · 1 comment

Describe the bug

When I send UDP packets to fluent-bit at 15 MB/s and fluent-bit forwards them to Fluentd, this error occurs after about 5–6 hours. I have already read the Fluentd documentation, which describes four tunable parameters (flush_interval, workers, flush_thread_count, total_limit_size). I have already changed these parameters, but the buffer overflow happens again.

In my configuration, workers is 8. The BufferOverflowError occurs as soon as the total chunk size of a worker reaches 64K. How do I change the total chunk size per worker? Or how do I solve this problem otherwise? I don't think the four parameters above help with this problem.
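For reference, a quick sanity check on the posted buffer settings (my own arithmetic, not from the Fluentd docs, and assuming each on-disk chunk corresponds to one buffer file):

```shell
#!/bin/sh
# Rough upper bound on chunk files per worker, from the config below.
chunk_limit_size=900000        # bytes per chunk (chunk_limit_size)
total_limit_size=5120000000    # bytes per worker buffer (total_limit_size)
max_chunks=$(( total_limit_size / chunk_limit_size ))
echo "max chunk files per worker: $max_chunks"   # prints 5688
```

That is far above the common default per-process open-file limit of 1024, so the buffer can hit the file-descriptor ceiling long before total_limit_size is reached.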

To Reproduce

Send UDP packets to Fluentd at 15 MB/s using a packet sender.

Expected behavior

.

Your Environment

- Fluentd version:1.14.6
- Operating system: CentOS 7.9
- Kernel version:3.10.0-1160.15.2.el7.x86_64

Your Configuration

<system>
  workers 8
  log_level error
  <log>
    rotate_age 5
    rotate_size 104857600
  </log>
</system>

<source>
  @type forward
  tag from_bit
  port 43000
  bind [address]
</source>

<match from_bit.**>
  @type kafka2
  brokers [broker1]
  topic_key logStorage
  default_topic logStorage
  <format>
    @type default-format
  </format>
  <buffer logStorage>
    @type file
    path /var/log/td-agent/buffer/td
    flush_mode interval
    flush_interval 5s
    chunk_limit_size 900000
    flush_thread_count 8
    total_limit_size 5120000000
  </buffer>
</match>

<label @FLUENT_LOG>
  <match **>
    @type file
    path /var/log/td-agent/info_log
  </match>
</label>

Your Error Log

2022-08-05 16:04:27 +0900 [error]: #6 unexpected error on reading data host="**.**.**.**" port=37066 error_class=Fluent::Plugin::Buffer::BufferOverflowError error="can't create buffer file for /var/log/td-agent/buffer/td/worker6/buffer.*.log. Stop creating buffer files: error = Too many open files @ rb_sysopen - /var/log/td-agent/buffer/td/worker6/buffer.b5e57913aaef7106c1f6368c7bcd37ca6.log"
  2022-08-05 16:04:27 +0900 [error]: #6 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/buffer/file_chunk.rb:289:in `rescue in create_new_chunk'
2022-08-05 16:04:27 +0900 [error]: #6 failed to emit fluentd's log event tag="fluent.error" event={"host"=>"**.**.**.**", "port"=>37066, "error"=>"#<Fluent::Plugin::Buffer::BufferOverflowError: can't create buffer file for /var/log/td-agent/buffer/td/worker6/buffer.*.log. Stop creating buffer files: error = Too many open files @ rb_sysopen - /var/log/td-agent/buffer/td/worker6/buffer.b5e57913aaef7106c1f6368c7bcd37ca6.log>", "message"=>"unexpected error on reading data host=\"**.**.**.**\" port=37066 error_class=Fluent::Plugin::Buffer::BufferOverflowError error=\"can't create buffer file for /var/log/td-agent/buffer/td/worker6/buffer.*.log. Stop creating buffer files: error = Too many open files @ rb_sysopen - /var/log/td-agent/buffer/td/worker6/buffer.b5e57913aaef7106c1f6368c7bcd37ca6.log\""} error_class=Fluent::Plugin::Buffer::BufferOverflowError error="can't create buffer file for /var/log/td-agent/info_log/worker6/buffer.*.log. Stop creating buffer files: error = Too many open files @ rb_sysopen - /var/log/td-agent/info_log/worker6/buffer.b5e57913ab649c4f1c418e5f131fcde24.log"

Additional context

No response

dduyon2 avatar Aug 05 '22 07:08 dduyon2

@dduyon2

The actual cause is Too many open files @ rb_sysopen.

Has a large number of buffer files accumulated?

It seems that a large number of buffer files have been created, and the maximum number of files that one process can open simultaneously has been exceeded.

You can check the maximum number with $ ulimit -n. You can raise this limit, but if the transmit throughput cannot keep up with the receive throughput, buffer files will keep accumulating as time passes.
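A minimal way to compare actual descriptor usage against that limit (a sketch assuming Linux with /proc; the PID here is a stand-in — in practice use the fluentd worker's PID, e.g. from `pgrep -f fluentd`):

```shell
#!/bin/sh
# Count file descriptors a process currently holds open (Linux /proc assumed).
PID=$$                                   # stand-in; substitute the fluentd worker PID
fd_count=$(ls "/proc/$PID/fd" | wc -l)
echo "open fds for pid $PID: $fd_count"
echo "per-process limit: $(ulimit -n)"
```

If the count is near the limit while buffer files pile up, the cause matches this error.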

If extra resources are available, you need to increase the transmit throughput by increasing workers and flush_thread_count, or reduce the receive throughput.
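If raising the open-file limit is also appropriate for your setup, one common approach (an assumption about this deployment, since td-agent on CentOS 7 typically runs under systemd) is a unit drop-in; the path below is the standard override location, not something from this thread:

```ini
# /etc/systemd/system/td-agent.service.d/override.conf
[Service]
LimitNOFILE=65536
```

After writing the file, run `sudo systemctl daemon-reload && sudo systemctl restart td-agent`. Note this only buys headroom; if flushing still cannot keep up with input, the files will eventually accumulate to the new limit as well.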

daipom avatar Aug 24 '22 14:08 daipom

I'm closing this now. If you have problems again, please reopen this issue.

daipom avatar Sep 27 '22 01:09 daipom