fluentd icon indicating copy to clipboard operation
fluentd copied to clipboard

memory leak with `pattern not matched` error

Open Watson1978 opened this issue 6 months ago • 6 comments

Describe the bug

When many logs do not match the pattern, memory usage increases significantly.

To Reproduce

  • conf
<source>
  @type tail
  tag test
  path "#{File.expand_path '~/tmp/fluentd/access-*.log'}"
  refresh_interval 5s
  read_from_head true
  <parse>
    @type regexp
    expression /^(?<log>a+)+$/
    timeout 0.1s
  </parse>
</source>

<match **>
  @type stdout
</match>
  • script to generate log data
# frozen_string_literal: true
require "fileutils"

FILE_MAX_SIZE = 100 * 1024 * 1024
FILE_PATH = File.expand_path "~/tmp/fluentd/access-1.log"

dir = File.dirname(FILE_PATH)
FileUtils.mkdir_p(dir)

File.open(FILE_PATH, "w") do |f|

  data = "#{'a' * 1024}X"
  loop do
    f.puts data
    break if File.size(FILE_PATH) > FILE_MAX_SIZE
  end
end

When try to reproduce, fluentd will display the warning log as following and increase memory usage during handling the message.

2025-06-04 18:18:37 +0900 [info]: following tail of /home/watson/tmp/fluentd/access-1.log
2025-06-04 18:18:37 +0900 [warn]: pattern not matched: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaX"
2025-06-04 18:18:37 +0900 [warn]: pattern not matched: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaX"
...

Memory usage increases linearly while handling the relevant logs.

Image

Expected behavior

Less memory usaage

Your Environment

- Fluentd version:
- Package version:
- Operating system:
- Kernel version:

Your Configuration

<source>
  @type tail
  tag test
  path "#{File.expand_path '~/tmp/fluentd/access-*.log'}"
  refresh_interval 5s
  read_from_head true
  <parse>
    @type regexp
    expression /^(?<log>a+)+$/
    timeout 0.1s
  </parse>
</source>

<match **>
  @type stdout
</match>

Your Error Log

see above

Additional context

No response

Watson1978 avatar Jun 04 '25 09:06 Watson1978

workaround 1

As workaround, looks like this issue can be avoided by specifying tags strictly with match, like

<match test>
  @type stdout
</match>

instead of <match **>.

workaround 2

If we have used label properly, looks like we can avoid this issue.

<source>
  @type tail
  tag test
  @label @sample ## add
  path "#{File.expand_path '~/tmp/fluentd/access-*.log'}"
  refresh_interval 5s
  read_from_head true
  <parse>
    @type regexp
    expression /^(?<log>a+)+$/
    timeout 0.1s
  </parse>
</source>

<label @sample> ## add
  <match **>
    @type stdout
  </match>
</label>

Watson1978 avatar Jun 05 '25 05:06 Watson1978

I see...

Looks like FluentLogEventRouter has some memory issues.

daipom avatar Jun 05 '25 06:06 daipom

If we implement https://github.com/fluent/fluentd/issues/1504, it might improve this issue.

Watson1978 avatar Jun 09 '25 01:06 Watson1978

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days

github-actions[bot] avatar Jul 09 '25 10:07 github-actions[bot]

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days

github-actions[bot] avatar Aug 09 '25 10:08 github-actions[bot]

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days

github-actions[bot] avatar Oct 01 '25 10:10 github-actions[bot]