fluentd icon indicating copy to clipboard operation
fluentd copied to clipboard

Duplication of logs even with follow_inodes true

Open pstray opened this issue 3 years ago • 3 comments

Describe the bug

Log duplication occurs on log-rotation even when follow_inodes is true. The pos-file seems to contain the right entries from what I assume it should have.

To Reproduce

I have the following in a block:

@type tail
path /var/log/some.log*
exclude_path [ "/var/log/*.gz" ]
pos_file /var/log/td-agent/some.log.pos
follow_inodes true
read_from_head true

When some.log gets rotated to some.log.1, the whole contents of the file is resent.

Expected behavior

From the documentation of follow_inodes, I assumed this should work as expected and only new entries written to the rotated logfile (the ones written after the rotation but before the process logging to the file is HUPed) should have been sent.

Your Environment

- Fluentd version: 1.13.3
- TD Agent version: 4.2.0
- Operating system: Debian GNU/Linux 11 (bullseye)
- Kernel version: Debian 5.10.46-5 (2021-09-23) x86_64 GNU/Linux

Your Configuration

<source>
  @type tail
  path /var/log/some.log*
  exclude_path [ "/var/log/*.gz" ]
  pos_file /var/log/td-agent/some.log.pos
  follow_inodes true
  read_from_head true
</source>

Your Error Log

2021-10-13 00:01:20 +0200 [info]: #0 force flushing buffered events
2021-10-13 00:01:20 +0200 [info]: #0 detected rotation of /var/log/some.log.1; waiting 5 seconds
2021-10-13 00:01:20 +0200 [info]: #0 flushing all buffer forcedly
2021-10-13 00:01:20 +0200 [info]: #0 detected rotation of /var/log/some.log; waiting 5 seconds
2021-10-13 00:01:20 +0200 [info]: #0 following tail of /var/log/some.log
2021-10-13 00:02:10 +0200 [info]: #0 following tail of /var/log/some.log.1

Additional context

No response

pstray avatar Oct 13 '21 08:10 pstray

I've found a case to reproduce it with v1.13.3:

  • some.log.1 and some.log.2 aren't exist initially
  • 1st rotation doesn't cause duplication
    • move some.log to some.log.1
    • write new lines to some.log
  • On 2nd rotation, logs in some.log.2 are wrongly output
    • move some.log.1 to some.log.2
    • move some.log to some.log.1
    • write new lines to some.log

But it's not reproduce with current HEAD (90766d7c230ea31e51b2e6ab585960bea772eefa). #3541 seems fix this issue. Please try 1.14.3 after it's released.

ashie avatar Nov 04 '21 13:11 ashie

I'm getting logs duplication with follow_inodes with fluentd 1.14.5. In my case fluentd is used with splunk-connect-for-kubernetes solution. Logs are located on NFS share and their pattern is /var/log/nfs/*/mylog.log where * is some unique id.

Here's how config looks like:

  <source>
      @id tail.file.rundeck-access-logs
      @type tail
      @label @CUSTOM
      tag tail.file.rundeck_logs_access
      path /var/log/nfs/*/volumes/kubernetes.io~nfs/rundeck-logs/rundeck.access.log
      pos_file /fluentd/log/splunk-fluentd-rundeck-access-logs.pos
      read_from_head true
      follow_inodes true
      path_key source
      <parse>
        @type regexp
        expression /^(?<log>.*)$/
        time_key time
        time_type string
        time_format %Y-%m-%dT%H:%M:%SZ
      </parse>
    </source>

vzabawski avatar Mar 16 '22 10:03 vzabawski

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

github-actions[bot] avatar Jun 15 '22 10:06 github-actions[bot]

One cause of duplicate logs has been clarified and resolved with the following fix:

  • #4237

Please try v1.16.2 after the package is released. If you still experience duplicate logs with that version, please tell us again.

daipom avatar Jul 17 '23 15:07 daipom

Just in case, there is still a possibility of duplicate logs:

  • #4243

but I think it is not related to this issue and this issue is resolved by #4237. (because the whole content of one file was duplicated.)

daipom avatar Jul 17 '23 16:07 daipom

Hi, I'm still experiencing the same issue of duplicated logs for version v1.16.2:

2023-10-08 09:24:38 +0000 [info]: gem 'fluentd' version '1.16.2'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-concat' version '2.5.0'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-dedot_filter' version '1.0.0'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-detect-exceptions' version '0.0.15'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '5.2.5'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-grok-parser' version '2.6.2'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-json-in-json-2' version '1.0.2'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '3.2.0'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-parser-cri' version '0.1.1'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-prometheus' version '2.1.0'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.1'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2023-10-08 09:24:38 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'

Fluentd Config:

<source>
  @type tail
  path /var/log/docker/containers/*/*.log
  pos_file /var/log/docker/fluentd-containers.log.pos
  read_from_head true
  follow_inodes true
  <parse>
    @type json
  </parse>
</source>

jamslinger avatar Oct 08 '23 09:10 jamslinger

@jamslinger Thanks for your report. Could you tell us some points?

  • Are the duplicated logs an entire file? or part of a file?
  • Does the duplication occur around log rotation?

I think the entire file duplication problem is solved, but part of a file duplication problem still exists.

there is still a possibility of duplicate logs:

  • #4243

but I think it is not related to this issue and this issue is resolved by #4237. (because the whole content of one file was duplicated.)

daipom avatar Oct 15 '23 10:10 daipom