logstash-input-file

Files on NFS volume vs sincedb

Open benoiton opened this issue 10 years ago • 7 comments

My logs are on an NFS volume. They are parsed correctly, but after a reboot they are parsed again.

The reason is the sincedb format. This file identifies each processed file by its major+minor+inode. The minor device number is not the same each time the same NFS volume is mounted on the same NFS client. Therefore, old files are seen as new files after a reboot.
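To make the identity problem concrete, here is a minimal sketch of how a major+minor+inode key can be read from the filesystem. It is an illustration only, not the plugin's code: the function name `sincedb_style_key` is hypothetical, and it assumes a Unix-like system where `os.major` and `os.minor` are available.

```python
import os
import tempfile


def sincedb_style_key(path):
    """Return (major, minor, inode) for path, the triple described above.

    Hypothetical helper for illustration; not part of logstash-input-file.
    """
    st = os.stat(path)
    # st_dev packs the device's major and minor numbers; st_ino is the inode.
    return (os.major(st.st_dev), os.minor(st.st_dev), st.st_ino)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        # The same file yields the same key only while the device numbers
        # stay stable, which is what a remount of an NFS volume can break.
        print(f.name, sincedb_style_key(f.name))
```

Running this against the same file before and after a remount would show whether the minor component of the key changed while the path and inode stayed the same.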

Why not identify processed files by their full path?

benoiton • May 18 '15 12:05

This is indeed also necessary when using rsync, as mentioned in the pending PR https://github.com/jordansissel/ruby-filewatch/pull/34

wiibaa • May 18 '15 17:05

I haven't confirmed if this is still a bug, but I agree about the problem. Logstash should somehow detect that the file being watched is on a remote filesystem or allow users to explicitly follow files by path name (not implicit inode tracking).
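As a rough sketch of what following files by path name could look like, the snippet below keeps read offsets keyed by resolved path instead of by inode. The names `PathOffsetStore` and `read_new_lines` are hypothetical, and this is not how the plugin is implemented. Note the trade-off: a path key survives a remount, but it loses track of a file that is renamed or rotated, which is the case inode tracking handles.

```python
import os
import tempfile


class PathOffsetStore:
    """Hypothetical store that remembers read offsets by resolved path."""

    def __init__(self):
        self._offsets = {}

    def _key(self, path):
        # Resolve symlinks and relative segments so one file has one key.
        return os.path.realpath(path)

    def get(self, path):
        return self._offsets.get(self._key(path), 0)

    def set(self, path, offset):
        self._offsets[self._key(path)] = offset


def read_new_lines(store, path):
    """Return complete lines appended since the last call for this path."""
    start = store.get(path)
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read()
    # Only consume up to the last newline so partial lines are re-read later.
    end = data.rfind(b"\n") + 1
    store.set(path, start + end)
    return data[:end].decode("utf-8", errors="replace").splitlines()


if __name__ == "__main__":
    store = PathOffsetStore()
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
        f.write("first\nsecond\n")
    print(read_new_lines(store, f.name))  # lines read on the first pass
    print(read_new_lines(store, f.name))  # nothing new on the second pass
    os.remove(f.name)
```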

jordansissel • Jul 23 '15 22:07

I can confirm this is still an issue. I was bulk importing from an NFS volume, and after a reboot and remount, files that had already been processed were processed again. I use a fixed sincedb file and noticed that the minor value changed from 24 to 25.

splashx • Sep 12 '15 18:09

I think we can close this - read more.

splashx • Nov 02 '15 08:11

Except, we have not done the fingerprinting bit yet.

guyboertje • May 14 '18 16:05

Also showing in https://discuss.elastic.co/t/logstash-cant-read-some-files/143847

lucabelluccini • Oct 08 '18 12:10

I wrote a Ruby script to help me deal with this problem.

The script modifies the inode info (which may change after a remount) in the sincedb file when using Logstash's logstash-input-file plugin on NFS. https://gist.github.com/zhenchuan/10bd5eafb6c4058a83c17e053278d889
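For readers who want to see the general idea without opening the gist, here is a minimal Python sketch of remapping the minor device number in sincedb records. It is not the linked script. It assumes each record starts with the space-separated fields inode, major, minor, offset, so check that against your own sincedb file first; the function name `remap_minor` is hypothetical. The 24 and 25 in the example come from the comment above, while the inode and offset values are made up.

```python
def remap_minor(lines, old_minor, new_minor):
    """Rewrite the minor device number in sincedb-style records.

    Assumption: each record begins with "inode major minor offset",
    separated by single spaces. Any trailing fields are kept unchanged.
    """
    out = []
    for line in lines:
        fields = line.split(" ")
        # Field index 2 is the minor device number under the assumed layout.
        if len(fields) >= 4 and fields[2] == str(old_minor):
            fields[2] = str(new_minor)
        out.append(" ".join(fields))
    return out


if __name__ == "__main__":
    records = ["12345 0 24 1024", "67890 0 30 2048"]
    print(remap_minor(records, 24, 25))
    # → ['12345 0 25 1024', '67890 0 30 2048']
```

A rewrite like this is probably safest while Logstash is stopped, so the plugin does not overwrite the file with its in-memory state.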

zhenchuan • Dec 31 '21 10:12