rotation detection fails if lastedited time of rotated file and new logfile are the same -> creates false CRITICAL states
Hello,
I encountered a bug in the rotation handling of check_logfiles which occures when the rotated file has the same timestamp for last edited as the logfile itself. This frequently happens to us when using the logback project (http://logback.qos.ch/) because this engine rotates lazy: if we configure a rotation at midnight each day the log is rotated, when the first log entry is to be written after midnight. When this happens, the log is rotated and the line to be logged is then written into the (new) logfile. This usually means, both files have the same timestamp.
The bug is reproducible. I have created a test script to reproduce the issue. Please see https://gist.github.com/betagan/3bf1356adb98b580e91a
You will need a check_logfiles executable in the same directory as the script or change the CLCMD variable in the top. Other than that, just execute the script.
This shows, that although we only add "OK" type lines after the rotation, we still get a CRITICAL return code for the second execution of check_logfiles after the rotation.
example output is shown here: https://gist.github.com/betagan/df30f9010174333863da
we tracked down the issue to the following line in the check_logfiles executable, found in the collectfiles sub:
if ((stat $archive)[9] >=
$self->{laststate}->{logtime}) {
We have (temporarily) changed this to be > instead of >= but we don't know if this breaks anything down the road. Any comment regarding this is very much welcomed.
all tests performed with check_logfiles v3.6.2.1
We are happy to provide more information if required.
You're right! I'll examine this.
We had the same problem and made the same modification as betagan, but it does not work for type=rotating::uniform.
The debug logs are as follows:
Thu Sep 19 07:07:30 2019: ==================== /var/log/backup/dummy ==================
Thu Sep 19 07:07:30 2019: the newest uniform logfile i found is /var/log/backup/daily.log-20190919
Thu Sep 19 07:07:30 2019: rewrote uniform seekfile to /var/tmp/check_logfiles/check_logfiles._var_log_backup_uniformlogfile.daily.log
Thu Sep 19 07:07:30 2019: found seekfile /var/tmp/check_logfiles/check_logfiles._var_log_backup_uniformlogfile.daily.log
Thu Sep 19 07:07:30 2019: LS lastlogfile = /var/log/backup/daily.log-20190919
Thu Sep 19 07:07:30 2019: LS lastoffset = 79388 / lasttime = 1568743901 (Wed Sep 18 03:11:41 2019) / inode = 66305:12993010
Thu Sep 19 07:07:30 2019: found private state $VAR1 = {
'runcount' => 5,
'lastruntime' => 1568671650,
'logfile' => '/var/log/backup/daily.log-20190918'
};
Thu Sep 19 07:07:30 2019: this is not the same logfile 66305:12993010 != 66305:13082671
Thu Sep 19 07:07:30 2019: Log offset: 79388
Thu Sep 19 07:07:30 2019: looking for rotated files in /var/log/backup with pattern daily\.log-\d{8}
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190914 matches (modified Sat Sep 14 03:39:31 2019 / accessed Wed Sep 18 00:03:04 2019 / inode 12606024 / inode changed Sat Sep 14 03:39:31 2019)
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190915 matches (modified Sun Sep 15 03:10:46 2019 / accessed Wed Sep 18 00:03:04 2019 / inode 13079971 / inode changed Sun Sep 15 03:10:46 2019)
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190916 matches (modified Mon Sep 16 03:18:51 2019 / accessed Thu Sep 19 00:02:54 2019 / inode 13082655 / inode changed Mon Sep 16 03:18:51 2019)
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190917 matches (modified Tue Sep 17 03:27:36 2019 / accessed Thu Sep 19 00:02:54 2019 / inode 13082667 / inode changed Tue Sep 17 03:27:36 2019)
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190918 matches (modified Wed Sep 18 03:11:41 2019 / accessed Wed Sep 18 07:07:31 2019 / inode 12993010 / inode changed Wed Sep 18 03:11:41 2019)
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190919 matches (modified Thu Sep 19 03:16:46 2019 / accessed Thu Sep 19 03:15:09 2019 / inode 13082671 / inode changed Thu Sep 19 03:16:46 2019)
Thu Sep 19 07:07:30 2019: archive /var/log/backup/daily.log-20190919 was modified after Wed Sep 18 03:11:41 2019
Thu Sep 19 07:07:30 2019: opened logfile /var/log/backup/daily.log-20190919
Thu Sep 19 07:07:30 2019: logfile /var/log/backup/daily.log-20190919 (modified Thu Sep 19 03:16:46 2019 / accessed Thu Sep 19 03:15:09 2019 / inode 13082671 / inode changed Thu Sep 19 03:16:46 2019)
Thu Sep 19 07:07:30 2019: first relevant files: daily.log-20190919, daily.log-20190919
Thu Sep 19 07:07:30 2019: /var/log/backup/daily.log-20190919 has fingerprint 66305:13082671:81593
Thu Sep 19 07:07:30 2019: /var/log/backup/daily.log-20190919 has fingerprint 66305:13082671:81593
Thu Sep 19 07:07:30 2019: skipping /var/log/backup/daily.log-20190919 (identical to /var/log/backup/daily.log-20190919)
Thu Sep 19 07:07:30 2019: relevant files: daily.log-20190919
Thu Sep 19 07:07:30 2019: moving to position 79388 in /var/log/backup/daily.log-20190919
Thu Sep 19 07:07:30 2019: stopped reading at position 81593
Thu Sep 19 07:07:30 2019: keeping position 81593 and time 1568830606 (Thu Sep 19 03:16:46 2019) for inode 66305:13082671 in mind
Because daily.log-20190918 is not included in @rotatedfiles, daily.log-20190919 is read from the last offset of daily.log-20190918, not from the beginning. This can be avoided by including files which has (1) mtime greater than previous logtime, or (2) mtime equals previous logtime AND size equals previous offset, i.e.
if ((stat $archive)[9] > $self->{laststate}->{logtime} ||
(stat $archive)[9] == $self->{laststate}->{logtime} &&
$self->getfilesize($archive) == $self->{laststate}->{logoffset}) {
The above change does not work if more than one rotation occurs within one second, but it is a vary rare case and can be ignored, I think.