lnav
lnav copied to clipboard
timestamp/log_time issue in access_log
lnav version lnav 0.10.0, macos 11.5.2, macbook pro M1
Describe the bug I have an apache log file. I am simply trying to see daily traffic aggregates:
;select date(log_time) as d, count(1) as hits from access_log group by d order by hits desc
d hits
2021-08-17│120177 <- completely wrong
2021-02-11│ 4590
2021-07-10│ 4375 <- totally different in summarize
2021-02-12│ 3440
2021-02-10│ 3114
2021-04-08│ 2415
2021-06-22│ 2147
2021-06-14│ 1344
2021-04-07│ 1232
or
:summarize date(log_time)
c_date(log_time) count_date(log_time)
2021-08-17 │ 118531 <- completely wrong
2021-02-11 │ 4590
2021-02-12 │ 3440
2021-02-10 │ 3114
2021-07-10 │ 2791
2021-04-08 │ 2415
2021-06-22 │ 2147
2021-06-14 │ 1344
2021-04-07 │ 1231
First of all the last day of the log 17 aug is totally off:
$ rg -c 17/Aug forum.log
80
;select count(*) from access_log where date(log_time) = '2021-08-17';
✔ SQL Result: 120177
Maybe i just don't understand how summarize and date() work, but the other days have also differences:
$ rg -c 11/Feb forum.log
9222
;select count(*) from access_log where date(log_time) = '2021-02-11';
✔ SQL Result: 4590
while we are here, i find it somewhat confusing that p shows timestamp, but this column does not exist and cannot be used in queries...
right, so i think what's confusing lnav is that this logfile is a concatenation of many log files and the dates can be out of order:
dec
jan
feb
...
dec
jan
feb
...
i have seen that lnav tries to be smart about delayed log records as well which is quite common in web logs. the logline is written after the request has finished which can make records out of order. so i am not sure i am breaking something here by not having a linear time log file :}
the reason for this is that i am putting together multiple log files from multiple front proxies. i am not sure it's trivial to merge these files.
after i separated the files, the sql query results now mach the grep results. however summarize is still showing different numbers:
c_date(log_time) count_date(log_time)
2021-07-10 │ 2610
2021-04-08 │ 2411
2021-06-22 │ 2147
2021-06-14 │ 1342
2021-04-07 │ 1212
;select date(log_time) as d, count(1) as hits from access_log group by d order by hits desc
d hits
2021-07-10│4194
2021-04-08│2411
2021-06-22│2147
2021-06-14│1342
2021-04-07│1213
% rg -c 10/Jul forum.log
4194
% rg -c 08/Apr forum.log
2411
% rg -c 22/Jun forum.log
2147
% rg -c 14/Jun forum.log
1342
% rg -c 07/Apr forum.log
1213
further down i see more differences between summarize and the query so summarize must be doing something else as well. i guess they must be counting something else (count_date(log_time))? i might be abusing summarize with the function call...
oh wow, lnav just keeps surprising me...
$ lnav log1 log2 ...
while we are here, i find it somewhat confusing that p shows timestamp, but this column does not exist and cannot be used in queries...
I've pushed a fix for this.
right, so i think what's confusing lnav is that this logfile is a concatenation of many log files and the dates can be out of order:
Yes, lnav does not always resort the log messages as they are "received" since that can be costly. In cases where the timestamp in the log message is before the previous log message, the new message gets the same "received" timestamp as the previous message. This is indicated in the display by coloring the timestamp as yellow and adding an overlay that shows the time difference:

(It looks like I broke this functionality in v0.10.0, it'll be fixed in the upcoming release)
In the SQL interface, you can access the original timestamp from the log file using the log_actual_time column.
Did something give you the impression that you needed to concat the log files together? Collating multiple log files together by their timestamp is usually one of the highlighted features in lnav's documentation.
Did something give you the impression that you needed to concat the log files together? Collating multiple log files together by their timestamp is usually one of the highlighted features in lnav's documentation.
i use log analyzers like goaccess a lot. they don't care about out of order and it makes my life easier in some cases just to concat all the server files into one. to be fair goaccess also accepts multiple input files so i am now stopping with this practice, it just adds confusion in the end.
Can this bug be closed now? Is there anything else that needs to be addressed?
are the delayed/out of order messages also the reason for the differences between :summarize and grep?