Bug in latest 7.6 with NCSA log format: URL strings get truncated after the first blank!
There is a problem with NCSA LogFormat 4 in combination with URLs that contain blanks. The URL string gets truncated after the first blank, although it is enclosed in quotes!
Example:
LogFormat=4 (#LogFormat = "%host %other %logname %time1 %methodurl %code %bytesd")
172.30.22.5 - tom.smith [03/Jan/2016:10:39:06 +0100] "GET /Download/__Omnia__Behandlung elektronischer Geschäftsstücke__Ergänzung 2016.pdf HTTP/1.1" 200 96063
It is tracked, truncated, under "Page-URL" as: /Download/__Omnia__Behandlung
instead of being tracked correctly as a file under "Downloads":
/Download/__Omnia__Behandlung elektronischer Geschäftsstücke__Ergänzung 2016.pdf
So the statistics for Page-URL and Downloads are counted wrong for every URL that contains a blank!
In the meantime I found the reason and a solution:
awstats.pl Line 9025:
$PerlParsingFormat = "([^ ]+) [^ ]+ (.+) \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) ([^ ]+)(?: [^\\\"]+|)\\\" ([\\d|-]+) ([\\d|-]+)";
has to be changed to:
$PerlParsingFormat = "([^ ]+) [^ ]+ (.+) \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) (.+) [^\\\"]+\\\" ([\\d|-]+) ([\\d|-]+)";
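The effect of the change can be checked against the sample line above. The following is only a sketch, written in Python rather than Perl, with the Perl string escaping removed (`\\[` becomes `\[`, `\\\"` becomes `"`). The names `OLD`, `NEW`, `LINE`, `NO_PROTOCOL` and `url_of` are made up for this illustration and do not appear in awstats.pl. In both patterns the URL is capture group 5.

```python
import re

# Pattern as shipped (line 9025), with the Perl string escaping removed.
OLD = (r'([^ ]+) [^ ]+ (.+) \[([^ ]+) [^ ]+\] '
       r'"([^ ]+) ([^ ]+)(?: [^"]+|)" ([\d|-]+) ([\d|-]+)')

# Proposed pattern: the URL group is greedy and may contain blanks.
NEW = (r'([^ ]+) [^ ]+ (.+) \[([^ ]+) [^ ]+\] '
       r'"([^ ]+) (.+) [^"]+" ([\d|-]+) ([\d|-]+)')

# Sample line from this report.
LINE = ('172.30.22.5 - tom.smith [03/Jan/2016:10:39:06 +0100] '
        '"GET /Download/__Omnia__Behandlung elektronischer '
        'Geschäftsstücke__Ergänzung 2016.pdf HTTP/1.1" 200 96063')

# Hypothetical line whose request has no protocol token after the URL.
NO_PROTOCOL = '1.2.3.4 - - [03/Jan/2016:10:39:06 +0100] "GET /index.html" 200 5'


def url_of(pattern, line):
    """Return capture group 5 (the URL), or None if the line does not match."""
    m = re.match(pattern, line)
    return m.group(5) if m else None


print(url_of(OLD, LINE))         # URL cut at the first blank
print(url_of(NEW, LINE))         # full URL including blanks
print(url_of(OLD, NO_PROTOCOL))  # /index.html
print(url_of(NEW, NO_PROTOCOL))  # None
```

With the old pattern the URL group stops at the first blank, and the rest of the request is swallowed by the optional non-capturing group. With the new pattern the greedy `(.+)` backtracks only as far as the last blank inside the quotes, so the full path is captured. One side effect visible in the last two calls: the new pattern requires something after the URL inside the quotes, so a request line without a protocol token no longer matches at all.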
It would be good to have this fixed in the next version!
@BMEIA It seems that some \ were lost when GitHub rendered the code. To be sure, can you send me your awstats.pl file after the change to [email protected]?
Sorry, that's true. I corrected it in my post above and also sent you the awstats.pl file by mail!