
CLI vs Docker difference. --datetime-format and --log-format and timezone

Open wphampton opened this issue 2 years ago • 1 comments

Thanks for this excellent tool -- I've almost got it just right.

I'm noticing a difference between using the installed version 1.7 vs the latest Docker image.

Using this dummy access_log

192.168.1.101 - - [16/Dec/2022:15:11:00 +0000] "GET /app1/ HTTP/1.1" 200 1024 "-" "-"
192.168.1.102 - - [16/Dec/2022:16:14:00 +0000] "GET /app2/ HTTP/1.1" 200 1024 "-" "-"
192.168.1.103 - - [16/Dec/2022:16:21:00 +0000] "GET /app1/ HTTP/1.1" 200 1024 "-" "-"

When I type this command, I get the proper HTML results I am expecting:

goaccess access_log -a -o report.html --log-format='%h %^ %e [%x] "%r" %s %b "%R" "%u"' --datetime-format='%d/%b/%Y:%H:%M:%S %z' --tz=America/New_York --no-ip-validation

However, when I type this command:

cat access_log | docker run --rm -i -e LANG=$LANG -e TZ="America/New_York" allinurl/goaccess:latest -a -o html --datetime-format='%d/%b/%Y:%H:%M:%S %z' --tz=America/New_York --no-ip-validation --log-format='%h %^ %e [%x] "%r" %s %b "%R" "%u"' - > report.html

I receive the following error:

 [PARSING -] {3} @ {0/s}
==1== GoAccess - version 1.7 - Jan  2 2023 19:06:03
==1== Config file: /etc/goaccess/goaccess.conf
==1== https://goaccess.io - <hello@goaccess.io>
==1== Released under the MIT License.
==1==
==1== FILE: -
==1== Parsed 3 lines producing the following errors:
==1==
==1== Token '16/Dec/2022:15:11:00 +0000' doesn't match specifier '%x'
==1== Token '16/Dec/2022:16:14:00 +0000' doesn't match specifier '%x'
==1== Token '16/Dec/2022:16:21:00 +0000' doesn't match specifier '%x'
==1==
==1== Format Errors - Verify your log/date/time format

I've not been able to resolve this and am wondering if there is a bug with the Docker version, or perhaps I am supplying options incorrectly. Thanks so much again!
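For reference, here is a minimal Python sketch of what the `--datetime-format` / `--tz` combination is expected to do with the three tokens from the error output. It is only an illustration, not GoAccess's code: the fixed `EST` offset is an assumption standing in for America/New_York in December, and `to_local` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Same pattern as the --datetime-format value in the commands above.
DATETIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Assumption: a fixed -05:00 offset stands in for America/New_York in December.
EST = timezone(timedelta(hours=-5), "EST")

tokens = [
    "16/Dec/2022:15:11:00 +0000",
    "16/Dec/2022:16:14:00 +0000",
    "16/Dec/2022:16:21:00 +0000",
]


def to_local(token):
    """Parse one log timestamp (including its offset) and shift it to EST."""
    parsed = datetime.strptime(token, DATETIME_FORMAT)
    return parsed.astimezone(EST)


for token in tokens:
    print(token, "->", to_local(token).isoformat())
```

If the `%z` part cannot be parsed, the whole token fails to match, which is what the `doesn't match specifier '%x'` lines report.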

Wes

wphampton avatar Jan 16 '23 19:01 wphampton

Sorry for the delay on my response.

It may be giving priority to the date-format and time-format variables in /etc/goaccess/goaccess.conf instead of --datetime-format. Could you please try passing --no-global-config, or removing those two variables from the config file, and see if that does it? Let me know. Thanks

allinurl avatar Jan 25 '23 16:01 allinurl

I'm sorry for only now writing back, but I'm back at trying to use goaccess through the Docker container. --no-global-config does not change the behavior. I am still getting the same error as above.

In this thread there is some conversation about locale and such. Any suggestion on what to try next to troubleshoot this on the goaccess docker, alpine-based image?

Using this format within Docker --log-format='%h %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" %^ \"%v\" %^ %Lms' --time-format='%H:%M:%S' --date-format='%d/%b/%Y' --tz=America/New_York does work and does not throw errors. It seems to work, but what do I sacrifice by using this method?
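One likely cost of the `%d:%t %^` approach is that the offset is skipped rather than parsed, so the timestamp carries no zone information. A small Python sketch of that difference, using one sample timestamp from the dummy log (an illustration only, not how GoAccess stores times):

```python
from datetime import datetime

# Parsed as one unit with %z: the result knows its UTC offset.
with_offset = datetime.strptime(
    "16/Dec/2022:15:11:00 +0000", "%d/%b/%Y:%H:%M:%S %z"
)

# Parsed as separate date and time with the offset ignored (like %d:%t %^).
date_part = datetime.strptime("16/Dec/2022", "%d/%b/%Y")
time_part = datetime.strptime("15:11:00", "%H:%M:%S")
without_offset = datetime.combine(date_part.date(), time_part.time())

print("with %z:   ", with_offset.isoformat())
print("without %z:", without_offset.isoformat())
```

The second value has no offset attached, so there is nothing to convert from when a different target time zone is requested.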

A likely separate issue, which I can open, is that goaccess seems to be attributing the log entries to the previous day instead of the current day. So, -1 day.

Thanks for this great project! -Wes

wphampton avatar Mar 15 '24 16:03 wphampton

As you mentioned, it appears to be a locale-related matter. Could you please check the locale of your machine, for example, by running locale -a? Your logs seem to display English dates, so you might want to consider setting the LC_TIME variable to "en_US.UTF-8".
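To illustrate why the locale can matter: the `%b` specifier matches month abbreviations taken from the active `LC_TIME` setting. The Python sketch below pins `LC_TIME` to the portable "C" locale and lists what `%b` accepts there. Python uses its own strptime implementation, so this only demonstrates the general dependency, not the behavior of the C library inside the image.

```python
import locale
import time

# The "C" locale is always available and uses English month abbreviations.
locale.setlocale(locale.LC_TIME, "C")

# Month abbreviations produced (and accepted) by %b under the active LC_TIME.
months = [
    time.strftime("%b", (2024, month, 1, 0, 0, 0, 0, 1, 0))
    for month in range(1, 13)
]
print(months)

# With English abbreviations active, the date part of the log token parses.
parsed = time.strptime("16/Dec/2022", "%d/%b/%Y")
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```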

allinurl avatar Mar 20 '24 22:03 allinurl

When I run locale -a on the allinurl/goaccess image I receive the error sh: locale: not found.

wphampton avatar Mar 21 '24 17:03 wphampton

Did you execute the locale command from within the container's shell? You can also try: echo $LANG. You can also try putting this in your Dockerfile:

ENV LANG en_US.UTF-8  
ENV LANGUAGE en_US:en  
ENV LC_ALL en_US.UTF-8  

or just LC_TIME

allinurl avatar Mar 21 '24 17:03 allinurl

also, please try within the container's shell:

$ date '+%d/%b/%Y:%H:%M:%S %z'

allinurl avatar Mar 21 '24 17:03 allinurl

Yep, from within the running container running the image allinurl/goaccess when I open a shell and type locale -a I get the error sh: locale: not found.

When instead I execute date '+%d/%b/%Y:%H:%M:%S %z' then I get back 21/Mar/2024:13:25:42 -0400

To my compose file setup I added

LANG: en_US.UTF-8
LANGUAGE: en_US.UTF-8
LC_TIME: en_US.UTF-8
LC_ALL: en_US.UTF-8

but I still get the same error:

 [SETTING UP STORAGE /var/log/access_logs/localtime.log] {0} @ {0/s}
==1== GoAccess - version 1.9.1 - Feb  6 2024 03:28:03
==1== Config file: No config file used
==1== https://goaccess.io - <hello@goaccess.io>
==1== Released under the MIT License.
==1==
==1== FILE: /var/log/access_logs/localtime.log
Cleaning up resources...
==1== Parsed 10 lines producing the following errors:
==1==
==1== Token '21/Mar/2024:12:10:43 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:43 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:43 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:43 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:44 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:44 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:44 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:44 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:44 -0400' doesn't match specifier '%x'
==1== Token '21/Mar/2024:12:10:44 -0400' doesn't match specifier '%x'
==1==
==1== Format Errors - Verify your log/date/time format

wphampton avatar Mar 21 '24 18:03 wphampton

Just give me a heads up if this provides any help, it seems to resemble the issue.

allinurl avatar Mar 21 '24 18:03 allinurl

Thanks, but if I'm using the docker image from this project, shouldn't it have the correct strptime/strftime version to support what is needed? I don't need to install or configure anything within the allinurl/goaccess docker image, right?

If I use web logs that are already in my local time then this works:

--log-format='%h %^ %^ [%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" %^ \"%v\" %^ %Lms'
--time-format='%H:%M:%S'
--date-format='%d/%b/%Y'

However, if I use logs which are in the default UTC time I cannot adjust for the timezone.

wphampton avatar Mar 21 '24 18:03 wphampton

I'm thinking perhaps the busybox OS which the allinurl/goaccess Docker image is based on has a more minimal version of date which can output the TZ offset %z but does not understand it as an input.

https://unix.stackexchange.com/a/494586/449962

wphampton avatar Mar 22 '24 17:03 wphampton

For example on RHEL 9:

date -d "Mar 22, 2024" +"%Y-%m-%d"
2024-03-22

But on busybox:musl on which the allinurl/goaccess Docker image is built:

date -d "Mar 22, 2024" +"%Y-%m-%d"
date: invalid date 'Mar 22, 2024'

wphampton avatar Mar 22 '24 17:03 wphampton

Thanks for sharing that. Yeah, it seems like musl doesn't support %z. I'll dig into this a bit more and let you know what I find.

allinurl avatar Mar 22 '24 23:03 allinurl

I think it might be a busybox thing. I tried the musl, glibc and uclibc image variants (verify my work of course) and they all responded like this:

# date -d "Mar 25, 2024" +"%Y-%m-%d"
date: invalid date 'Mar 25, 2024'

wphampton avatar Mar 25 '24 13:03 wphampton

Give the latest image a shot, I believe it should do the job. Let me know how it goes. Thanks.

allinurl avatar Mar 26 '24 02:03 allinurl

Nice work, I believe that did it! Just for the wrap up, I am reading a log file which is in UTC time generated by Traefik on Docker Swarm:

192.168.1.100 - - [21/Mar/2024:16:08:43 +0000] "GET /index.html HTTP/2.0" 200 1484 "-" "-" 829759 "some-docker-service-router@docker" "http://10.0.3.100:8080" 129ms

And this is my compose file entry:

    command: >
      /path/to/access_logs/utctime.log
      -o /path/to/output/index.html
      --real-time-html
      --origin=https://sub.example.com
      --port=7890
      --ws-url=wss://sub.example.com:443/www-stats-ws
      --log-format='%h %^ %^ [%x] \"%r\" %s %b \"%R\" \"%u\" %^ \"%v\" %^ %Lms'
      --datetime-format='%d/%b/%Y:%H:%M:%S %z'
      --date-spec=min
      --tz=America/New_York
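
As a rough cross-check of how the `--log-format` specifiers line up with the Traefik log line above, here is a Python regex sketch. The group names are hypothetical labels chosen for illustration (`%h` host, `%x` datetime, `%r` request, `%s` status, `%b` bytes, `%R` referrer, `%u` user agent, `%v` virtual host, `%L` time served in ms, `%^` ignored); GoAccess does not parse with this regex.

```python
import re

LINE = (
    '192.168.1.100 - - [21/Mar/2024:16:08:43 +0000] '
    '"GET /index.html HTTP/2.0" 200 1484 "-" "-" 829759 '
    '"some-docker-service-router@docker" "http://10.0.3.100:8080" 129ms'
)

# One regex piece per specifier in:
#   %h %^ %^ [%x] "%r" %s %b "%R" "%u" %^ "%v" %^ %Lms
PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ '                  # %h %^ %^
    r'\[(?P<datetime>[^\]]+)\] '               # [%x]
    r'"(?P<request>[^"]*)" '                   # "%r"
    r'(?P<status>\d{3}) (?P<bytes>\d+) '       # %s %b
    r'"(?P<referrer>[^"]*)" '                  # "%R"
    r'"(?P<user_agent>[^"]*)" '                # "%u"
    r'\S+ "(?P<vhost>[^"]*)" \S+ '             # %^ "%v" %^
    r'(?P<serve_ms>\d+)ms'                     # %Lms
)

match = PATTERN.match(LINE)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name:>10}: {value}")
```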

Thanks so much for this tool and this most recent fix! -Wes

wphampton avatar Mar 26 '24 14:03 wphampton

One item I noticed -- which I don't know if it is intended or not -- is that if I have a log formatted in my local time already (not UTC as shown above) and I keep the --tz=America/New_York I get some real weird results.

For example a log entry with a timestamp of 21/Mar/2024:12:39:50 -0400 gets rendered into GoAccess as 07/Apr/2024 and the time of 00:39. If I remove the --tz=America/New_York everything renders as expected.

So, all is well, but I thought I'd mention including the time zone could hurt if your logs are already in local time.

wphampton avatar Mar 26 '24 14:03 wphampton

Thanks for letting me know. I've made some changes to ensure that the offset is extracted and calculated correctly. Feel free to give it a try and let me know what you think.
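The idea of extracting the offset and applying it by hand, rather than relying on `%z` support in the C library, can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual GoAccess change: `parse_offset` and `to_target` are hypothetical helpers, and `EDT` is a fixed -04:00 offset standing in for America/New_York in late March 2024.

```python
from datetime import datetime, timedelta


def parse_offset(text):
    """Turn '+HHMM' or '-HHMM' into a timedelta."""
    sign = -1 if text[0] == "-" else 1
    hours, minutes = int(text[1:3]), int(text[3:5])
    return sign * timedelta(hours=hours, minutes=minutes)


def to_target(token, target_offset):
    """Convert a log timestamp to wall-clock time at target_offset."""
    stamp, offset_text = token.rsplit(" ", 1)
    naive = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
    utc = naive - parse_offset(offset_text)  # normalize to UTC first
    return utc + target_offset


# Assumption: fixed offset for New York daylight time.
EDT = timedelta(hours=-4)

print(to_target("21/Mar/2024:16:08:43 +0000", EDT))  # log written in UTC
print(to_target("21/Mar/2024:12:39:50 -0400", EDT))  # log already in local time
```

A timestamp that already carries -0400 should come out with the same wall-clock time, which matches the expectation for the 21/Mar/2024:12:39:50 -0400 entry reported above.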

allinurl avatar Mar 26 '24 22:03 allinurl

Perfect! Including the timezone via --tz=America/New_York when parsing a log with times already in local time no longer has an adverse effect. A huge thank you to you, sir!

wphampton avatar Mar 27 '24 13:03 wphampton

No, thank you for reporting this issue! Please feel free to reopen it if necessary.

allinurl avatar Mar 27 '24 13:03 allinurl