Token doesn't match specifier '%h'
@miraclect This should work. However, please note that goaccess requires a valid IPv4/6 address, so the second request will be considered invalid, as there's a dash in the `%h` field.
goaccess access.log --log-format='%h %^ %^ %e [%d:%t %^] "%r" %s %b "%R" "%u" %D' --date-format=%d/%b/%Y --time-format=%T
Originally posted by @allinurl in https://github.com/allinurl/goaccess/issues/1273#issuecomment-431449237
I'm unsure about your question's specifics. Please don't hesitate to provide additional information as required.
Can I hijack please? Thanks! ;-)
I've tried goaccess and got various nondescript and, most importantly, non-testable errors, usually like:
==362690== Token '2024-02-17_09:06:32.221+0100' doesn't match specifier '%x'
or even less informative, like:
==360445== IPv4/6 is required.
==360445==
==360445== Format Errors - Verify your log/date/time format
It seems debugging a log-format is almost impossible, or the documentation is not detailed enough. (Granted, my logs are not in the standard common log format.)
Suggestion: is there any way to test format specifiers against patterns?
Suggestion: add examples for patterns to the docs. (Most of the strftime patterns are obvious, but how, for example, would you parse this datetime: 2024-02-17_09:06:32.221+0100? I see it doesn't match "specifier %x", which is --datetime-format '%F_%T.%^%z' or its various mutations with %f or similar, but I probably have to generate a sample log to test, or go to the source to see why it fails.)
The above problem seems to be that the user doesn't even know which line was not matched, let alone why, or where the parser has failed.
Thanks for the mic. :grin:
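One way to narrow down an `IPv4/6 is required.` error outside goaccess is to scan the log and list the line numbers whose host field is not a valid address. The following is only a minimal Python sketch. The function name `find_invalid_hosts`, the `host_index` parameter, and the two sample lines are made up for illustration. Python's `ipaddress` module may also accept or reject edge cases differently from goaccess's own check.

```python
import ipaddress


def find_invalid_hosts(lines, host_index=0):
    """Return (line_number, token) pairs whose host field is not a valid IPv4/6 address.

    host_index is the whitespace-separated position of the %h field. This is an
    assumption about the log layout, not something goaccess reports.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        token = fields[host_index] if host_index < len(fields) else ""
        try:
            ipaddress.ip_address(token)
        except ValueError:
            # Not parseable as IPv4 or IPv6 (for example a lone dash).
            bad.append((number, token))
    return bad


# Hypothetical sample lines: the second one has a dash where the host should be.
sample = [
    '192.0.2.10 - - [17/Feb/2024:09:06:32 +0100] "GET / HTTP/1.1" 200 512',
    '- - - [17/Feb/2024:09:06:33 +0100] "GET / HTTP/1.1" 200 512',
]
print(find_invalid_hosts(sample))
```

For a log that starts with a virtual host column, `host_index=1` would point at the address instead.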
@grinapo Please share a few lines from your log so I can review the format.
@allinurl I have tried not to be completely off-topic so my suggestion was to have more meaningful error messages, which would have helped the original issue as well.
But, since you've asked:
hg.grin.hu 158.220.119.92 - - 2024-02-17_09:07:15.790+0100 "GET /poweradmin/file/2f8c29fc5e2e/.hgignore HTTP/1.1" 200 1509 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)" ZdBpMw8kjpqodfnI_5KcOQAAABM
The problem is the milliseconds part, which is not in standard strftime, and the "junk" specifier (`%^`) seems to match only whole words. Preprocessing the log with awk is pretty inconvenient for various reasons, apart from being slowed down by multiple buffers.
@grinapo I was attempting to locate a similar question, but unfortunately, I couldn't recall the solution. Regardless, it's worth mentioning that strftime doesn't support milliseconds in the C implementation.
Please don't hesitate to suggest how the errors should be handled, or even better, feel free to submit a PR with any changes you believe might improve the experience for users. I'd be more than happy to review them.
The following should do it:
# goaccess access.log --log-format='%v %h %^ %^ %d_%t.%^ "%r" %s %b "%R" "%u" %^' --date-format=%Y-%m-%d --time-format=%T
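To see how that format lines up with the sample line, here is a rough Python approximation of the field split. It is a sketch, not goaccess's actual parser: the regular expression, the group names, and the helper `split_fields` are assumptions made for illustration. The idea it mirrors is that `%d` stops at the `_`, `%t` stops at the `.`, and the following `%^` swallows the milliseconds and timezone up to the next space.

```python
import re
from datetime import datetime

# Sample line as posted in this thread.
SAMPLE = (
    'hg.grin.hu 158.220.119.92 - - 2024-02-17_09:07:15.790+0100 '
    '"GET /poweradmin/file/2f8c29fc5e2e/.hgignore HTTP/1.1" 200 1509 "-" '
    '"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)" '
    'ZdBpMw8kjpqodfnI_5KcOQAAABM'
)

# Each piece approximates one specifier of the --log-format string.
APPROX = re.compile(
    r'(?P<vhost>\S+) '         # %v
    r'(?P<host>\S+) '          # %h
    r'\S+ \S+ '                # %^ %^
    r'(?P<date>[^_]+)_'        # %d up to the underscore
    r'(?P<time>[^.]+)\.'       # %t up to the dot
    r'\S+ '                    # %^ (milliseconds and timezone)
    r'"(?P<request>[^"]*)" '   # "%r"
    r'(?P<status>\d{3}) '      # %s
    r'(?P<bytes>\S+) '         # %b
    r'"(?P<referrer>[^"]*)" '  # "%R"
    r'"(?P<agent>[^"]*)" '     # "%u"
    r'\S+'                     # %^ (trailing request id)
)


def split_fields(line):
    """Return a dict of fields, or None when the line does not fit the pattern."""
    match = APPROX.fullmatch(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Mirrors --date-format=%Y-%m-%d and --time-format=%T (%T is %H:%M:%S).
    fields["timestamp"] = datetime.strptime(
        fields["date"] + " " + fields["time"], "%Y-%m-%d %H:%M:%S"
    )
    return fields


fields = split_fields(SAMPLE)
print(fields["host"], fields["date"], fields["time"], fields["status"])
```

A line that returns `None` here is a candidate for one that goaccess would also reject, although the two parsers are not guaranteed to agree.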
Closing this. Feel free to reopen it if needed.