apache-log-parser
ValueError on ill-formed URLs
v1.7.0 (from pip, though apparently not on GitHub yet!) aborts with
ValueError: invalid literal for int() with base 10: '630:212:8::80:10'
on line 60, in extra_request_from_first_line, if it encounters a URL with an ill-formed IPv6 literal (one not enclosed in square brackets), e.g.
"GET http://2001:630:212:8::80:10/authentication/login/ HTTP/1.1"
because it tries to parse everything after the first colon of the address as a port number.
While this request is obviously bogus and doesn't work (the server returned status 400), it's easy enough for anyone to cause such lines to appear in access logs.
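For reference, the failure can be reproduced outside the parser. This is a minimal sketch that assumes the released package reads the port through the standard library's URL parsing; it uses Python 3's `urllib.parse`, and `try_port` is a hypothetical helper, not part of apache-log-parser. The exact error text varies between Python versions, but the exception is a `ValueError` in each case.

```python
from urllib.parse import urlsplit


def try_port(url):
    """Return (port, None) on success, or (None, error message) on failure.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)  # splitting itself succeeds for this URL
    try:
        # Accessing .port is what raises: without brackets, everything
        # after the first ':' in the host part is treated as the port.
        return parts.port, None
    except ValueError as exc:
        return None, str(exc)


port, error = try_port("http://2001:630:212:8::80:10/authentication/login/")
print(port, error)
```

A well-formed URL such as `http://example.com:8080/` returns the integer port and no error.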
I just ran into this too. I don't see the urlparse call in the GitHub source; otherwise I'd submit a PR.
Locally I just caught the ValueError and did not annotate the parsed record with the extra URL metadata.
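Roughly, the local workaround looks like the sketch below. It is not the library's code: `annotate_with_url` and the `url_*` key names are made up for illustration, and it assumes Python 3's `urllib.parse` rather than whatever the released package calls internally.

```python
from urllib.parse import urlsplit


def annotate_with_url(record, url):
    """Return a copy of record with URL-derived fields added.

    If the URL is malformed, the record is returned without the extra
    fields. Hypothetical helper; the key names are assumptions.
    """
    annotated = dict(record)
    try:
        parts = urlsplit(url)
        extra = {
            "url_scheme": parts.scheme,
            "url_host": parts.hostname,
            "url_port": parts.port,  # raises ValueError for a bad port
            "url_path": parts.path,
        }
    except ValueError:
        # Ill-formed URL (e.g. unbracketed IPv6 literal): skip the metadata.
        return annotated
    annotated.update(extra)
    return annotated


bad = annotate_with_url(
    {"status": "400"},
    "http://2001:630:212:8::80:10/authentication/login/",
)
print(bad)
```

The bogus request still ends up in the parsed output with its status and other fields; it just lacks the URL breakdown.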