apache-log-parser

ValueError on ill-formed URLs

Open jw35 opened this issue 7 years ago • 1 comments

v1.7.0 (from pip, though apparently not on GitHub yet!) aborts with

ValueError: invalid literal for int() with base 10: '630:212:8::80:10'

on line 60, in extra_request_from_first_line, if it encounters a URL with an ill-formed IPv6 literal, e.g.

"GET http://2001:630:212:8::80:10/authentication/login/ HTTP/1.1"

because it tries to parse everything after the first colon-separated component of the address as a port number.
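A minimal sketch of the failure, assuming the URL is handed to the standard library's URL parser (the thread mentions a urlparse call but does not show it). It uses Python 3's `urllib.parse`; the helper name `describe_port` is made up for illustration, and the exact wording of the exception message differs between Python versions.

```python
from urllib.parse import urlsplit

def describe_port(url):
    """Return the parsed port, or the ValueError raised while reading it.

    Hypothetical helper for illustration; not part of apache-log-parser.
    """
    parts = urlsplit(url)  # splitting alone succeeds for this URL
    try:
        return parts.port  # the int conversion happens on attribute access
    except ValueError as exc:
        return exc

# Unbracketed IPv6 literal: the host is cut at the first ':' and the
# remainder is treated as the port, which is not a valid integer.
bad = "http://2001:630:212:8::80:10/authentication/login/"
bad_result = describe_port(bad)

# Bracketed form: the whole address is the host and no port is given.
good = "http://[2001:630:212:8::80:10]/authentication/login/"
good_result = describe_port(good)
```

Here `bad_result` is a `ValueError` instance, while `good_result` is `None` because the bracketed URL has no port.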

While this request is obviously bogus and doesn't work (the request returned status 400), it's easy enough for people to cause such lines to appear in access logs.

jw35 avatar Jun 22 '17 08:06 jw35

I just ran into this too. I don't see the urlparse call in the GitHub source -- otherwise I'd submit a PR.

Locally I just caught the ValueError and did not annotate the parsed record with the extra URL metadata.
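A workaround of that kind might look roughly like the sketch below. It is not the code from this thread or from the library: the function name `url_metadata_or_empty` and the dictionary keys are hypothetical, and it assumes Python 3's `urllib.parse`. The idea is only that a `ValueError` from URL parsing results in no extra URL metadata rather than an aborted parse.

```python
from urllib.parse import urlsplit

def url_metadata_or_empty(first_line):
    """Return URL details for a request line, or {} if they cannot be parsed.

    Hypothetical helper; the key names are illustrative, not the library's.
    """
    fields = first_line.split()
    if len(fields) != 3:
        return {}  # not a "METHOD URL PROTOCOL" line
    _method, url, _protocol = fields
    try:
        parts = urlsplit(url)
        return {
            "url_scheme": parts.scheme,
            "url_hostname": parts.hostname,
            "url_port": parts.port,  # may raise ValueError on ill-formed URLs
            "url_path": parts.path,
        }
    except ValueError:
        # Ill-formed URL: keep the record, skip the extra URL metadata.
        return {}

bogus = url_metadata_or_empty(
    "GET http://2001:630:212:8::80:10/authentication/login/ HTTP/1.1"
)
normal = url_metadata_or_empty("GET http://example.com:8080/x HTTP/1.1")
```

With this, `bogus` is an empty dict and `normal["url_port"]` is `8080`, so one bad request line no longer stops the rest of the log from being processed.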

chadaustin avatar Jun 03 '19 05:06 chadaustin