aws-log-replay
"generatepath lb" slices timestamps with colon
When running generatepath lb on a URL that contains several colons (:), everything after the second-to-last colon is removed from the URL, so the returned path is truncated.
Example URL input (from within a CloudFront log file)
GET http://api.example.com:80/v4/mapsam.semicolon-test/2/3/1.vector.pbf?access_token=pk.token&style=mapbox://styles/mapsam/1234@2017-05-31T22:05:32.562Z
Results in the following output
/v4/mapsam.semicolon-test/2/3/1.vector.pbf?access_token=pk.token&style=mapbox://styles/mapsam/1234@2017-05-31T22
It appears to be related only to timestamps. The following URL, which contains several colons, works just fine:
Input
GET http://api.example.com:80/v1/thing/my.id?access_token=hello:world&another=thing:with:a:semicolon/and/some//slashes:waka
Output
/v1/thing/my.id?access_token=hello:world&another=thing:with:a:semicolon/and/some//slashes:waka
But adding just a timestamp results in it getting cut off:
Input
GET http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z
Output
/v1/thing/my.id?time=2017-05-31T22
cc @jakepruitt @springmeyer
Narrowed it down to the regex used in this path.split() call. The following example log line:
http - - - - - - - - - - - "GET http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z HTTP/1.1" "Amazon CloudFront" - - - "root"
results in the following path
# path before split
http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z
# path after split
/v1/thing/my.id?time=2017-05-31T22
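To make the truncation easier to reason about, here is a minimal sketch that reproduces it. The actual regex is not quoted in this thread, so `ASSUMED_SPLIT` is a guess: it splits on any colon followed by a digit, which is consistent with all three input/output pairs above. The function names `pathViaSplit` and `pathViaUrl` are hypothetical and not part of aws-log-replay. The second function shows one possible alternative, using Node's built-in WHATWG `URL` class instead of a regex.

```javascript
// ASSUMPTION: stand-in for the regex passed to path.split(); the real one may differ.
// It treats every ":<digit><word chars>" run as a delimiter, not just the port.
const ASSUMED_SPLIT = /:[0-9]\w+/;

// Hypothetical helper mirroring the split-based approach.
function pathViaSplit(requestUrl) {
  // For the timestamp URL, ":80", ":05" and ":32" all match,
  // so element [1] ends right before ":05".
  return requestUrl.split(ASSUMED_SPLIT)[1];
}

// Hypothetical alternative: let the URL parser separate host/port from the rest.
function pathViaUrl(requestUrl) {
  const parsed = new URL(requestUrl);
  // pathname + search keeps the full query string, colons included.
  return parsed.pathname + parsed.search;
}

const input = 'http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z';
console.log(pathViaSplit(input)); // /v1/thing/my.id?time=2017-05-31T22
console.log(pathViaUrl(input));   // /v1/thing/my.id?time=2017-05-31T22:05:32.562Z
```

This would also explain why `hello:world` survives: a colon followed by a letter never matches the assumed pattern. Note that the `URL` parser may normalize some characters (for example, percent-encoding spaces), so it would need checking against real log lines before being swapped in.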