
"generatepath lb" slices timestamps with colon

Open mapsam opened this issue 8 years ago • 1 comments

When running generatepath lb on a URL that contains a timestamp with colons, everything from the second-to-last colon onward is removed from the URL, so the returned path is truncated.

Example URL input (from within a CloudFront log file)

GET http://api.example.com:80/v4/mapsam.semicolon-test/2/3/1.vector.pbf?access_token=pk.token&style=mapbox://styles/mapsam/1234@2017-05-31T22:05:32.562Z

Results in the following output

/v4/mapsam.semicolon-test/2/3/1.vector.pbf?access_token=pk.token&style=mapbox://styles/mapsam/1234@2017-05-31T22

It appears to be related only to timestamps. The following URL, which contains several colons but no timestamp, works just fine:

Input

GET http://api.example.com:80/v1/thing/my.id?access_token=hello:world&another=thing:with:a:semicolon/and/some//slashes:waka

Output

/v1/thing/my.id?access_token=hello:world&another=thing:with:a:semicolon/and/some//slashes:waka

But adding just a timestamp results in it getting cut off:

Input

GET http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z

Output

/v1/thing/my.id?time=2017-05-31T22

cc @jakepruitt @springmeyer

mapsam, Aug 29 '17 20:08

Narrowed it down to the regex passed to path.split(). The following example log line:

http - - - - - - - - - - - "GET http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z HTTP/1.1" "Amazon CloudFront" - - - "root"

results in the following path

# path before split 
http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z

# path after split 
/v1/thing/my.id?time=2017-05-31T22
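
For illustration, here is a minimal sketch of how a split like this could produce the truncation. The pattern `ASSUMED_PORT_SPLIT` is an assumption chosen because it reproduces the outputs shown in this issue; it is not copied from the aws-log-replay source. The helper names `pathViaSplit` and `pathViaUrl` are hypothetical too. A pattern meant to match a port such as `:80` (a colon followed by digits) also matches `:05` and `:32` inside a timestamp, but not `:world` or `:waka`, which would explain why only timestamps are affected. The second helper shows one possible alternative that uses the standard `URL` class instead of a regex.

```javascript
// ASSUMPTION: a port-matching split pattern that reproduces the reported output.
// It is not confirmed to be the regex used by aws-log-replay.
const ASSUMED_PORT_SPLIT = /:[0-9]\w+/g;

// Hypothetical helper: split on the assumed pattern and keep the second piece.
// Any later ":<digit><word chars>" sequence also splits the string,
// so the path is cut off at the first such match after the port.
function pathViaSplit(requestUrl) {
  return requestUrl.split(ASSUMED_PORT_SPLIT)[1];
}

// Hypothetical alternative: let the WHATWG URL parser separate the
// scheme, host, and port from the path and query string.
function pathViaUrl(requestUrl) {
  const parsed = new URL(requestUrl);
  return parsed.pathname + parsed.search;
}

const timestampUrl =
  'http://api.example.com:80/v1/thing/my.id?time=2017-05-31T22:05:32.562Z';
const colonUrl =
  'http://api.example.com:80/v1/thing/my.id?access_token=hello:world&another=thing:with:a:semicolon/and/some//slashes:waka';

console.log(pathViaSplit(timestampUrl)); // truncated at "T22"
console.log(pathViaSplit(colonUrl));     // intact, no ":<digit>" after the port
console.log(pathViaUrl(timestampUrl));   // full path and query preserved
```

Note that `URL` normalizes its input (for example, it resolves `.` and `..` path segments and drops any `#fragment`), so the result may not be byte-for-byte identical to the logged path in every case.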

mapsam, Aug 29 '17 21:08