
build_path doesn't create valid urls when a scheme is received

Open andrewstarks opened this issue 3 years ago • 2 comments

I'm not 100% sure this is a bug, but consider:


local url = require "socket.url"
local url_path = "http://www.example.com/cgilua/index.lua?a=2#there"
local url_parsed_path = url.parse_path(url_path)
-- Rebuild the path from its parsed segments (unsafe = true skips escaping)
local rebuilt = url.unescape(url.build_path(url_parsed_path, true))
-- I expect this to be true but it isn't...
print(rebuilt == url_path)
--> false
print(rebuilt)
--> http:/www.example.com/cgilua/index.lua?a=2#there

Note that the scheme is followed by :/ and not ://. When splitting the URL into an array of parts, parse_path keeps http: as the first segment and drops the empty segment between the two slashes, so build_path joins the parts back with a single slash. Given that a path can't have : in it (is that correct?), can't it be assumed that two slashes (//) should follow?

andrewstarks, Dec 17 '20 20:12

See RFC 3986. Colons are legal in URI paths, except in the first path segment if the path is relative. The resulting URL is definitely wrong if it only has one slash after the colon, though. Also, just offhand, I'd probably expect the scheme's colon to be removed when parsing and then replaced when rebuilding. So perhaps a bug exists in parse_path() and/or build_path().
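To make the RFC 3986 rule concrete, here is a minimal sketch in plain Lua. The helper leading_scheme is hypothetical and not part of socket.url; it only applies the scheme grammar from RFC 3986 (a letter followed by letters, digits, +, - or ., then a colon) to the start of a string. It shows why a colon in the first segment of a relative path is ambiguous, and why the RFC requires writing such a path with a leading ./ instead.

```lua
-- Hypothetical helper for illustration; socket.url has no such function.
-- RFC 3986 scheme grammar: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
local function leading_scheme(s)
  -- Returns the scheme name if s starts with "scheme:", otherwise nil.
  return s:match("^(%a[%w+.%-]*):")
end

print(leading_scheme("http://www.example.com/"))  --> http
print(leading_scheme("/a/b:c"))                   --> nil
print(leading_scheme("b:c/d"))                    --> b
print(leading_scheme("./b:c/d"))                  --> nil
```

The third call is the problem case: a relative path whose first segment contains a colon looks exactly like a scheme, so a segment list alone cannot tell the two apart.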

ewestbrook, Dec 17 '20 22:12

Hmm... can build_path actually know that a colon in the first segment marks a scheme, or could it be an edge case where the first segment of a relative path just happens to contain a colon?

I think you're correct about parse_path. Perhaps it needs to be augmented to carry that info? Maybe a flag, such as absolute = true/false, or a scheme field? I'm not sure how to fix this, but I think that being able to round trip between parse_path and build_path would be good. I'll have to think on it...
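One way the idea of carrying extra info could look, as a rough sketch only: the functions parse_path_ex and build_path_ex and the fields scheme and has_authority are all hypothetical names, not anything socket.url provides. The sketch is plain Lua, does no escaping, and, like a simple segment split, drops empty segments and trailing slashes.

```lua
-- Hypothetical helpers for illustration; not part of socket.url.

-- Split a URL-like string into segments, recording what a bare segment
-- list loses: the scheme, the "//" marker, and a leading slash.
local function parse_path_ex(s)
  local parsed = {}
  -- Assumption: anything matching the RFC 3986 scheme grammar is a scheme.
  local scheme, rest = s:match("^(%a[%w+.%-]*):(.*)$")
  if scheme then
    parsed.scheme = scheme
    s = rest
  end
  if s:sub(1, 2) == "//" then
    parsed.has_authority = true
    s = s:sub(3)
  elseif s:sub(1, 1) == "/" then
    parsed.is_absolute = true
  end
  for segment in s:gmatch("[^/]+") do
    parsed[#parsed + 1] = segment
  end
  return parsed
end

-- Rebuild the string, restoring the recorded prefix pieces.
local function build_path_ex(parsed)
  local out = table.concat(parsed, "/")
  if parsed.has_authority then
    out = "//" .. out
  elseif parsed.is_absolute then
    out = "/" .. out
  end
  if parsed.scheme then
    out = parsed.scheme .. ":" .. out
  end
  return out
end

local url_path = "http://www.example.com/cgilua/index.lua?a=2#there"
local parsed = parse_path_ex(url_path)
assert(parsed.scheme == "http" and parsed.has_authority)
assert(build_path_ex(parsed) == url_path)
```

With the scheme and the // marker stored as fields rather than as a segment, the builder no longer has to guess whether a colon in the first segment means a scheme.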

andrewstarks, Dec 18 '20 21:12