HTTP.jl

Add function for encoding with `application/x-www-form-urlencoded` and use it internally

Open iamed2 opened this issue 1 year ago • 2 comments
The HTML spec (at least versions 4 and 5, which I have linked below) requires that spaces be encoded as `+` in `application/x-www-form-urlencoded`. Python has a function for this, `urllib.parse.quote_plus`. HTTP.jl should probably have one too, and use it when encoding `application/x-www-form-urlencoded` bodies.

HTML 4: https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1

HTML 5: https://url.spec.whatwg.org/#urlencoded-serializing (the "true" argument is spaceAsPlus=true)
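A minimal sketch of what such a function could look like, assuming the byte rules of the WHATWG serializer linked above (ASCII letters, digits, `*`, `-`, `.` and `_` pass through, a space becomes `+`, everything else is percent-encoded). The names `is_form_safe`, `form_escape` and `form_urlencode` are hypothetical and are not part of HTTP.jl or URIs.jl.

```julia
# Hypothetical helpers for illustration; these names are not part of HTTP.jl or URIs.jl.

# Bytes left unescaped by the WHATWG urlencoded serializer:
# ASCII letters and digits, plus '*', '-', '.', and '_'.
function is_form_safe(b::UInt8)
    return (UInt8('a') <= b <= UInt8('z')) ||
           (UInt8('A') <= b <= UInt8('Z')) ||
           (UInt8('0') <= b <= UInt8('9')) ||
           b in (UInt8('*'), UInt8('-'), UInt8('.'), UInt8('_'))
end

# Encode one key or value: space becomes '+', other unsafe bytes become %XX.
function form_escape(s::AbstractString)
    io = IOBuffer()
    for b in codeunits(String(s))
        if b == UInt8(' ')
            write(io, UInt8('+'))
        elseif is_form_safe(b)
            write(io, b)
        else
            print(io, '%', uppercase(string(b, base = 16, pad = 2)))
        end
    end
    return String(take!(io))
end

# Join key/value pairs as key=value&key=value, in iteration order.
function form_urlencode(params)
    return join((form_escape(string(k)) * "=" * form_escape(string(v)) for (k, v) in params), "&")
end

params = ["p1" => "something with a space", "p2" => "yes no"]
println(form_urlencode(params))
```

A `Vector` of pairs is used here so the output order is predictable; a `Dict` would also work, but its iteration order is not guaranteed.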

iamed2 avatar Dec 21 '23 17:12 iamed2

We currently have:

# application/x-www-form-urlencoded
    return write(stream, URIs.escapeuri(body))

Are you aware of that? Or is the issue that it does not quite conform to the spec for that content type?

quinnj avatar Dec 24 '23 04:12 quinnj

Yes, this is actually incorrect. If you have:

post_params = Dict("p1" => "something with a space", "p2" => "yes no")

Then,

URIs.escapeuri(post_params)

will give you

"p2=yes%20no&p1=something%20with%20a%20space"

Which is incorrect. It should actually be:

"p2=yes+no&p1=something+with+a+space"

The result then depends entirely on how resilient and forgiving the receiving server is toward your incorrect encoding. You don't want to rely on that.

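Since `escapeuri` percent-encodes literal `+` and `%` characters, every `%20` in its output comes from a space, so a more correct solution would be, at a minimum: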

replace(URIs.escapeuri(post_params), "%20" => "+")

This is the same problem in reverse that I pointed out in #1118: you cannot reuse the exact same query-string encoding/decoding logic for `application/x-www-form-urlencoded` and expect it to work. The two formats are different, and conflating them leads to silent errors and hard-to-debug problems.
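To illustrate the decoding direction, here is a rough sketch of a form-specific unescape step, assuming the only difference from plain percent-decoding is that `+` is turned back into a space. The names `hexval` and `form_unescape` are hypothetical and are not part of HTTP.jl or URIs.jl; malformed `%` sequences are passed through unchanged, which is a choice made for this sketch.

```julia
# Hypothetical helpers for illustration; these names are not part of HTTP.jl or URIs.jl.

# Value of one ASCII hex digit, or `nothing` if the byte is not a hex digit.
function hexval(b::UInt8)
    UInt8('0') <= b <= UInt8('9') && return b - UInt8('0')
    UInt8('a') <= b <= UInt8('f') && return b - UInt8('a') + 0x0a
    UInt8('A') <= b <= UInt8('F') && return b - UInt8('A') + 0x0a
    return nothing
end

# Decode one key or value: '+' becomes a space, %XX becomes the byte 0xXX.
function form_unescape(s::AbstractString)
    bytes = collect(codeunits(String(s)))
    out = UInt8[]
    i = 1
    n = length(bytes)
    while i <= n
        b = bytes[i]
        if b == UInt8('+')
            push!(out, UInt8(' '))
            i += 1
        elseif b == UInt8('%') && i + 2 <= n
            hi = hexval(bytes[i + 1])
            lo = hexval(bytes[i + 2])
            if hi !== nothing && lo !== nothing
                push!(out, (hi << 4) | lo)
                i += 3
            else
                # Malformed escape: keep the '%' literally.
                push!(out, b)
                i += 1
            end
        else
            push!(out, b)
            i += 1
        end
    end
    return String(out)
end

println(form_unescape("something+with+a+space"))
println(form_unescape("a%2Bb"))
```

A plain query-string decoder that leaves `+` untouched would return `something+with+a+space` for the first input, which is the kind of silent mismatch described above.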

nguiard avatar Mar 31 '24 09:03 nguiard