racket-http-easy

%2A is force decoded to *

Open cloudrac3r opened this issue 2 years ago • 7 comments

A particularly picky website requires me to send a GET request with %2A in the URL. However, the http-easy internals force this to be decoded to * via the (->url urlish) and (url-path&query u params*) round-trip. The website does not accept * and responds with a 301 redirect in which * is replaced with %2A. http-easy and the website are then stuck in a loop: http-easy changes %2A to *, and the website asks to change it back to %2A. The loop only ends because of #:max-redirects.
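The effect can be reproduced without http-easy. This is only a minimal sketch: it assumes that a plain string->url / url->string round-trip from net/url approximates what the ->url and url-path&query helpers do, and the example.com URL and its q parameter are made up for illustration.

```racket
#lang racket/base
(require net/url)

;; Hypothetical URL; only the %2A in the query string matters here.
(define original "https://example.com/search?q=%2A")

;; Parsing decodes percent-escapes in the query into plain strings.
(define parsed (string->url original))
(define query (url-query parsed)) ; association list of decoded values

;; Serializing re-encodes from the decoded values, so the original
;; spelling of the escape is not preserved.
(define round-tripped (url->string parsed))

(printf "query:         ~s\n" query)
(printf "original:      ~a\n" original)
(printf "round-tripped: ~a\n" round-tripped)
```

Comparing the last two printed lines shows whether the escape survived the round-trip.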

I think that if no #:params are required then http-easy should keep the provided URL without doing the round-trip conversion, to make sure each byte stays exactly the same as provided.

Since this is such an edge-case, if it is difficult for you to implement then I would be happy with suggestions for a workaround. I tried looking through your code to see if I could come up with a PR, but I figured you would know your own utility functions better than I do.

cloudrac3r, Mar 20 '23 09:03

Does it work if you encode the %? E.g., example.com?param=%252A

Bogdanp, Mar 28 '23 09:03

Yeah, that actually does work, haha! I will see if I can write this workaround into my code

cloudrac3r, Mar 28 '23 10:03

Digging in a little bit, this appears to be a bug in net/uri-codec (unless I'm missing something):

> (require net/uri-codec)
> (alist->form-urlencoded '((param . "*")))
"param=*"
> (alist->form-urlencoded '((param . "%")))
"param=%25"

RFC 3986 states that * is a reserved char, so it should be encoded.

Bogdanp, Mar 28 '23 10:03

* is not in the application/x-www-form-urlencoded percent-encode set, so according to the spec it doesn't need to be encoded in query strings. Of course, server behaviour may differ. https://url.spec.whatwg.org/#application-x-www-form-urlencoded-percent-encode-set

cloudrac3r, Mar 28 '23 10:03

Yes, you're right. I was confused because I was comparing alist->form-urlencoded to Python's urlencode, which does encode *. Re. the overall issue, probably it would be better not to round-trip redirect URLs, as you suggest, but that seems like it'll be a bit painful. I'll look into it more later this week.

Bogdanp, Mar 28 '23 11:03

If I can get %252A working in my real code, there's probably no need for you to work on this - this is such an edge case of an edge case!

cloudrac3r, Mar 28 '23 11:03

A related issue is that %20 is also force-decoded to + when it appears after ? in the URL (i.e. it is part of the query parameters). While the URL spec says you're supposed to encode with +, there's a shocking number of websites out there that rely on %20 instead.
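The two spellings of a space can be compared directly with net/uri-codec. This is a rough sketch that uses the codec functions on their own rather than going through http-easy, so it illustrates the encoding difference only, not the library's exact code path.

```racket
#lang racket/base
(require net/uri-codec)

;; Form-style encoding, the style used for query parameters: space becomes +.
(define form-style (form-urlencoded-encode "a b"))

;; Generic URI encoding: space becomes %20.
(define uri-style (uri-encode "a b"))

;; Both spellings decode to the same text under form decoding, which is why
;; a parse-then-serialize round-trip cannot tell which one was sent originally.
(define from-plus (form-urlencoded-decode "a+b"))
(define from-percent (form-urlencoded-decode "a%20b"))

(printf "form-style: ~a\n" form-style)
(printf "uri-style:  ~a\n" uri-style)
(printf "decoded:    ~s ~s\n" from-plus from-percent)
```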

The %2520 trick doesn't work in this case, because it remains at %2520. I cannot find a workaround that would allow me to send %20 directly.

cloudrac3r, May 20 '23 13:05