WiRL icon indicating copy to clipboard operation
WiRL copied to clipboard

Client QueryParam encod error

Open Giottino69 opened this issue 2 years ago • 1 comments

In the Helloworld Echo demo by entering the string "%0AThis is a test!" the encode is not working correctly.

Giottino69 avatar May 03 '22 14:05 Giottino69

I've studied the code and it seems there's quite a bit of fuss on this argument.

I've tried to URLEncode this string with many converters, getting quite different results.

Original string (this is ASCII from 0x20 to 0x7E, with only the first and last alphanumeric values for brevity): !"#$%&'()*+,-./09:;<=>?@AZ[\]^_`az{|}~

This is the output from urlencoder.org and urlencoder.io (same output): %20%21%22%23%24%25%26%27%28%29%2A%2B%2C-.%2F09%3A%3B%3C%3D%3E%3F%40AZ%5B%5C%5D%5E_%60az%7B%7C%7D~ This is the most "complete" encoding I've seen so far, only a few characters are left unencoded (maybe it's even too much).

This is the output from JavaScript's encodeURIComponent() function: %20!%22%23%24%25%26'()*%2B%2C-.%2F09%3A%3B%3C%3D%3E%3F%40AZ%5B%5C%5D%5E_%60az%7B%7C%7D~ This function, as per the docs, encodes characters including those that are part of the URI syntax, as opposed to encodeURI() which doesn't. This set whould be the right choice for encoding QueryParams. BTW, this function also encodes the colon character, which in WiRL 4.5 is used by IsAbsoluteUrl to identify an absolute URL (although it seems an oversimplified approach...).

This is the output from WiRL 4.5's TWiRLURL.URLEncode function: %20!"#$%25%26'()*+,-./09:;<=>?@AZ[\]^_`az{|}~ As you can see, only the space, the % and the & characters are encoded. In particular, the + sign will be interpreted as a space from the server (see this doc), so any parameter containing that character will be broken once received (this was ok in WiRL 4.0).

I've modded the TWiRLURL.URLEncode adding TURLEncoding.TEncodeOption.SpacesAsPlus to TNetEncoding.URL.Encode getting this result: +!"#$%25%26'()*%2B,-./09:;<=>?@AZ[\]^_`az{|}~ Now spaces are converted in + while + are encoded. This fixes the issue with parameters containing the + character, but personally I don't like this approach, it can lead to confusion.

A better approach could be adding the + to the UnsafeChars in TNetEncoding.URL.Encode, this results in: %20!"#$%25%26'()*%2B,-./09:;<=>?@AZ[\]^_`az{|}~ which is better IMO (now both space and + are encoded), but there are still a lot of characters formally not allowed as part of an URL component, which can lead to compatibility problems with the REST of the world (pun intended 😄).

And finally, this was how WiRL 4.0 worked: %20!%22%23$%25%26'()%2A%2B,-./09:;%3C=%3E?@AZ%5B%5C%5D%5E_%60az%7B%7C%7D~ This string is between JS' encodeURI and encodeURIComponent, it's better than the current (WiRL 4.5) output, but still it doesn't encode some of characters that may be part of the URI syntax (JS docs) and shouldn't be present in QueryParams, like the :.

Since the encoding of URL parameters is fundamental for correct interpretation by web/REST servers, the TWiRLURL.URLEncode function should consider all these aspects and be made as compatible as possible.

fperana avatar Mar 22 '24 11:03 fperana

Regarding @Giottino69 issue I think it's fixed now (as version 4.5), regarding @fperana comment is worth a discussion but I don't think it's related to @Giottino69 issue, so I will close the issue inviting a discussion in the Discussions section.

Thanks, Paolo

paolo-rossi avatar May 24 '24 09:05 paolo-rossi