Russian symbols for strurlencode

Open NexiusTailer opened this issue 5 years ago • 4 comments

This is the fix that was mentioned in #18, with all the necessary improvements.

NexiusTailer avatar Mar 16 '20 23:03 NexiusTailer

0x255 is 597, I thought this was for an 8-bit character set?

Which character set is used by Russians on Windows? Windows-1251?
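
The arithmetic behind the first point can be checked with a minimal C sketch. The helper name `fits_in_byte` is hypothetical and is not taken from the patch or from strlib; it only illustrates that an 8-bit character set tops out at 0xFF (255), so 0x255 (597) lies outside it.

```c
/* Illustration only: fits_in_byte is a hypothetical helper, not strlib code. */

/* An 8-bit character set covers values 0x00..0xFF, i.e. 0..255. */
enum { EIGHT_BIT_MAX = 0xFF };

/* Returns 1 if v can be stored in a single 8-bit character, 0 otherwise. */
int fits_in_byte(unsigned int v)
{
    return v <= (unsigned int)EIGHT_BIT_MAX;
}
```

With this, `fits_in_byte(0xFF)` is true while `fits_in_byte(0x255)` is false, since 0x255 = 2·256 + 5·16 + 5 = 597.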

oscar-broman avatar Mar 19 '20 16:03 oscar-broman

It seems I've got it right now, and I've tested it.

> Which character set are used by Russians on Windows? Windows-1251?

Yes, that was my mistake: I chose the wrong character table.

NexiusTailer avatar Mar 20 '20 23:03 NexiusTailer

I think it's ready to merge. I've tested it again to be sure, and it works correctly.

NexiusTailer avatar Apr 07 '20 11:04 NexiusTailer

One last problem is that this code is very specific to the Russian character set. Not all servers use Windows-1251, so it should not be hardcoded here.

Looking quickly at RFC 3986, I think the only characters that need encoding are the following: % ! * ' ( ) ; : @ & = + $ , / ? # [ ]

So it's probably better to make it encode only those. That way, the code does not favour a specific character set and should produce even better URIs (better meaning fewer %xx escapes).
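
A rough sketch of that idea, written in C rather than Pawn and not taken from strlib: the function name `url_encode_reserved` and its signature are assumptions made for illustration. It escapes only `%` and the characters listed above, and copies every other byte through untouched, so it makes no assumption about Windows-1251 or any other character set.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of strlib): percent-encode only '%' and the
 * RFC 3986 reserved characters listed above. Every other byte, including
 * bytes >= 0x80 from any 8-bit character set, is copied unchanged.
 *
 * Writes at most dst_size - 1 characters plus a terminating NUL and returns
 * the number of characters written. Output is truncated at a whole character
 * or a whole %XX escape if dst is too small.
 */
size_t url_encode_reserved(const char *src, char *dst, size_t dst_size)
{
    static const char reserved[] = "%!*'();:@&=+$,/?#[]";
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    if (dst_size == 0)
        return 0;

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;

        if (strchr(reserved, c) != NULL) {
            /* Need room for three characters plus the final NUL. */
            if (out + 3 >= dst_size)
                break;
            dst[out++] = '%';
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        } else {
            /* Need room for one character plus the final NUL. */
            if (out + 1 >= dst_size)
                break;
            dst[out++] = (char)c;
        }
    }

    dst[out] = '\0';
    return out;
}
```

For example, encoding `a=b&c` into a sufficiently large buffer yields `a%3Db%26c`. Note that under this rule spaces and non-ASCII bytes are left as they are; a stricter reading of RFC 3986 would escape those as well, so whether this is enough depends on what consumes the resulting URI.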

oscar-broman avatar Apr 07 '20 17:04 oscar-broman