zeroclickinfo-goodies

URL Decode: UTF-8 characters are incorrectly decoded

Open jc86035 opened this issue 8 years ago • 4 comments

Description

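Percent-encoded characters made up of more than one percent-escape (multi-byte UTF-8 sequences) are decoded by URL Decode as if each escape were a separate character. For example, %C3%9C becomes two characters of mojibake (Ã followed by an invisible control character) but should be Ü.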

Steps to recreate

Search for %C3%9C. This is displayed as mojibake beginning with Ã but should be Ü. Search for %E4%B8%AD%E6%96%87. This is displayed as mojibake beginning with ä¸ but should be 中文.

Tested on Firefox 55.0.3 and Safari 10.1.2, on macOS Sierra 10.12.6.


IA Page: http://duck.co/ia/view/urldecode
Maintainer: @mintsoft

jc86035 Sep 01 '17 10:09

Adding the discussion label until the potential bug is scoped out.

pjhampton Sep 01 '17 11:09

Basically all this does is uri_unescape($in) and return, so it might be an upstream bug?

jc86035 Sep 17 '17 11:09

It's most likely that the input string isn't being handled as UTF-8.
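To make the suspected cause concrete, here is a minimal sketch in Python rather than the goodie's Perl. It assumes the unescape step yields raw bytes and that the bug comes from treating each byte as its own character; the helper name `decode_escapes` is hypothetical and is not part of the goodie.

```python
from urllib.parse import unquote_to_bytes


def decode_escapes(text: str, encoding: str) -> str:
    """Percent-decode `text` to raw bytes, then interpret them with `encoding`."""
    raw = unquote_to_bytes(text)  # "%C3%9C" -> b"\xc3\x9c"
    return raw.decode(encoding)


# Assumed buggy behaviour: every byte becomes one character (Latin-1 style).
wrong = decode_escapes("%C3%9C", "latin-1")

# Expected behaviour: the two bytes are combined into one UTF-8 character.
right = decode_escapes("%C3%9C", "utf-8")

print(len(wrong), len(right))  # 2 1
print(right)                   # Ü
```

If this is what happens, the fix would be to decode the unescaped bytes as UTF-8 before they are used as the result, which is the step this sketch performs with `raw.decode("utf-8")`.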

mintsoft avatar Sep 17 '17 12:09 mintsoft

Not only the input string, but also the output string :) If, instead of title => $decoded, I write title => "микрокредит hey", I get the output "микÑокÑÐµÐ´Ð¸Ñ hey". So it doesn't output UTF-8 characters correctly.
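Garbled output of this kind is what typically appears when UTF-8 bytes are displayed one byte per character. The following Python sketch illustrates that effect under the assumption that Latin-1 is the encoding involved; it is not the goodie's actual output path, and it does not try to reproduce the exact string quoted above.

```python
text = "микрокредит hey"

# Encode to UTF-8: each Cyrillic letter takes two bytes.
utf8_bytes = text.encode("utf-8")

# Misread those bytes as Latin-1: every byte turns into its own character.
garbled = utf8_bytes.decode("latin-1")

# Reversing the misreading recovers the original text.
restored = garbled.encode("latin-1").decode("utf-8")

print(len(text), len(utf8_bytes), len(garbled))  # 15 26 26
print(restored == text)                          # True
```

The Ð and Ñ characters in the garbled form correspond to the lead bytes 0xD0 and 0xD1 of the Cyrillic letters, which matches the characters visible in the reported output.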

Btw, I've found how to output the correct string, but I don't know why it works :) I'll commit it if you want.

hektr Sep 22 '17 17:09