zeroclickinfo-goodies
URL Decode: UTF-8 characters are incorrectly decoded
Description
Percent-encoded characters that span more than one byte (that is, more than one %XX sequence) are decoded by URL Decode as if each byte were a separate character. For example, %C3%9C becomes Ã but should be Ü.
Steps to recreate
Search for %C3%9C. This becomes à but should be Ü.
Search for %E4%B8%AD%E6%96%87. This becomes 䏿 but should be 中文.
Tested on Firefox 55.0.3 and Safari 10.1.2, on macOS Sierra 10.12.6.
IA Page: http://duck.co/ia/view/urldecode Maintainer: @mintsoft
Adding the discussion label until the potential bug is scoped out.
Basically all this does is uri_unescape($in) and return, so it might be an upstream bug?
It's most likely that the input string isn't being handled as UTF-8.
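To illustrate the suspected cause, here is a minimal sketch in Python (not the goodie's Perl code). It assumes the bug comes from treating each percent-decoded byte as its own character; the helper names `decode_as_latin1` and `decode_as_utf8` are hypothetical and only serve the illustration.

```python
from urllib.parse import unquote_to_bytes


def decode_as_latin1(text: str) -> str:
    """Mimic the suspected bug: every decoded byte becomes one character."""
    raw = unquote_to_bytes(text)   # percent-decoding yields raw bytes
    return raw.decode("latin-1")   # one byte -> one character (wrong for UTF-8 input)


def decode_as_utf8(text: str) -> str:
    """Expected behaviour: interpret the decoded bytes as UTF-8."""
    raw = unquote_to_bytes(text)
    return raw.decode("utf-8")     # multi-byte sequences -> one character each
```

With these helpers, `decode_as_utf8("%C3%9C")` returns the single character Ü, while `decode_as_latin1("%C3%9C")` returns two characters, starting with Ã, which matches the reported symptom.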
Not only the input string, but also the output string :)
If I write title => "микрокредит hey" instead of title => $decoded, I get the output "микÑокÑÐµÐ´Ð¸Ñ hey". So it doesn't output UTF-8 characters correctly.
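One possible explanation for garbled output of this kind is that the UTF-8 bytes of the string are rendered as if they were Latin-1. The thread does not confirm this, so the Python sketch below is only an illustration under that assumption; `render_bytes_as_latin1` and `repair` are hypothetical names.

```python
def render_bytes_as_latin1(text: str) -> str:
    """Simulate output mojibake: UTF-8 bytes shown as Latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


def repair(mojibake: str) -> str:
    """Reverse the simulation: recover the bytes, then decode them as UTF-8."""
    return mojibake.encode("latin-1").decode("utf-8")
```

Under this assumption, each Cyrillic letter turns into two Latin-1 characters while the ASCII part (" hey") passes through unchanged, and `repair` restores the original string.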
Btw, I've found how to output the correct string, but I don't know why it works :) I'll commit it if you want.