
Support more characters by default

Open: motin opened this issue 8 years ago • 1 comment

Had to add the following chars for our transliteration test to pass:

        URLify::add_chars(
            array(
                'Ÿ' => 'Y',
                'µ' => 'u',
                '¥' => 'Y',
                'Ĉ' => 'C',
                'ĉ' => 'c',
                'Ċ' => 'C',
                'ċ' => 'c',
                'Ĝ' => 'G',
                'ĝ' => 'g',
                'Ġ' => 'G',
                'ġ' => 'g',
                'Ĥ' => 'H',
                'ĥ' => 'h',
                'Ħ' => 'H',
                'ħ' => 'h',
                'Ĕ' => 'E',
                'ĕ' => 'e',
                'Ĭ' => 'I',
                'ĭ' => 'i',
                'Ĵ' => 'J',
                'ĵ' => 'j',
                'Ĺ' => 'L',
                'ĺ' => 'l',
                'Ľ' => 'L',
                'ľ' => 'l',
                'Ŀ' => 'L',
                'ŀ' => 'l',
                'ŉ' => 'n',
                'Ō' => 'O',
                'ō' => 'o',
                'Ŏ' => 'O',
                'ŏ' => 'o',
                'Ŕ' => 'R',
                'ŕ' => 'r',
                'Ŗ' => 'R',
                'ŗ' => 'r',
                'Ŝ' => 'S',
                'ŝ' => 's',
                'Ŧ' => 'T',
                'ŧ' => 't',
                'Ŭ' => 'U',
                'ŭ' => 'u',
                'Ŵ' => 'W',
                'ŵ' => 'w',
                'Ŷ' => 'Y',
                'ŷ' => 'y',
                'ſ' => 's',
                'ƒ' => 'f',
                'Ơ' => 'O',
                'ơ' => 'o',
                'Ư' => 'U',
                'ư' => 'u',
                'Ǎ' => 'A',
                'ǎ' => 'a',
                'Ǐ' => 'I',
                'ǐ' => 'i',
                'Ǒ' => 'O',
                'ǒ' => 'o',
                'Ǔ' => 'U',
                'ǔ' => 'u',
                'Ǖ' => 'U',
                'ǖ' => 'u',
                'Ǘ' => 'U',
                'ǘ' => 'u',
                'Ǚ' => 'U',
                'ǚ' => 'u',
                'Ǜ' => 'U',
                'ǜ' => 'u',
                'Ǻ' => 'A',
                'ǻ' => 'a',
                'Ǿ' => 'O',
                'ǿ' => 'o',
                'Ǽ' => 'Ae',
                'ǽ' => 'ae',
                'Ĳ' => 'IJ',
                'ĳ' => 'ij',
                'J' => 'J',
                'ĸ' => 'k',
                'Ŋ' => 'N',
                'ŋ' => 'n',
                'Ẁ' => 'W',
                'ẁ' => 'w',
                'Ẃ' => 'W',
                'ẃ' => 'w',
                'Ẅ' => 'W',
                'ẅ' => 'w',
            )
        );

Unfortunately, I do not know which languages these characters belong to, so I find it difficult to provide a PR while the code is structured by language.

motin, Jul 18 '16 07:07

I think adding an 'other' or 'unknown' option to the list would be a fine way to include these. The 'latin' and 'latin_symbols' languages aren't proper language codes anyway :)

jbroadway, Jul 21 '16 18:07
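
To illustrate the 'other' group idea from the comment above, here is a minimal standalone sketch. It is not URLify's actual internals: the `$maps` layout and the `transliterate()` helper are hypothetical names made up for this example, and only PHP's built-in `strtr()` is relied on. It assumes character maps are grouped by key and shows how a catch-all 'other' group could hold characters whose language is unknown.

```php
<?php
// Hypothetical layout (an assumption, not URLify's real structure):
// replacement tables grouped by key, with 'other' holding characters
// whose language is unknown.
$maps = [
    'de'    => ['Ä' => 'Ae', 'ö' => 'oe', 'ß' => 'ss'],
    'other' => ['Ĉ' => 'C', 'ĉ' => 'c', 'Ŵ' => 'W', 'ŵ' => 'w', 'Ǽ' => 'Ae'],
];

/**
 * Transliterate $text with the requested group taking priority,
 * then fall back to every other group, including 'other'.
 * (Hypothetical helper for illustration only.)
 */
function transliterate(string $text, array $maps, string $language = ''): string
{
    // Start with the preferred language's table, if it exists.
    $table = $maps[$language] ?? [];

    // Array union keeps entries already present, so the preferred
    // language wins when two groups map the same character.
    foreach ($maps as $group) {
        $table += $group;
    }

    // strtr() with an array replaces each key with its value.
    return strtr($text, $table);
}

echo transliterate('Ĉu Ŵ?', $maps), "\n"; // prints "Cu W?"
```

With this shape, characters of unknown origin can be added without deciding which language they belong to, while language-specific groups still take precedence when one is requested.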