PHP-I18N Support explicit encodings other than UTF-8

ISO-8859-1
- Afrikaans
- Albanian
- Basque
- Breton
- Catalan
- Cornish
- Danish
- Dutch
- English
- Estonian
- Faroese
- Finnish
- French
- Galician
- German
- Greenlandic
- Icelandic
- Indonesian
- Irish
- Italian
- Malay
- Manx
- Norwegian
- Occitan
- Portuguese
- Spanish
- Swedish
- Tagalog
- Uzbek
- Walloon
ISO-8859-2
- Bosnian
- Croatian
- Czech
- Hungarian
- Polish
- Romanian
- Serbian
- Slovak
- Slovenian
ISO-8859-3
- Maltese
ISO-8859-5
- Macedonian
- Serbian
ISO-8859-6
- Arabic
ISO-8859-7
- Greek
ISO-8859-8
- Hebrew
ISO-8859-9
- Turkish
ISO-8859-13
- Latvian
- Lithuanian
- Maori
ISO-8859-14
- Welsh
ISO-8859-15
- Basque
- Catalan
- Dutch
- English
- Finnish
- French
- Galician
- German
- Irish
- Italian
- Portuguese
- Spanish
- Swedish
- Walloon
KOI8-R
- Russian
KOI8-U
- Ukrainian
KOI8-T
- Tajik
CP1251
- Bulgarian
- Belarusian
GB2312 / GBK / GB18030
- Chinese (Simplified)
BIG5 / BIG5-HKSCS
- Chinese (Traditional)
EUC-JP
- Japanese
EUC-KR
- Korean
TIS-620
- Thai
GEORGIAN-PS
- Georgian

Source: https://www.gnu.org/software/gettext/manual/html_node/Header-Entry.html

Mar 20 '19 22:03 ocram

Can I work on this issue?

Jun 11 '19 20:06 LS05

Do you know what needs to be done?

In general, we like to talk and discuss concepts and details before building the implementation, and also during the process, to avoid implementations that go in the wrong direction or miss critical details.

Jun 12 '19 15:06 ocram

Hey @ocram, I will first work out what needs to be done and then come up with a plan in the coming days!

Jun 17 '19 22:06 LS05

I have searched the repository for "UTF8" and found that the shell script i18n.sh and the class I18n both have a hardcoded dependency on UTF-8.

Am I missing something else? Am I on the right track?

Jun 25 '19 21:06 LS05

You’re right: both of these files need some changes and generalization.

The shell script would need to accept an optional encoding supplied as a parameter, in a format that works for the Gettext utilities.

In the PHP code, we don’t want to try all those possible encodings for every locale that is specified, so it should definitely first detect which encodings are relevant for each specific locale in question (see the list above). The detected set of relevant encodings should be tried only after UTF-8.
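
As a rough sketch of that lookup, the candidates for one locale could be assembled as below, with UTF-8 always first. The `LEGACY_ENCODINGS` table (only a small excerpt of the list above) and the helper `candidateLocales()` are hypothetical names for illustration and do not exist in the library.

```php
<?php
declare(strict_types=1);

// Hypothetical excerpt of the language-to-encoding list above; a real table
// would cover every language in that list.
const LEGACY_ENCODINGS = [
    'de' => ['ISO-8859-1', 'ISO-8859-15'],
    'pl' => ['ISO-8859-2'],
    'ru' => ['KOI8-R'],
    'ja' => ['EUC-JP'],
];

/**
 * Returns the locale names to try for one locale, UTF-8 first, followed only
 * by the legacy encodings that are relevant for that locale's language.
 *
 * @return string[]
 */
function candidateLocales(string $locale): array
{
    // Drop any codeset or modifier the caller already supplied (e.g. ".UTF-8", "@euro").
    $base = preg_replace('/[.@].*$/', '', $locale);

    // The language is the part before the first "_" or "-" (e.g. "de" in "de_DE").
    $language = strtolower(preg_split('/[_-]/', $base)[0]);

    $candidates = [$base . '.UTF-8'];
    foreach (LEGACY_ENCODINGS[$language] ?? [] as $encoding) {
        $candidates[] = $base . '.' . $encoding;
    }

    return $candidates;
}
```

The resulting array could be passed straight to PHP's `setlocale()`, which accepts an array of locale names and applies the first one the system supports, returning `false` if none of them is available.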

Finally, whatever encoding is used, this must be compatible with PHP’s current internal encoding (and the encoding of any output), whichever those may be.
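
One way to meet that requirement might be to convert each translated string into the internal encoding after lookup. The sketch below assumes the mbstring extension is loaded and that the catalog's encoding is known; `toInternalEncoding()` is a hypothetical name, not part of the library. Where the Gettext extension is available, `bind_textdomain_codeset()` may be the simpler route, since it asks Gettext itself to return messages in a given codeset.

```php
<?php
declare(strict_types=1);

/**
 * Converts a message from the catalog's encoding to PHP's current internal
 * encoding (assumption: mbstring is available and supports both encodings).
 */
function toInternalEncoding(string $message, string $sourceEncoding): string
{
    $target = mb_internal_encoding();

    // Nothing to do when the catalog already matches the internal encoding.
    if (strcasecmp($target, $sourceEncoding) === 0) {
        return $message;
    }

    $converted = mb_convert_encoding($message, $target, $sourceEncoding);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $sourceEncoding to $target");
    }

    return $converted;
}
```

The same check would have to be repeated for the output encoding if it differs from the internal one.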

By the way, I’m not sure how urgent all this is, because most applications should be using UTF-8 now, especially newer ones.

Jun 26 '19 18:06 ocram
