Plural forms parser for pluralization
Proposed Solution
- Leave default implementation as is
- Allow plugging in a custom pluralization module (already possible)
- Extend Gettext.Plural
  - The plural forms header should be passed to both plural & nplurals
- Based on our experiences going forward, this behavior could also be made the default if people agree to do so.
Problem description
gettext currently does not parse the Plural-Forms header in .po / .mo files. Instead, it ships an Elixir implementation of default pluralization rules. This has a few shortcomings:
Missing Locales
There are 183 ISO 639-1 language codes, while gettext provides default plural rules for only 144 locales.
Mismatch Plural Forms Header / Elixir default
Translation tools such as Poedit, Crowdin, and Transifex all support and can output Plural-Forms headers. If Elixir doesn't honor those headers, the translator's intent and the rendered result can diverge. See:
- https://docs.transifex.com/formats/gettext
- https://github.com/vslavik/poedit/blob/master/src/language_impl_plurals.h
- https://store.crowdin.com/gnu-gettext/
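To make the mismatch concrete: a Plural-Forms header carries both the number of forms and a C-style expression that selects one of them. The following is a minimal sketch that only splits a header into those two parts. The module PluralFormsHeaderSketch and its split/1 function are hypothetical names for illustration, not part of gettext or expo, and evaluating the expression is deliberately left out.

```elixir
defmodule PluralFormsHeaderSketch do
  # Hypothetical helper for illustration only; not a gettext or expo API.

  # Splits a header such as "nplurals=2; plural=(n != 1);" into the
  # number of forms and the raw (unevaluated) plural expression.
  def split(header) when is_binary(header) do
    pattern = ~r/nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*(.+?)\s*;?\s*$/

    case Regex.run(pattern, header, capture: :all_but_first) do
      [nplurals, plural] -> {:ok, String.to_integer(nplurals), plural}
      nil -> :error
    end
  end
end

# The header commonly emitted for English:
PluralFormsHeaderSketch.split("nplurals=2; plural=(n != 1);")
# => {:ok, 2, "(n != 1)"}
```

If the built-in rule for a locale disagrees with either part of this header, the translator's msgstr[n] entries can end up selected by a rule they were not written for.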
XLIFF
XLIFF is the premier interchange format for translators and translation tooling, and it also embeds plural-rule headers:
https://docs.oasis-open.org/xliff/v1.2/xliff-profile-po-1.2-pr-02-20061016-DIFF.pdf
Primitive Fallback
Gettext has extremely primitive locale fallback mechanisms and no proper support for BCP47 language tags. Its ability to resolve the correct Gettext locale is typically just text equality matching: a totally valid locale of en_Latn_US would never match a gettext locale of en. In that case, how would a user specify plural forms if they can't provide a Plural-Forms header themselves?
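As an illustration of what a less primitive fallback could look like, here is a minimal sketch that drops trailing subtags until a known locale matches, loosely following the lookup idea from RFC 4647. LocaleFallbackSketch and resolve/2 are hypothetical names, gettext does not ship such a function, and real BCP47 handling involves considerably more than truncation.

```elixir
defmodule LocaleFallbackSketch do
  # Hypothetical helper for illustration only; gettext does not provide this.

  # Returns the most specific known locale for the requested one,
  # or nil when nothing matches.
  def resolve(requested, known) when is_binary(requested) and is_list(known) do
    requested
    |> String.split(["_", "-"])
    |> candidates()
    |> Enum.find(&(&1 in known))
  end

  # ["en", "Latn", "US"] -> ["en_Latn_US", "en_Latn", "en"]
  defp candidates(subtags) do
    for count <- length(subtags)..1//-1 do
      subtags |> Enum.take(count) |> Enum.join("_")
    end
  end
end

LocaleFallbackSketch.resolve("en_Latn_US", ["de", "en"])
# => "en"
```

With plain equality matching the same lookup finds nothing, which is the situation described above.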
Thanks to @kipcole9 for providing me with detailed arguments in favor of this feature.
Dependencies / Performance
As discussed with @josevalim via Slack, we do not want to introduce new dependencies into the gettext library other than expo. This could be solved either by using nimble_parsec.compile to generate the parser ahead of time, which removes nimble_parsec as a dependency, or by switching to a yacc-based approach.
Relates to
- https://github.com/elixir-gettext/gettext/pull/313
- https://github.com/elixir-gettext/gettext/pull/306
- https://github.com/elixir-gettext/expo/pull/64
- https://github.com/elixir-gettext/expo_plural_forms
Any extension to Gettext.Plural to support the described use cases is welcome.
@josevalim I'll do that once I have time 😊
@maennchen this would be what https://github.com/elixir-gettext/expo_plural_forms is?
@whatyouhide Yes and no:
The repo is able to parse the headers and accurately choose the correct plural form based on that.
It is however not a complete solution yet to everything described in the issue, nor is it currently possible to use this with gettext.
@maennchen why is it not possible to use it with Gettext?
> It is however not a complete solution yet to everything described in the issue
The issue is too broad IMO. For example, the locale fallback doesn't belong together with the parsing of Plural-Forms header IMO, so I think we can split those up.
@whatyouhide The current plural behavior exposes only the locale as an argument. To use the plural form parser, it would need the plural forms header as well.
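To illustrate the gap, here is a rough sketch of a behaviour whose callbacks receive the header alongside the locale. HeaderAwarePlural, its plural_info type, and HeaderFirstPlural are hypothetical names invented for this example. This is not the Gettext.Plural API, and the plural/2 rules are placeholders rather than an evaluation of the header's expression.

```elixir
defmodule HeaderAwarePlural do
  # Hypothetical behaviour for illustration; NOT the Gettext.Plural API.
  @type plural_info :: %{
          required(:locale) => String.t(),
          optional(:plural_forms_header) => String.t()
        }

  @callback nplurals(plural_info()) :: pos_integer()
  @callback plural(plural_info(), non_neg_integer()) :: non_neg_integer()
end

defmodule HeaderFirstPlural do
  @behaviour HeaderAwarePlural

  # Prefer the count declared in the header when one is present.
  @impl true
  def nplurals(%{plural_forms_header: header} = info) do
    case Regex.run(~r/nplurals\s*=\s*(\d+)/, header, capture: :all_but_first) do
      [count] -> String.to_integer(count)
      nil -> nplurals(Map.delete(info, :plural_forms_header))
    end
  end

  # No usable header: fall back to locale-based defaults (tiny sample only).
  def nplurals(%{locale: "ja" <> _rest}), do: 1
  def nplurals(%{locale: _other}), do: 2

  # Placeholder rules; a real implementation would evaluate the header's
  # plural expression here instead.
  @impl true
  def plural(info, n) do
    case nplurals(info) do
      1 -> 0
      _ -> if n == 1, do: 0, else: 1
    end
  end
end

HeaderFirstPlural.nplurals(%{locale: "ru", plural_forms_header: "nplurals=3; plural=(n != 1);"})
# => 3
```

The point of the sketch is only the callback shape: with the locale alone, the first nplurals clause has nothing to read the header from.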
Agreed on the issue being too broad; we should separate plural form detection from the question of choosing the correct translation depending on the user's given locales.