humanize
`intword` uses English units regardless of the active localization
Not all languages group large numbers into A × 10³ + B × 10⁶, etc. For example, Japanese has no word for 10⁶. Instead, it has a word for 10⁴, and 10⁶ is written as 100 of those. Unsurprisingly, 2e7 is not written 20 × 100 × 10⁴, but as 2000 × 10⁴. Translations should be able to specify which powers of 10 have special names.
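To make the request concrete, here is a minimal sketch of what "translations specify which powers of 10 have names" could look like. The `NAMED_POWERS` table and the `intword_sketch` function are hypothetical names invented for illustration; they are not part of humanize, and the rounding is deliberately naive.

```python
# Hypothetical per-locale tables of (power of ten, unit name), largest first.
# Nothing here exists in humanize; it only illustrates the proposal.
NAMED_POWERS = {
    "en": [(9, " billion"), (6, " million"), (3, " thousand")],
    "ja": [(8, "億"), (4, "万")],
}


def intword_sketch(value: int, locale: str) -> str:
    """Format value with the largest named power of ten the locale defines."""
    for exponent, name in NAMED_POWERS[locale]:
        unit = 10 ** exponent
        if value >= unit:
            # One decimal place, mirroring the outputs quoted in this report.
            return f"{value / unit:.1f}{name}"
    # Below the smallest named power: print the number unchanged.
    return str(value)


print(intword_sketch(234909023, "ja"))  # 2.3億
print(intword_sketch(2349090, "ja"))    # 234.9万
print(intword_sketch(234909023, "en"))  # 234.9 million
```

With a table like this, the Japanese grouping by 10⁴ falls out of the data rather than needing special-case code.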
What did you do?
>>> import humanize
>>> humanize.i18n.activate("ja_JP")
<gettext.GNUTranslations object at 0x102fcca00>
>>> humanize.intword(234909023)
'234.9 百万'
>>> humanize.intword(2349090)
'2.3 百万'
What did you expect to happen?
>>> humanize.intword(234909023)
'2.3億'
>>> humanize.intword(2349090)
'234.9万'
What actually happened?
>>> humanize.intword(234909023)
'234.9 百万'
>>> humanize.intword(2349090)
'2.3 百万'
(This is the equivalent of putting in 23490902 and getting "234.9 hundred thousand" in English)
What versions are you using?
- OS: Fedora 39
- Python: 3.12
- Humanize: 4.9.0
Thanks for the report. I'm not sure how well suited this library is to adapt for this, but would review a PR if you'd like to look into it.
Another point: in French, values under 2 million (or under 2 of any other unit) should not be pluralized. Examples: 1 million, 1.1 million, 1.7 milliard.
I thought about kludging something using a translation*, but that would not be a generic solution that also solves this bug. So I think both the string values and the numbers defining the brackets would need to be part of the translation files.
* (like plural_threshold = int(pgettext("intword plural threshold", "1")))
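A rough sketch of the threshold idea, for illustration only: `unit_label` is a hypothetical helper, not a humanize function, and the threshold is passed in directly here, where the kludge above would read it from the translation catalog.

```python
def unit_label(count: float, singular: str, plural: str,
               plural_threshold: float) -> str:
    """Use the singular unit name for counts below plural_threshold."""
    # Hypothetical rule: French would use a threshold of 2.
    form = singular if abs(count) < plural_threshold else plural
    return f"{count:g} {form}"


# French examples from the comment above, with a threshold of 2.
print(unit_label(1, "million", "millions", 2))      # 1 million
print(unit_label(1.7, "milliard", "milliards", 2))  # 1.7 milliard
print(unit_label(2, "million", "millions", 2))      # 2 millions
```

This covers only pluralization; it does not address which powers of 10 are named, which is why a threshold alone would not fix the original bug.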