Add Western Arabic Numerals (ASCII) to A.3 Numeral characters

Open behnam opened this issue 7 years ago • 5 comments

Western Arabic Numerals, encoded as ASCII digits, are the main numerals used in all West Arab regions. The table in Section A.3 Numeral characters is missing them and needs to be fixed.

behnam avatar Oct 31 '17 08:10 behnam

The root cause must be that CLDR only recognizes Eastern Arabic Numerals (Arabic-Indic digits) as the number system for the ar locale.

behnam avatar Oct 31 '17 08:10 behnam

I know we have been trying to follow CLDR (or other standards) so far for the list of characters, but i personally see no problem with diverging if we discover that those don't represent reality. (I actually see it as our job to find places where CLDR needs to be corrected.) So i have no problem with adding Western numerals to the table.

r12a avatar Oct 31 '17 11:10 r12a

/cc @brawer

Right. What we've been doing so far has been to file tickets with CLDR for the fixes, and either wait on those to be resolved, or hard-code what we need.

In this case, it's a tricky matter. I have talked to some people about it, and there isn't (and may never be) a locale in CLDR for Western Arabic or Eastern Arabic to assign these properties to.

Maybe we should look at the Numbering System property for the main locale (ar) and all its sublocales. In this case, this would give us both arab and latn numerals: https://unicode.org/cldr/charts/latest/summary/ar.html
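
A minimal sketch of that idea, in Python: collect the default numbering system of a base locale and of every sublocale, and take the union. The function name `numbering_systems` and the `SAMPLE_DEFAULTS` table are hypothetical; the table is a small hand-written sample for illustration, not an extract of CLDR, so real code would need to read the values from the CLDR data itself.

```python
def numbering_systems(base, defaults):
    """Return the sorted set of default numbering systems used by
    `base` and by any sublocale of `base` (e.g. "ar" and "ar-XX")."""
    prefix = base + "-"
    return sorted(
        {
            system
            for locale, system in defaults.items()
            if locale == base or locale.startswith(prefix)
        }
    )


# Hypothetical sample: locale -> default numbering system.
# Hand-written for illustration only; not loaded from CLDR.
SAMPLE_DEFAULTS = {
    "ar": "arab",
    "ar-EG": "arab",
    "ar-MA": "latn",
    "ar-DZ": "latn",
    "fa": "arabext",
}

print(numbering_systems("ar", SAMPLE_DEFAULTS))  # ['arab', 'latn']
```

With this approach the table in A.3 would list a numeral set whenever any sublocale uses it, rather than only the set attached to the main locale.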

behnam avatar Oct 31 '17 19:10 behnam

Not sure if this helps, but you can put registered number system identifiers into BCP47 extension U. For example, ar-u-nu-arabext is a valid BCP47 language tag. See Unicode TR35 for background.
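
As a rough illustration of how the `nu` keyword sits inside such a tag, here is a simplified Python sketch that pulls it out by hand. The function name `numbering_system` is hypothetical, and the parsing is deliberately naive: it assumes a well-formed tag and does not handle private-use (`-x-`) sequences or validate the identifier against the registered list.

```python
def numbering_system(tag):
    """Return the value of the 'nu' keyword in a tag's -u- extension,
    or None if the tag has no such keyword. Simplified sketch only."""
    subtags = tag.lower().split("-")
    try:
        # Start searching at index 1: the first subtag is the language.
        start = subtags.index("u", 1)
    except ValueError:
        return None

    i = start + 1
    # The extension runs until the next single-character subtag (singleton).
    while i < len(subtags) and len(subtags[i]) > 1:
        if subtags[i] == "nu" and i + 1 < len(subtags):
            value = subtags[i + 1]
            if 3 <= len(value) <= 8:  # keyword values are 3 to 8 characters
                return value
        i += 1
    return None


print(numbering_system("ar-u-nu-arabext"))  # arabext
print(numbering_system("ar-EG"))            # None
```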

brawer avatar Oct 31 '17 21:10 brawer

https://unicode.org/cldr/charts/latest/summary/ar.html#5

Main: arab
Sublocale: latn

CLDR data is correct. The way we're consuming it is not.

shervinafshar avatar Nov 01 '17 00:11 shervinafshar