
Missing `other` Unit Pattern for `sd` Locale in the Data for Long Currency

Open younies opened this issue 1 year ago • 5 comments

According to Unicode Technical Standard #35, the other unit pattern must be present in the data.

However, in this PR: https://github.com/unicode-org/icu4x/pull/5351, the data was missing for the sd locale. See this commit: https://github.com/unicode-org/icu4x/pull/5351/commits/e85e959e8675abbbfae200013b6ff6390e00e724.

What shall we do in this case?

younies avatar Aug 15 '24 10:08 younies

Also, since `other` always exists, it would be better to store it in a separate field, e.g. `other_pattern`/`default_pattern: Cow<'data, str>`, so that the code does not need to rely on `unwrap`.
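A minimal sketch of that idea, assuming a hypothetical `UnitPatterns` struct and `PluralCategory` enum (neither is the actual icu4x data struct). The `other` pattern is a required field, so the lookup can always fall back to it without an `unwrap`:

```rust
use std::borrow::Cow;

/// Plural categories, named after the CLDR plural rules.
/// Hypothetical stand-in for whatever type the real code uses.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PluralCategory {
    Zero,
    One,
    Two,
    Few,
    Many,
    Other,
}

/// Hypothetical pattern container: `other` is mandatory,
/// every other category is optional.
#[derive(Clone, Debug)]
pub struct UnitPatterns<'data> {
    /// Required pattern; constructing the struct without it is impossible.
    pub other: Cow<'data, str>,
    /// Patterns for specific categories, if the locale provides them.
    pub specific: Vec<(PluralCategory, Cow<'data, str>)>,
}

impl<'data> UnitPatterns<'data> {
    /// Returns the pattern for `category`, falling back to `other`.
    /// The fallback is a plain field access, so this cannot panic.
    pub fn get(&self, category: PluralCategory) -> &str {
        match self.specific.iter().find(|(c, _)| *c == category) {
            // `&**pattern` turns `&Cow<str>` into `&str`.
            Some((_, pattern)) => &**pattern,
            None => &*self.other,
        }
    }
}
```

With this shape, a locale that lacks `other` fails when the data is built rather than when a string is formatted. The pattern strings used to exercise it would be placeholders, not real CLDR values.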

younies avatar Aug 15 '24 13:08 younies

Can you locate this missing value in CLDR-JSON?

robertbastian avatar Aug 15 '24 14:08 robertbastian

@zbraniecki posted the following in chat:

Younis - can you locate the correct group in https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-units-modern/main/en/units.json ? I'm comparing it to the sd locale data. Is it this: https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-numbers-modern/main/en-001/numbers.json#L168-L170 ? But this one has `other` in sd: https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-numbers-modern/main/sd/numbers.json#L263-L265

sffc avatar Aug 15 '24 19:08 sffc

That^ PR is not related to this. It does not remove the sd locale, because there is data for it.

robertbastian avatar Aug 20 '24 11:08 robertbastian

UTS 35 says:

The numberSystem attribute is used to specify that the given number formatting pattern(s) are to be used when the given numbering system is active. By default, number formatting patterns without a specific numberSystem attribute are assumed to be used for the "latn" numbering system, which is western (ASCII) digits. Locales that specify a numbering system other than "latn" as the default should also specify number formatting patterns that are appropriate for use within the context of the given numbering system. For more information on numbering systems and their definitions, see Section 1: Numbering Systems.

We confirmed that in the sd locale, there is a pattern in latn but not arab, but the default numbering system is arab.
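The situation can be illustrated with a small lookup sketch. Everything here is hypothetical: `LocaleCurrencyPatterns`, `Lookup`, and `lookup_other` are made-up names, and falling back to `latn` is only one possible policy, not something this thread or icu4x has settled on. The point is that the gap is reported explicitly instead of hidden behind an `unwrap`:

```rust
use std::collections::HashMap;

/// Hypothetical per-locale view of the long currency data.
pub struct LocaleCurrencyPatterns {
    /// The locale's default numbering system, e.g. "arab" for sd.
    pub default_numbering_system: String,
    /// `other` patterns keyed by numbering system id ("latn", "arab", ...).
    pub other_patterns: HashMap<String, String>,
}

/// Outcome of looking up the `other` pattern.
#[derive(Debug, PartialEq, Eq)]
pub enum Lookup<'a> {
    /// Found under the locale's default numbering system.
    Default(&'a str),
    /// Missing for the default system; the "latn" pattern was used instead.
    LatnFallback(&'a str),
    /// No pattern at all; the caller decides how to handle this.
    Missing,
}

impl LocaleCurrencyPatterns {
    /// Tries the default numbering system first, then "latn" (an assumed policy).
    pub fn lookup_other(&self) -> Lookup<'_> {
        if let Some(p) = self.other_patterns.get(&self.default_numbering_system) {
            return Lookup::Default(p.as_str());
        }
        match self.other_patterns.get("latn") {
            Some(p) => Lookup::LatnFallback(p.as_str()),
            None => Lookup::Missing,
        }
    }
}
```

For data shaped like the sd case (default system `arab`, pattern present only under `latn`), `lookup_other` returns `Lookup::LatnFallback`, which a data generator could log as a warning.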

There should be a CLDR issue filed about this.

sffc avatar Aug 20 '24 15:08 sffc