ecma402

Region Override Support

Open AlexandreGarino opened this issue 6 years ago • 8 comments

Do you plan to support Region Override through the Unicode Locale Extension?

Unicode Locale extensions seem to be managed internally.

const custom = new Intl.Locale('en-US-u-rg-gbzzzz-hc-h24');
console.log(custom.toString()); // en-US-u-hc-h24-rg-gbzzzz

The toString method returns the hour cycle and region override extensions in the opposite order from the input.
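Since Intl.Locale exposes no accessor for the region override, one way to read it back is to parse the canonicalized string. The following is only a sketch: `getRegionOverride` is a hypothetical helper name, not part of Intl, and it assumes the engine canonicalizes the tag (lowercase extension subtags, keywords sorted by key).

```javascript
// Hypothetical helper (not part of Intl): returns the value of the "rg"
// keyword from a locale identifier, or undefined when there is none.
function getRegionOverride(tag) {
  // Intl.Locale canonicalizes the tag, so extension subtags are lowercase.
  const canonical = new Intl.Locale(tag).toString();
  // Isolate the -u- extension sequence; it ends at the next singleton.
  const ext = /-u((?:-[a-z0-9]{2,8})+)/.exec(canonical);
  if (!ext) return undefined;
  // Inside the -u- sequence, a two-character subtag is always a key.
  const match = /-rg-([a-z0-9]{3,8})/.exec(ext[1]);
  return match ? match[1] : undefined;
}

console.log(getRegionOverride('en-US-u-rg-gbzzzz-hc-h24')); // gbzzzz
console.log(getRegionOverride('en-US')); // undefined
```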

So, I guess the internal implementation holds all Unicode Locale Extensions, but the region override is not taken into account when formatting numbers, dates, and so on.

 const locale = new Intl.Locale('en-US-u-rg-gbzzzz');
 console.log(locale.toString()); // OK: en-US-u-rg-gbzzzz
 console.log(locale.language); // OK: en 
 console.log(locale.region); // OK: US
 const date = new Date(Date.UTC(2012, 11, 20, 3, 0, 0));
 console.log('US: ' + new Intl.DateTimeFormat('en-US').format(date)); // OK: 12/20/2012
 console.log('GB: ' + new Intl.DateTimeFormat('en-GB').format(date)); // OK: 20/12/2012
 console.log('GB w/ Region Override: ' + new Intl.DateTimeFormat(locale).format(date)); // KO: 12/20/2012 -> should display 20/12/2012

According to the spec, the region is used as default for:

  • currency
  • calendar
  • week data
  • time cycle
  • measurement system
  • unit preferences
  • number format
  • currency format
  • date/time format
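Until the override is honored natively, a rough userland approximation is to promote the override region to the locale's main region before constructing a formatter. This is a sketch under stated assumptions: `promoteRegionOverride` is a hypothetical helper, it only handles the common `zzzz` form of the value, and it is not equivalent to a real region override because it also switches the dialect (en-US becomes en-GB).

```javascript
// Hypothetical helper (not part of Intl): approximates -u-rg- by replacing
// the locale's region with the override region.
// Caveat: this also changes the dialect, which a true override would not.
function promoteRegionOverride(tag) {
  const locale = new Intl.Locale(tag);
  // Assumption: the rg value is a region code followed by "zzzz".
  const match = /-rg-([a-z]{2}|[0-9]{3})zzzz(?:-|$)/.exec(locale.toString());
  if (!match) return locale;
  // The Intl.Locale "region" option overrides the region subtag of the tag.
  return new Intl.Locale(locale.toString(), { region: match[1].toUpperCase() });
}

const overridden = promoteRegionOverride('en-US-u-rg-gbzzzz');
console.log(overridden.region); // GB

// Formatters accept Intl.Locale objects directly; the output depends on the
// locale data shipped with the engine.
const sample = new Date(Date.UTC(2012, 11, 20, 12, 0, 0));
console.log(new Intl.DateTimeFormat(overridden, { timeZone: 'UTC' }).format(sample));
```

The trade-off is that spelling, plural rules, and other language-level data then come from the override region's locale as well, so this only suits cases where that is acceptable.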

Java 10 supports region override as expected:

import java.text.DateFormat;
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.Month;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Currency;
import java.util.Locale;

class Main {
  public static void main(String[] args) {
    
    Locale locale = Locale.forLanguageTag("en-US-u-rg-gbzzzz");
    System.out.println(locale.getLanguage());
    System.out.println(locale.getCountry());
    
    LocalDate date = LocalDate.of(2012, Month.DECEMBER, 20);
    DateTimeFormatter dtf = DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT);
    
    System.out.println("US: " + date.format(dtf.localizedBy(Locale.forLanguageTag("en-US")))); // OK: 12/20/12
    System.out.println("GB: " + date.format(dtf.localizedBy(Locale.forLanguageTag("en-GB")))); // OK: 20/12/2012
    System.out.println("GB w/ Region Override: " + date.format(dtf.localizedBy(locale))); // OK: 20/12/2012
  }
}

Related issues:

  • #106
  • #257
  • #867 @ GlobalizeJS

AlexandreGarino Aug 18 '19 21:08

@FrankYFTang Is this an ICU issue or a spec issue? Do we need to modify ECMA-402 to allow the -rg- extension, or do we just need to ensure that ICU handles it correctly?

sffc Jan 30 '20 01:01

We'd need to change the spec to respect this, as the spec includes the schema for locale data, and all supported extension keys are explicitly specified. Region override would need to be processed somehow by the spec to permit this.

littledan May 02 '20 13:05

We discussed this in the 2020-06-11 ECMA-402 meeting and agreed to move forward.

sffc Jul 08 '20 23:07

@sffc does this need to be covered under user preferences? I don't think so, perhaps it can be dealt with separately. Question is: PR or proposal? While the spec diff might be larger than many smaller proposals, I don't really think there's much design/decision-making to do here, just make the locale-handling respect this additional subtag, right?

ryzokuken May 28 '21 17:05

Currently in ECMA-402, each Intl object only listens to a restricted set of Unicode extension keys, specified in its [[RelevantExtensionKeys]] internal slot; all others are stripped out while constructing the object, before any matching.

https://tc39.es/ecma402/#sec-internal-slots "[[RelevantExtensionKeys]] is a List of keys of the language tag extensions defined in Unicode Technical Standard 35 that are relevant for the functionality of the constructed objects."

https://tc39.es/ecma402/#sec-intl-collator-internal-slots "10.2.3 Internal Slots The value of the [[AvailableLocales]] internal slot is implementation-defined within the constraints described in 9.1. The value of the [[RelevantExtensionKeys]] internal slot is a List that must include the element "co", may include any or all of the elements "kf" and "kn", and must not include any other elements."

https://tc39.es/ecma402/#sec-intl.datetimeformat-internal-slots "11.3.3 Internal slots The value of the [[AvailableLocales]] internal slot is implementation-defined within the constraints described in 9.1.

The value of the [[RelevantExtensionKeys]] internal slot is « "ca", "hc", "nu" »."

https://tc39.es/ecma402/#sec-Intl.DisplayNames-internal-slots "12.3.3 Internal slots The value of the [[AvailableLocales]] internal slot is implementation-defined within the constraints described in 9.1.

The value of the [[RelevantExtensionKeys]] internal slot is « »."

https://tc39.es/ecma402/#sec-Intl.ListFormat-internal-slots "13.3.3 Internal slots The value of the [[AvailableLocales]] internal slot is implementation-defined within the constraints described in 9.1.

The value of the [[RelevantExtensionKeys]] internal slot is « »."

https://tc39.es/ecma402/#sec-intl.numberformat-internal-slots "15.3.3 Internal slots The value of the [[AvailableLocales]] internal slot is implementation-defined within the constraints described in 9.1.

The value of the [[RelevantExtensionKeys]] internal slot is « "nu" »."

https://tc39.es/ecma402/#sec-intl.pluralrules-internal-slots "16.3.3 Internal slots The value of the [[AvailableLocales]] internal slot is implementation-defined within the constraints described in 9.1.

The value of the [[RelevantExtensionKeys]] internal slot is « »."

So, at a minimum, ECMA-402 needs to be changed to put "rg" into [[RelevantExtensionKeys]] for it to be considered by the spec, if we would like to support Region Override.
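The stripping can be observed from script by inspecting the resolved locale of a constructed object. This is a minimal sketch: `resolvedExtensionKeys` is a hypothetical helper, and the expectation that "rg" is absent assumes an implementation that follows the current [[RelevantExtensionKeys]] lists.

```javascript
// Hypothetical helper (not part of Intl): lists the Unicode extension keys
// that survive locale resolution for a given Intl constructor.
function resolvedExtensionKeys(Ctor, tag) {
  const resolved = new Ctor(tag).resolvedOptions().locale;
  // Isolate the -u- extension sequence of the resolved locale, if any.
  const ext = /-u((?:-[a-z0-9]{2,8})+)/.exec(resolved);
  if (!ext) return [];
  // Keys are the two-character subtags; values have three to eight characters.
  return ext[1].split('-').filter((subtag) => subtag.length === 2);
}

// On an implementation that follows the current spec, "rg" should not be
// among the keys reported for either constructor.
console.log(resolvedExtensionKeys(Intl.DateTimeFormat, 'en-US-u-rg-gbzzzz-hc-h23'));
console.log(resolvedExtensionKeys(Intl.NumberFormat, 'en-US-u-rg-gbzzzz-nu-latn'));
```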

FrankYFTang Jun 02 '21 18:06

Discussion with @FrankYFTang @ben-allen @sffc: in order to move forward with this, the implementation (ICU) needs to fully support the -u-rg subtag, and this is challenged by the fact that there is no clear list of what items fall into the "dialect region" versus the "extension region" bucket. For example:

  • Spelling and pluralization rules: clearly the dialect region
  • Measurement unit preferences: clearly the extension region
  • Grouping separators, datetime patterns: unclear

This is partly tracked upstream in: https://unicode-org.atlassian.net/browse/CLDR-15265

sffc May 02 '23 21:05

The items impacted by rg are listed in https://github.com/unicode-org/cldr/blob/main/common/supplemental/rgScope.xml

We should also analyze which Intl objects should be impacted by -u-rg-.

FrankYFTang Jul 26 '23 22:07

I think we should wait until Intl Locale Info lands, and then we should put together this proposal (which should be small to medium in size). ICU4X can make this easier to implement. ICU4C should already implement -u-rg for certain key resources like unit preferences.

sffc May 02 '24 23:05