ecma402
Complete the extension keys + options + resolvedOptions set
This is the first part of the spin-off from #68.
Currently, there are two ways of providing parameters into Intl APIs: via extension keys on locale tag and via options.
There is also one way to test which values have been assigned to the options: resolvedOptions.
The issue is that the list of extension keys only partially overlaps with the list of options. Sometimes there is an option but we ignore the available extension key, sometimes there is no option and we only accept the param via extension key, and sometimes we allow both.
In general, my recommendation is to get to the point where every param available through extension keys is also available through options, with options taking precedence.
Here's the breakdown per API:
Collator
- Extension Keys:
- co
- kn
- kf
- Options:
- numeric
- caseFirst
- (new) collation
- resolvedOptions:
- collation
- numeric
- caseFirst
This one is the simplest. We already allow for kn and kf and their counterparts in options. The only missing option is collation.
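A minimal sketch of the kn/numeric and kf/caseFirst overlap described above, using only the standard Intl.Collator constructor and resolvedOptions. The locale tags and sample strings are arbitrary choices for illustration.

```javascript
// Extension key only: -u-kn-true turns on numeric collation.
const fromKey = new Intl.Collator('de-u-kn-true').resolvedOptions();

// Option only: the same setting through the options bag.
const fromOption = new Intl.Collator('de', { numeric: true }).resolvedOptions();

// Both present and in conflict: the option takes precedence over the key.
const optionWins = new Intl.Collator('de-u-kn-true', { numeric: false }).resolvedOptions();

// kf maps to caseFirst in the same way.
const caseFirstFromKey = new Intl.Collator('de-u-kf-upper').resolvedOptions().caseFirst;

// Numeric collation compares digit runs by value instead of character by character.
const sorted = ['10', '9', '2'].sort(new Intl.Collator('de', { numeric: true }).compare);

console.log(fromKey.numeric, fromOption.numeric, optionWins.numeric); // true true false
console.log(caseFirstFromKey); // upper
console.log(sorted); // [ '2', '9', '10' ]
```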
NumberFormat
- Extension Keys:
- nu
- (new) cu
- Options:
- currency
- (new) numberingSystem
- resolvedOptions:
- numberingSystem
- currency
Here we have extension keys supporting nu but not cu and options that support the reverse.
Adding cu to extension keys and numberingSystem to options gives us complete coverage.
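To make the asymmetry concrete, here is a small sketch using the standard Intl.NumberFormat API. The last line assumes an engine that already accepts the numberingSystem option marked as (new) above; the locale tags are arbitrary.

```javascript
// nu is honored as an extension key.
const nuFromKey = new Intl.NumberFormat('en-u-nu-arab').resolvedOptions().numberingSystem;

// currency is accepted as an option.
const currencyFromOption = new Intl.NumberFormat('en', {
  style: 'currency',
  currency: 'EUR',
}).resolvedOptions().currency;

// cu is not read from the locale tag: without a currency option,
// the currency style has no currency to use and the constructor throws.
let cuIgnored = false;
try {
  new Intl.NumberFormat('en-u-cu-eur', { style: 'currency' });
} catch (e) {
  cuIgnored = e instanceof TypeError;
}

// Assumption: the engine supports the numberingSystem option.
const nuFromOption = new Intl.NumberFormat('en', {
  numberingSystem: 'arab',
}).resolvedOptions().numberingSystem;

console.log(nuFromKey, currencyFromOption, cuIgnored, nuFromOption);
```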
DateTimeFormat
- Extension Keys:
- nu
- ca
- (new) hc
- (new) tz
- Options:
- timeZone
- (new) calendar
- (new) hourCycle
- (new) numberingSystem
- (deprecate) hour12
- resolvedOptions:
- timeZone
- numberingSystem
- calendar
- (new) hourCycle
- (deprecate) hour12
Here's the most complex situation.
We don't support an extension key hc (hourCycle) which can take one of four values (h12, h11, h23, h24)[0] and instead we use a boolean hour12 option.
I suggest that we add hourCycle option and hc to extension keys and optionally deprecate hour12.
On top of that, we should add tz to relevant extension keys, and add calendar and numberingSystem to options.
[0] http://www.unicode.org/reports/tr35/#Locale_Extension_Key_and_Type_Data
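For reference, a sketch of how the hc key and the hourCycle and calendar options look in an engine that has since implemented them. This assumes such an engine; it says nothing about tz, which is still only an option here.

```javascript
// hc as an extension key (hour must be requested for hourCycle to be resolved).
const viaKey = new Intl.DateTimeFormat('en-US-u-hc-h23', {
  hour: 'numeric',
}).resolvedOptions();

// hourCycle as an option.
const viaOption = new Intl.DateTimeFormat('en-US', {
  hour: 'numeric',
  hourCycle: 'h23',
}).resolvedOptions();

// Key and option in conflict: the option takes precedence.
const optionWins = new Intl.DateTimeFormat('en-US-u-hc-h23', {
  hour: 'numeric',
  hourCycle: 'h12',
}).resolvedOptions();

// ca as an extension key, and calendar as an option overriding it.
const calendarFromKey = new Intl.DateTimeFormat('en-US-u-ca-buddhist').resolvedOptions().calendar;
const calendarFromOption = new Intl.DateTimeFormat('en-US-u-ca-buddhist', {
  calendar: 'gregory',
}).resolvedOptions().calendar;

// timeZone is accepted as an option.
const timeZone = new Intl.DateTimeFormat('en-US', { timeZone: 'UTC' }).resolvedOptions().timeZone;

console.log(viaKey.hourCycle, viaOption.hourCycle, optionWins.hourCycle);
console.log(calendarFromKey, calendarFromOption, timeZone);
```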
Looking for feedback.
cc @jshin
Obviously, the deprecation process is the problem here. I wonder if we can figure out a way to have some sort of synergy here that preserves the hour12 option. E.g.:
- if you're not using the hour12 option at all, we set it to true if hourCycle is h12 or h11, and to false if hourCycle is h23 or h24.
- if you're passing the hour12 option and no hourCycle, we can determine the right value for hourCycle based on the locale information and the value of hour12 provided.
- if you're passing both, we can probably trigger a runtime error.
All that seems to be very straightforward and completely backward compatible, but... there is always a but... there is one more combination that is more complicated:
- if you're providing the extension key hc and you're providing the hour12 option as well (and this could pretty much happen without changing the instantiation code), we will not be able to throw (it would be a non-BC change).
I wonder what we can do in this case?
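The three straightforward rules above could be sketched roughly like this. Everything here is hypothetical: resolveHourSettings is not a spec operation, localeDefault stands in for whatever hour cycle the locale data provides, and the fallback choices (h12 and h23) when hour12 disagrees with the locale are my assumption. The open hc plus hour12 case is deliberately not addressed.

```javascript
const TWELVE_HOUR_CYCLES = new Set(['h11', 'h12']);

// Hypothetical helper: derive a consistent { hourCycle, hour12 } pair
// from the options bag and the locale's default hour cycle.
function resolveHourSettings(options, localeDefault) {
  const { hour12, hourCycle } = options;

  // Rule 3: passing both options is treated as an error.
  if (hour12 !== undefined && hourCycle !== undefined) {
    throw new TypeError('hour12 and hourCycle cannot be combined');
  }

  // Rule 1: no hour12 given, so derive it from the hour cycle in effect.
  if (hour12 === undefined) {
    const effective = hourCycle !== undefined ? hourCycle : localeDefault;
    return { hourCycle: effective, hour12: TWELVE_HOUR_CYCLES.has(effective) };
  }

  // Rule 2: only hour12 given, so pick an hour cycle that agrees with it.
  // Assumption: keep the locale default when it already agrees,
  // otherwise fall back to h12 (12-hour) or h23 (24-hour).
  const localeIsTwelveHour = TWELVE_HOUR_CYCLES.has(localeDefault);
  if (hour12) {
    return { hourCycle: localeIsTwelveHour ? localeDefault : 'h12', hour12: true };
  }
  return { hourCycle: localeIsTwelveHour ? 'h23' : localeDefault, hour12: false };
}

console.log(resolveHourSettings({}, 'h12'));                   // { hourCycle: 'h12', hour12: true }
console.log(resolveHourSettings({ hourCycle: 'h23' }, 'h12')); // { hourCycle: 'h23', hour12: false }
console.log(resolveHourSettings({ hour12: false }, 'h12'));    // { hourCycle: 'h23', hour12: false }
```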
/cc @ericf @juandopazo
I wonder what can we do in this case?
Can we just assume that hourCycle is newer than hour12, so if both are present we just go with the newer, more complete one?
No, because that is not BC. Imagine an app that is supposed to use hour12 (e.g., an abstraction of react-intl): it receives a locale and infuses all the settings. At this point, if someone passes in a locale with extension keys, the results might be different without any change to the app code. Not a super big deal, but something to consider.
Hmmm
How about - if both are present in the same place (options) then the newer one takes precedence. But generally, options take precedence over extension keys, so:
new Intl.Locale('de-u-hc-h24', {
hour12: true
}).hourCycle; // 'h12'
new Intl.Locale('de-u-hc-h24').hourCycle; // 'h24'
new Intl.Locale('de', {
hour12: true,
hourCycle: 'h24'
}).hourCycle; // 'h24'
That some of the parameters can’t be set from the locale was a deliberate design choice in the first edition of the specification. From my introduction to the API:
The Unicode extension creates two issues:
- [...]
- Some of the parameters don’t really have anything to do with a language or locale; they’re orthogonal and applications should be able to fully control them (a language tag typically is a user setting). Currencies in particular depend on business requirements and should never be derived from a locale.
The algorithms involved in locale and parameter negotiation solve these issues in two ways:
- [...]
- The API distinguishes between three groups of parameters: those that are related to the locale and are always derived from the language tag, those that should be fully under application control and are solely obtained from the options object, and those that can be derived from the language tag, but also overridden by the application. Tables in the sections on Collator, NumberFormat, and DateTimeFormat below show for each parameter how it can be set.
The need to decide for each key in the Unicode extension how it should be treated in the API unfortunately means that the specification cannot allow implementations to support newly added keys.
oh, good to know !
So, this is just a revisit of the same area after 4 years I guess.
From the intro:
The API distinguishes between three groups of parameters: those that are related to the locale and are always derived from the language tag, those that should be fully under application control and are solely obtained from the options object, and those that can be derived from the language tag, but also overridden by the application.
I guess that my point is that there seem to be more parameters that belong to the third group.
Examples:
- numberingSystem - I'm not sure why it cannot come from an option.
- hourCycle - that seems like information that may come from a locale string
- calendar - another option that should be controllable through options
- collation - if we're allowing extension keys to specify that, why not allow controlling it via options? (with similar behavior of ignoring the subset of options resolved via usage)
- timeZone - since we are going to accept it as an option, and if that's not provided try to retrieve it from the host system, I'm not sure why we wouldn't also want to respect it being in the locale string.
The only one that seems to clearly fall into the 2nd category (extension keys that shouldn't affect the API) is cu.
While it's a valuable bit of information and I'd like navigator.locales to return it from the host system, it should not be used to format the number automatically, because we're not doing any conversion.
So, in the light of your comment, my updated proposal would be to do what I listed above minus cu extension key on NumberFormat which would remain ignored.
What do you think @NorbertLindenberg ?
I'm obviously interested to know @NorbertLindenberg's thoughts on the above comment, but FWIW I support the goal of this issue, i.e., I believe that improving the consistency of the mapping between formatter options and their corresponding Unicode locale keys is a good thing (as proposed above).
Thanks, @NorbertLindenberg, for chiming in. I meant to say what you said, but haven't managed to.
As @zbraniecki wrote, 'cu' clearly belongs to the 2nd category. I have a bit of reservation about 'timeZone' as well.
And, overall, I also support the goal of this issue.
As a general comment, I’d be cautious about adding parameters just for consistency. Everything you add to the API has a cost in terms of specification, conformance test suite, multiple implementations, multiple implementation test suites, and long-term maintenance of all of them. Without actual use cases that could be a high cost with no return.
On @zbraniecki's specific examples:
- numberingSystem: Are there use cases where applications would want to use a numberingSystem that doesn’t relate to the language in use? In common use cases a language may use two numbering systems, the traditional one for the language and the Western one, and the user prefers one or the other.
- hourCycle: Seems reasonable to add to locale. BCP 47 didn't have this in 2012.
- calendar: See https://bugs.ecmascript.org/show_bug.cgi?id=696
- collation: Use cases? Note that several of the values only make sense for a small set of languages.
- timeZone: I think of languages and time zones as independent entities. My language preferences don’t change when I travel from one time zone to another, or when I add an event to my calendar that has been scheduled in a different time zone. On the other hand, when I learn a new language, it doesn’t change my time zone. Why would you want to tie them together?
As a general comment, I’d be cautious about adding parameters just for consistency. Everything you add to the API has a cost in terms of specification, conformance test suite, multiple implementations, multiple implementation test suites, and long-term maintenance of all of them. Without actual use cases that could be a high cost with no return.
I understand.
Let's break down my proposed changes:
- Extension key hc
I believe there's major value in supporting this extension key, since it's an elegant way to let us use the proposed navigator.locales API to retrieve the user's preference for this value and feed it into Intl APIs.
- Option hourCycle
This is not necessary, but I believe it's better to aim for an API that is less confusing if we can, and there's an easy path forward to add hourCycle with backward compatibility for hour12.
- Extension key tz
We currently support providing timezone as an option and we also look up the timezone in the host system, but if the locale string has it specified, we ignore this setting.
If we make progress with the Intl.Locale proposal and navigator.locales, we'll have a way to retrieve settings from the host environment in the form of locale strings that will be provided to Intl APIs.
I think that once we get there, we will want to store timeZone in the locale string:
let loc = new Intl.Locale('pl', {
timeZone: 'PST'
});
loc.toString(); // pl-u-tz-pst
let loc2 = new Intl.Locale('pl-u-tz-pst');
loc2.timeZone; // 'PST'
// or:
navigator.locales = ['pl-u-tz-pst', 'en-US-u-tz-pst'];
date.toLocaleString(navigator.locales);
I understand your perspective, but I don't think there will be a scenario where the locale string passed to the API has tz, the developer does not pass timeZone manually, and the user does not want the tz to be respected.
- Option calendar
I think it's worth adding it now.
- Option numberingSystem
I believe that there are use cases where the app author knows they want to use the arab numberingSystem irrespective of user preferences. And once we start providing nu in navigator.locales from OS preferences (Win10 supports numbering system choice), it will be useful for the developer to be able to override it.
- Extension key collation
Yeah, not super attached to it I guess. I don't see any use case.
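On the tz item above: since the key is currently ignored, an application that wants it respected has to read it out of the tag itself. A rough sketch, where unicodeKeyword, SHORT_TZ_IDS and formatterFor are all hypothetical names, the parser only handles simple well-formed tags, and the one-entry table stands in for the CLDR short time zone ID data (uslax is the CLDR short ID for America/Los_Angeles):

```javascript
// Hypothetical helper: read one keyword from the -u- extension of a locale tag.
// Sketch only: it does not guard against a private-use "x" section.
function unicodeKeyword(tag, key) {
  const parts = tag.toLowerCase().split('-');
  const start = parts.indexOf('u');
  if (start === -1) return undefined;
  let i = start + 1;
  // Skip attributes (3 to 8 characters) that may precede the first key.
  while (i < parts.length && parts[i].length > 2) i++;
  // Keys are exactly two characters; any other singleton ends the extension.
  while (i < parts.length && parts[i].length === 2) {
    const current = parts[i++];
    const values = [];
    while (i < parts.length && parts[i].length > 2) values.push(parts[i++]);
    if (current === key) return values.join('-');
  }
  return undefined;
}

// Illustrative one-entry table; a real mapping would come from CLDR data.
const SHORT_TZ_IDS = { uslax: 'America/Los_Angeles' };

// Hypothetical wrapper: an explicit timeZone option wins, then the tz key,
// then (when both are missing) the host default.
function formatterFor(tag, options = {}) {
  const shortId = unicodeKeyword(tag, 'tz');
  const fromLocale = shortId === undefined ? undefined : SHORT_TZ_IDS[shortId];
  const timeZone = options.timeZone !== undefined ? options.timeZone : fromLocale;
  return new Intl.DateTimeFormat(tag, { ...options, timeZone });
}

console.log(formatterFor('pl-u-tz-uslax').resolvedOptions().timeZone);
console.log(formatterFor('pl-u-tz-uslax', { timeZone: 'UTC' }).resolvedOptions().timeZone);
```

This keeps the precedence proposed in this thread (option over extension key) without requiring the formatter itself to understand tz.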
Feedback?
~~BCP 47 is bigger than just JavaScript, so I think it's reasonable to present a number of things in that format. On the other hand, the options bag arguments are probably easier for developers to manipulate manually (especially given our lack of a built-in library for manipulating BCP 47 strings). Intuitively, supporting the full combination makes sense to me--it'd be nice to have a table somewhere to translate between BCP 47 names and options/resolvedOptions names, and ensure that we coordinate with other APIs about the name choices to the extent it makes sense.~~
~~As we're making changes here, we can also consider whether we should be adding more features based on things that are in BCP 47 flags that we don't currently expose (with those new features going through the staging process). The current BCP 47 tags can be found in the XML documents at http://unicode.org/repos/cldr/trunk/common/bcp47/ ; there's a lot that aren't made available in ECMA 402. This could solve a bunch of problems that have been mentioned, for example, if we put the BCP 47 fw calendar tag in resolvedOptions for DateTimeFormat, then that would give us the first day of the week.~~
EDIT: Rereading this thread, I don't think the first paragraph made much sense, given the prior discussion from @NorbertLindenberg and @jungshik , and the second paragraph might be solved another way.
I wrote the proposal for adding hourCycle to DateTimeFormat: https://github.com/zbraniecki/proposal-ecma402-hourcycle
@caridy , @littledan - there seems to be an initial consensus on adding this feature. I'd like to propose it for Stage 1 or 2, depending on if there are any concerns.
This proposal looks good to me. I appreciate the attention to detail here, including using [[hourCycle]] rather than [[hour12]] as a unified, normalized internal slot. I filed a bug about a potential compatibility issue. @jungshik , could you take a look?
current coverage of options and extension keys - https://docs.google.com/spreadsheets/d/1UoJap3c4Bqi6GCiT2CrG12-8-IFrxXFs9F25yKHcqrc/edit?usp=sharing
I started by collecting it via a script - https://github.com/zbraniecki/ecma402-compare-options-and-extkeys
but @littledan suggested that we should also look into the sources and try to see if there's more than this test covers.
We informed the committee about the proposal for hourCycle and deprecation of hour12. The response was supportive and we agreed that this only requires a PR, not a full spec proposal with stages.
At the same time, we agreed to give implementers time to react to the proposal, so we'll wait for the next 2 months, and if there is no more feedback, we'll merge it into the spec.
Committee also requested that we document the pattern used here for the deprecation/replacement for the future.
@juandopazo - are you still interested in leading the hourCycle PR effort?
I don't think this should be closed yet @caridy - we only added hc :)
fair enough.
hourCycle and hc are now available in the latest Firefox Nightly.
@zbraniecki Where exactly do we stand on this issue? Do we still have any APIs that pivot based on locale extension keywords but don't have a corresponding option in the options bag?
@sffc Yes, we do, for example we decided not to add the collation option. @NorbertLindenberg 's feedback in this thread.
I synced with @NorbertLindenberg on this topic at IUC last October.
The use case for having options able to be specified in the options bag rather than only in the locale string is that these user preferences often come from application settings. I'm a big advocate for using the locale string to hold the user preferences, but I also don't see harm in letting them be passed in on a case-by-case basis across the board.
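As an illustration of the two routes, a sketch assuming an engine with Intl.Locale and the hourCycle and numberingSystem options on DateTimeFormat. The appSettings object is a made-up stand-in for preferences loaded from application settings.

```javascript
// Hypothetical preferences loaded from application settings.
const appSettings = { hourCycle: 'h23', numberingSystem: 'latn' };

// Route 1: fold the preferences into the locale string once, then reuse it.
const locale = new Intl.Locale('de', appSettings);
const viaLocale = new Intl.DateTimeFormat(locale.toString(), {
  hour: 'numeric',
}).resolvedOptions();

// Route 2: pass the same preferences through the options bag on each call.
const viaOptions = new Intl.DateTimeFormat('de', {
  hour: 'numeric',
  ...appSettings,
}).resolvedOptions();

console.log(viaLocale.hourCycle, viaLocale.numberingSystem);
console.log(viaOptions.hourCycle, viaOptions.numberingSystem);
```

Both routes should resolve to the same settings, which is why allowing either one across the board costs the developer nothing in behavior.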
@sffc I don't really see the harm either. From your conversation, did you establish that @NorbertLindenberg no longer holds his previous concerns?
I haven’t paid much attention to ECMA-402 in recent years, so please ignore whether I do or don’t hold my previous concerns. Instead, I’d suggest that you evaluate based on your knowledge of the current goals and constraints for ECMA-402, which differ in some ways from those for 1.0, which, if any, of the old concerns still apply, and proceed based on that evaluation.
@FrankYFTang volunteered to work on this issue.
Related: #380