message-format-wg

Should custom implementations of custom functions override the standard ones?
Example: let's say that in my app I use the ICU implementation of MessageFormat 2, which also comes with implementations of the standard functions (let's say number, datetime, duration, plural).
Should I be able to register my own number formatter implementation, overriding the standard one?
Or should I register it as a custom formatter with a name that is guaranteed not to collide with the standard ones?
I think in the March discussion with CLDR TC we discussed using a certain prefix for custom functions, the way ISO has reserved names for language / country IDs.
So if the rule is "custom functions start with x-" then I can register "x-com.mihai.number", but not "number"
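A minimal sketch of how such a prefix rule could be checked at registration time. The `x-` prefix comes from the example above; the function names and the `is_library` flag are hypothetical and not part of any MessageFormat 2 API.

```python
CUSTOM_PREFIX = "x-"  # assumed reserved prefix, taken from the example above


def is_custom_name(name: str) -> bool:
    """Return True if `name` lies in the namespace reserved for custom functions."""
    return name.startswith(CUSTOM_PREFIX) and len(name) > len(CUSTOM_PREFIX)


def check_registration(name: str, is_library: bool) -> None:
    """Raise ValueError when a non-library caller registers an unprefixed name.

    Libraries implementing the standard functions are exempt from the rule.
    """
    if not is_library and not is_custom_name(name):
        raise ValueError(
            f"custom function names must start with {CUSTOM_PREFIX!r}: {name!r}"
        )
```

Under this rule, `check_registration("x-com.mihai.number", is_library=False)` passes, while the same call with `"number"` raises.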
Note: libraries are of course allowed to register standard functions, because they implement them.
Pros: allowing overrides makes things flexible. Anyone who is not happy with the standard function can implement their own.
Cons: flexible :-) Imagine using third-party libraries that are all allowed to override the standard functions. We can have conflicts, and we never know what we get, because the order of registration might also matter. So we technically have a mutable global object shared with the world.
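To make the ordering problem concrete, here is a small sketch with a hypothetical process-wide registry and two made-up libraries that both override `number`. None of these names come from a real implementation.

```python
from typing import Callable, Dict

Formatter = Callable[[float], str]

# Hypothetical process-wide registry: every library mutates the same dict.
GLOBAL_REGISTRY: Dict[str, Formatter] = {"number": lambda v: f"{v:.2f}"}


def lib_a_init() -> None:
    """Library A replaces the standard formatter with its own."""
    GLOBAL_REGISTRY["number"] = lambda v: f"A:{v}"


def lib_b_init() -> None:
    """Library B does the same, unaware of library A."""
    GLOBAL_REGISTRY["number"] = lambda v: f"B:{v}"


# Initialization order A, then B: library B's formatter wins.
lib_a_init()
lib_b_init()
result_ab = GLOBAL_REGISTRY["number"](1.5)

# Initialization order B, then A: library A's formatter wins.
lib_b_init()
lib_a_init()
result_ba = GLOBAL_REGISTRY["number"](1.5)
```

The same message formats differently depending only on which library happened to initialize last.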
For custom function names, I thought we settled on having interior dots, like in Java package names: com.google.googlenumber
API-wise, part of the question is whether we should have one or two "set functions" APIs -- standard vs. custom, or only one. And if only one, whether the implementation should always use a built-in standard registry anyway. And if only one, and it replaces the default/standard registry, then someone providing custom functions would need to get the standard registry from somewhere, clone it, and add custom functions. That could be somewhat expensive (cloning & extending maps etc.).
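One way to avoid the cloning cost, sketched under the assumption that the implementation exposes its standard registry as a plain mapping: layer the custom functions over it with a lookup view instead of copying. `STANDARD_FUNCTIONS` and `with_custom_functions` are illustrative names only.

```python
from collections import ChainMap
from types import MappingProxyType
from typing import Callable, Mapping

Formatter = Callable[[object], str]

# Stand-in for an implementation's built-in registry (hypothetical contents).
STANDARD_FUNCTIONS: Mapping[str, Formatter] = {
    "number": lambda v: f"{v:,}",
    "datetime": lambda v: str(v),
}


def with_custom_functions(custom: Mapping[str, Formatter]) -> Mapping[str, Formatter]:
    """Return a read-only view that looks up standard functions first, then custom ones.

    ChainMap keeps references to both mappings, so nothing is cloned, and the
    MappingProxyType wrapper prevents writes from reaching the standard registry.
    """
    return MappingProxyType(ChainMap(STANDARD_FUNCTIONS, custom))


registry = with_custom_functions({"com.example.shout": lambda v: str(v).upper()})
```

Here the standard functions take precedence; swapping the two arguments to `ChainMap` would instead let custom functions override them, which is exactly the policy question raised in this issue.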
I'm having a hard time imagining how exactly a sensible hard restriction on function names could be worded, given that we're rather unlikely to have a single standard library shared by all implementations.
Consider datetime, for instance. For the ICU implementation, it makes sense for the options to match the options already available for datetime formatters, including e.g. something like skeleton. For the JS implementation, it also makes sense for the formatter options to match those already provided for Intl.DateTimeFormat, and this rather explicitly does not include skeleton support.
Now, if we were to require that we have a single standard registry that includes a datetime formatter, that datetime would need to be some stricter subset that's the intersection of at least the ICU and JS implementations, which would not include skeleton. So would the ICU implementation include that datetime along with e.g. org.unicode.icu.datetime that would offer a wider set of options? Or would it rather offer a single implementation-defined datetime?
I do think that a soft restriction should be included, maybe something like this?
Implementations SHOULD require that the names of custom formatting functions include at least one period (.) character.
The intent there is to guide towards establishing a hard restriction in practice, while allowing for an implementation to provide a way to override its defaults and not be restrictive regarding the API that it offers for its users.
> I'm having a hard time imagining how exactly a sensible hard restriction on function names could be worded, given that we're rather unlikely to have a single standard library shared by all implementations.
HTML simply defines what a valid custom element name is, for a similar purpose (such names must contain a hyphen).
However, I think it could be enough to:
- Guarantee that built-in functions will not contain any dots in their names.
- Always pick the implementation's built-in function in case of name conflicts.
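The two guarantees above can be sketched as a small registry class. This is only an illustration; `FunctionRegistry` and its methods are hypothetical names.

```python
from typing import Callable, Dict

Formatter = Callable[[object], str]


class FunctionRegistry:
    """Registry in which built-in functions always win a name conflict."""

    def __init__(self, builtins: Dict[str, Formatter]) -> None:
        # Guarantee 1: built-in names never contain a dot.
        for name in builtins:
            if "." in name:
                raise ValueError(f"built-in name must not contain a dot: {name!r}")
        self._builtins = dict(builtins)
        self._custom: Dict[str, Formatter] = {}

    def register(self, name: str, fn: Formatter) -> None:
        # No name check here: a colliding name is accepted but never selected.
        self._custom[name] = fn

    def resolve(self, name: str) -> Formatter:
        # Guarantee 2: the built-in is always picked on a name conflict.
        if name in self._builtins:
            return self._builtins[name]
        return self._custom[name]
```

With this design a custom function named `number` is silently shadowed, while a dotted name such as `com.example.number` can never collide with a built-in.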
 
> that datetime would need to be some stricter subset that's the intersection of at least the ICU and JS implementations
Or a superset (the union), or something 100% different from both ICU/JS.
It is a new library and standard. We can use the chance to do some cleanup.
For example, the ICU MessageFormat can do date and time with predefined styles (full/long/medium/short), but not datetime.
But both date and time can take a skeleton, and that can be anything.
It can be a time with a yMMMd skeleton, which is really a date.
And the ECMAScript date formatting has way too many ways to control the 12h/24h style, for example.
Implementations can claim to be compliant, or not. Should we "drag down" a new standard (that wants to be universal) with the limitations of the existing ones? If yes, then why stop at ICU and ECMA? Should we also do an intersection with features from Windows, macOS, Posix? Qt, GTK, .NET?
Here is why I think it is really important that the implementations "look the same" (even if they might behave slightly differently).
It would allow reusing messages across OSes / platforms.
If I have to maintain (and translate)
"... {exp :date skeleton=yMMMd} ..."
for desktop and
"... {exp :date year=numeric month=long day=numeric} ..."
for JavaScript, then 90% of the benefit of MessageFormat 2 being a standard goes out the window.
If I have to maintain more than one set of messages, I might as well use a different syntax for the message itself, who cares.
This is a Unicode standard, which is also universal: there is no Windows / Android / JavaScript Unicode.
A platform can either implement the standard, or use a custom (platform) function (for example js.datetime).
An ICU / JS intersection is too narrow, to the point where it is useless. But that is something to argue about when one tries to register something in the standard registry.
What would be important to agree on is the principle that the standard functions have the same options, everywhere.
Although the question is not about that.
The question is: if the standard registry defines datetime (implemented in a platform-dependent way or not), and the user of the implementation tries to define a custom function with the exact same name, what happens?
Use it? Error? Silently ignore the custom one and use the standard?
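The three possible outcomes can be written down as an explicit policy. The sketch below is hypothetical (the enum, the `STANDARD_NAMES` set, and `register` are made-up names) and only illustrates how the behaviors differ.

```python
from enum import Enum
from typing import Callable, Dict, FrozenSet

Formatter = Callable[[object], str]


class ConflictPolicy(Enum):
    USE_CUSTOM = "use"        # the custom function replaces the standard one
    ERROR = "error"           # registration fails loudly
    IGNORE_CUSTOM = "ignore"  # the custom function is silently dropped


# Assumed set of names reserved by the standard registry.
STANDARD_NAMES: FrozenSet[str] = frozenset({"number", "datetime"})


def register(
    registry: Dict[str, Formatter],
    name: str,
    fn: Formatter,
    policy: ConflictPolicy,
) -> bool:
    """Register `fn` under `name`; return True if `fn` is now the active function."""
    if name in STANDARD_NAMES:
        if policy is ConflictPolicy.ERROR:
            raise ValueError(f"{name!r} is a standard function name")
        if policy is ConflictPolicy.IGNORE_CUSTOM:
            return False
    registry[name] = fn
    return True
```

Only `USE_CUSTOM` changes what a message using `datetime` produces; `ERROR` surfaces the collision at registration time, and `IGNORE_CUSTOM` hides it.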
A benefit of allowing custom functions to override the standard implementation is to have shims.
So that one can implement and use a newer version of datetime on an old Android that can't update ICU, for example.
I think this may have been addressed by namespaces (and the associated work on the spec). Marking as resolve-candidate. Please comment if you think there is something else needing attention here.