message-format-wg
Syntax design should aid reader in what is translatable
If we invest in error fallback, we're likely to follow Fluent's decision to ask developers to always use named parameters rather than positional ones, because names are more meaningful in a fallback scenario.
In that case, we'll greatly reduce the scenarios where # or 0 is used, in favor of emailCount, userName, etc.
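To illustrate why named parameters read better in a fallback, here is a minimal Python sketch. The `{name}` placeholder syntax and the `format_with_fallback` helper are assumptions made for this example only; they are not Fluent's or MessageFormat's actual API.

```python
import re

# Hypothetical placeholder syntax for this sketch: {name} or {0}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")

def format_with_fallback(pattern: str, args: dict) -> str:
    """Substitute placeholders; keep the placeholder visible when the argument is missing."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key in args:
            return str(args[key])
        # Fallback: show the placeholder itself, so a named one stays readable.
        return "{" + key + "}"
    return PLACEHOLDER.sub(substitute, pattern)

# With no arguments supplied, both messages fall back to their placeholders.
print(format_with_fallback("You have {0} new emails, {1}.", {}))
print(format_with_fallback("You have {emailCount} new emails, {userName}.", {}))
```

In the second message the reader can still guess what is missing, while `{0}` and `{1}` carry no meaning on their own.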
The resulting problem is that variables, references, variant names, argument names and argument values become similar to translatable words.
In Fluent, we tried to follow the rule that everything that is not meant to be translated has some sigil around it. For example, compare:
Welcome = { function ->
one: Welcome to the World, { user }
other: Welcome to the Worlds, { function2(user, units: two) }
}
with:
welcome-message-one = { FUNCTION() ->
[one] Welcome to the World, { $user }
*[other] Welcome to the Worlds, { FUNCTION2($user, minimumFractionUnits: "two") }
}
Several things to notice about this syntax:
- the message ID is dash-delimited. We don't enforce it, but we strongly encourage longer, more meaningful ids that pose less risk of collision when combining multiple resources into a single context, and that may help the user understand the message as a last resort. Finally, they are also likely to help the reader of the syntax know that welcome-message-one is not a sentence.
- variables are denoted by $ because they show up often, and usually will have fairly short, English-sounding names. $ is meant to help the user recognize that this is not a translatable word
- variant keys are in [] brackets
- function names are all-caps
- function arguments are long camel-case
- function argument values are in ""
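The conventions listed above can be sketched as a small classifier. This is only an illustration in Python, assuming a simplified tokenizer with no nested braces; `tokens` and `classify` are hypothetical helpers, not part of Fluent.

```python
import re

# Simplified tokenizer for the inside of a placeable such as { FUNCTION2($user, ...) }.
# Assumption: quoted strings contain no escaped quotes, and braces are not nested.
TOKEN = re.compile(r'"[^"]*"|\$?[\w-]+')

def tokens(placeable: str) -> list:
    """Return the identifier-like tokens found inside a placeable."""
    return TOKEN.findall(placeable)

def classify(token: str) -> str:
    """Label a token by the sigil or naming convention that marks it as non-translatable."""
    if token.startswith("$"):
        return "variable"
    if token.startswith('"') and token.endswith('"'):
        return "literal"
    if re.fullmatch(r"[A-Z][A-Z0-9_]*", token):
        return "function"
    return "unmarked"

for source in ('{ function2(user, units: two) }',
               '{ FUNCTION2($user, minimumFractionUnits: "two") }'):
    print([(t, classify(t)) for t in tokens(source)])
```

Every token of the first placeable comes out as "unmarked", which is the readability problem described above. In the second, only the argument name remains unmarked.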
We did make some shortcuts: message references don't have a sigil, and argument names may still be confusing. But I think that if we decide to go for syntax, and not just a data model, we should strongly consider designing the syntax with readability in mind.
I think this is something worth discussing. I find that the Fluent conventions don't really map well to the way I see things. There is very little (no?) difference between "....{$foo}..." and "...{-foo}..." Yes, one comes from resources, and one comes as a parameter. But who cares? It makes no linguistic difference.
It is still unclear if "" in argument values are localizable or not.
The camel-case does not help for single words (is year localizable? Why not?)
Same for message ID. If it is too short and does not use - then it's localizable?
I would rather go towards a syntax where it is clear what is localizable and what is not based on the "grammar" of our language, not on naming conventions.
As a programmer, I prefer to know that something is an int because it is declared as int, not because it starts with i, and that something is private because it is declared private, not because it starts with _.
I think that if we want non-programmers to touch these strings and minimize errors we should have something less ambiguous. And maybe associated with some kind of "schema" to make things extensible.
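One way to read "declared, not named" is a data model in which each part of a message carries its kind explicitly. The sketch below is a Python illustration under that assumption; the `Text`, `Variable`, and `Literal` types and the `translatable` flag are hypothetical and do not come from any existing specification.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass(frozen=True)
class Text:
    """Literal message text; always translatable."""
    value: str

@dataclass(frozen=True)
class Variable:
    """Runtime argument; never translatable, whatever its name looks like."""
    name: str

@dataclass(frozen=True)
class Literal:
    """Argument value; translatable only if declared so."""
    value: str
    translatable: bool = False

Part = Union[Text, Variable, Literal]

def translatable_strings(parts: List[Part]) -> List[str]:
    """Collect the strings a translator should touch, based on declared kind only."""
    result = []
    for part in parts:
        if isinstance(part, Text):
            result.append(part.value)
        elif isinstance(part, Literal) and part.translatable:
            result.append(part.value)
    return result

message = [Text("Welcome to the World, "), Variable("year"), Literal("two")]
print(translatable_strings(message))
```

Here a variable called `year` is excluded because it is declared as a `Variable`, not because of how its name is spelled.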
I would rather go towards a syntax where it is clear what is localizable and what is not based on the "grammar" of our language, not on naming conventions.
Agreed. I wasn't advocating for Fluent syntax per se, and I know your opinion on it differs from mine. I was just using it as an example of a way of thinking about syntax that is organized around this topic.
It's unclear to me, but this appears to have been addressed by the creation of the syntax.
Closing resolve-candidates per discussion in 2023-07-24 call