Semantic Comments Proposal: Variables
This is part of the series of proposals spun out of the meta issue #16.
Variables
One of the core elements of the Fluent ecosystem is a model of passing a set of variables from the developer to the localizer, enabling the localizer to augment their translation.
Establishing a semantic way of describing the variables available to the localizer would help localization tools, checks and localizers themselves better understand the context in which they operate.
It would be especially beneficial for the Rich Editor in Pontoon, which could provide better UX when operating on variables, and for checks, which could raise warnings on misuse of the variables.
Variables can be considered part of #139, but they're such a core feature that I'd like to discuss them separately for two reasons:
First, being the most common and omnipresent example of semantic information, they should drive the whole conversation, and we should design the meta-information model specifically taking this use case into account over others.
Secondly, because they're so dominant, we may want to consider separating them out of meta information and providing further syntax sugar to make them easier to work with.
So, as part of meta-information they could be represented as:
# @param $value (Number) - Value of the unit (for example: 4.6, 500)
# @param $unit (String) - Name of the unit (for example: "bytes", "KB")
sitedata-total-size = Your stored cookies, site data and cache are currently using { $value } { $unit } of disk space.
but since we know that the $ sigil denotes a variable, we may want to provide additional syntax sugar to reduce the visual clutter:
# $value (Number) - Value of the unit (for example: 4.6, 500)
# $unit (String) - Name of the unit (for example: "bytes", "KB")
sitedata-total-size = Your stored cookies, site data and cache are currently using { $value } { $unit } of disk space.
It then could be visually represented in Pontoon as:
[image]
and linters would be able to determine use and misuse of variables and warn about unused variables if needed.
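To make the linter idea concrete, here is a minimal sketch of such a check. It assumes the comment shapes proposed above (`# $name (Type) - description`, optionally with `@param`), and it uses regular expressions rather than a real Fluent parser. The function name `lint_message` and both patterns are hypothetical, not part of any existing Fluent tooling.

```python
import re

# Hypothetical declaration shape, following the proposal above:
#   # $name (Type) - description      (sugared form)
#   # @param $name (Type) - description
DECL_RE = re.compile(r"^#\s*(?:@param\s+)?\$([A-Za-z][\w-]*)\s*\((\w+)\)")
# Any variable reference in the message value, e.g. { $value }.
REF_RE = re.compile(r"\$([A-Za-z][\w-]*)")


def lint_message(comment_lines, message_line):
    """Return (unused, undeclared) variable names for a single message."""
    declared = {}
    for line in comment_lines:
        match = DECL_RE.match(line)
        if match:
            declared[match.group(1)] = match.group(2)
    # Only look at the value, i.e. everything after the first "=".
    _, _, value = message_line.partition("=")
    used = set(REF_RE.findall(value))
    unused = sorted(set(declared) - used)
    undeclared = sorted(used - set(declared))
    return unused, undeclared
```

A declared variable that never appears in the message ends up in the first list, and a variable used without a declaration ends up in the second, which is enough for a tool to raise the two kinds of warnings mentioned above.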
@zbraniecki @stasm
I'd love to see this, and it would be especially helpful for elm-fluent - which has recently reached a stage where it is usable in production.
elm-fluent might also be interesting to you as:
- Another implementation you can add to https://projectfluent.org/
- An example of an implementation in a very strongly, statically typed language.
elm-fluent works differently from most other implementations as it operates as a compiler that converts .ftl to .elm. To make the most of the Elm type system, all variables (i.e. arguments passed in to messages) are strongly typed. This makes it a very robust system - it's impossible to forget to pass a variable, for instance.
However, the static typing is a challenge in some cases, because the Fluent spec kind of assumes dynamic typing. For numbers, elm-fluent is able to infer a numeric type if NUMBER() is used (obviously), or if a select expression using numerics or plural form categories is used. Without those, however, it can't infer it and has to assume a string variable.
It would be nice if we could explicitly state the type in a semantic comment.
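The inference rules described above, with an explicit declaration taking priority, can be sketched roughly as follows. This is not elm-fluent's actual implementation: it is a regex-based approximation, and `infer_type` and its `declared` parameter are hypothetical names used only for illustration.

```python
import re

# CLDR plural category names that can appear as variant keys.
PLURAL_CATEGORIES = {"zero", "one", "two", "few", "many", "other"}
NUMERIC_KEY_RE = re.compile(r"^-?\d+(\.\d+)?$")
VARIANT_KEY_RE = re.compile(r"\[\s*([^\]\s]+)\s*\]")


def infer_type(message, name, declared=None):
    """Guess "Number" or "String" for variable `name` in one message."""
    # An explicit type from a semantic comment wins over any inference.
    if declared is not None:
        return declared
    var = re.escape("$" + name)
    # Rule 1: the variable is passed to NUMBER().
    if re.search(r"NUMBER\(\s*" + var + r"\b", message):
        return "Number"
    # Rule 2: the variable is a selector whose variant keys are all
    # numeric literals or plural categories.
    selector = re.search(var + r"\s*->(.*)", message, re.DOTALL)
    if selector:
        keys = VARIANT_KEY_RE.findall(selector.group(1))
        if keys and all(
            k in PLURAL_CATEGORIES or NUMERIC_KEY_RE.match(k) for k in keys
        ):
            return "Number"
    # Otherwise nothing can be inferred, so fall back to a string.
    return "String"
```

The last branch is the case an explicit type in a semantic comment would cover: a plain `{ $value }` placeable gives the tool nothing to infer from.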
In addition, there are other ways in which I'd love to be able to use Elm's static typing. For example, some variables are essentially enums, e.g. a user gender might be male, female or other, and used in a select expression. Rather than passing these as strings, in elm-fluent it would make sense if we could generate or refer to an enum type. In the context of an Elm project, this would make it impossible for the developer using the messages to pass an invalid option. Perhaps semantic comments for variable types could enable this.
For example, adapting your suggestion, you could do:
@param $gender (String male|female|other)
Implementations that don't benefit from enums will interpret this as just a string, optionally validating the value. elm-fluent would generate or somehow locate an enum type that can literally only be male, female or other.
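As a rough illustration of what a tool could do with such a comment, the sketch below parses the `@param $gender (String male|female|other)` spelling suggested above and builds an enum from the listed options. It is written in Python rather than Elm, and `parse_param` and `make_enum` are hypothetical helpers, not part of elm-fluent or any Fluent implementation.

```python
import re
from enum import Enum

# Hypothetical grammar: @param $name (BaseType opt1|opt2|...)
# The option list after the base type is optional.
PARAM_RE = re.compile(
    r"@param\s+\$([A-Za-z][\w-]*)\s*\(\s*(\w+)(?:\s+([^)]+))?\)"
)


def parse_param(comment):
    """Return (name, base_type, options) or None if nothing matches."""
    match = PARAM_RE.search(comment)
    if match is None:
        return None
    name, base_type, options = match.groups()
    values = [o.strip() for o in options.split("|")] if options else []
    return name, base_type, values


def make_enum(name, values):
    """Build an Enum whose members are exactly the listed options."""
    members = {v.upper().replace("-", "_"): v for v in values}
    return Enum(name.capitalize(), members)
```

With the example comment, `make_enum` yields a `Gender` type with three members, and looking up any other string raises `ValueError`. An implementation that does not benefit from enums could ignore the option list and keep only the base type.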
+1 to the proposal and the strongly typed tooling use case.
Enum syntax bikeshed: TypeScript has string literal types (e.g. "one" is a type), and also type unions. So, you can write a string-only enum like "male" | "female" | "other". I think that makes good prior art for the enum spelling:
$gender ("male" | "female" | "other")
That starts opening the door on the any type discussion. Is this ok?
$numberish (String | Number) a string or a number
Or, maybe that should be spelled with a dedicated keyword:
$numberish (any) a string or a number
Or, maybe the user can omit the type to get any-like behavior.
$numberish a string or a number
My vote: even though my heart is in the strictly, statically typed world, Fluent's origins are weakly typed. Let the user omit type info, and the implicit behavior will be any.
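To show how the spellings above could coexist, here is a small sketch that parses a union type expression and treats a missing type as any. The comment grammar is only what this thread proposes, and `parse_variable` and `accepts` are hypothetical names for illustration.

```python
import re

# Hypothetical sugared form: $name (TypeA | TypeB) description
# The leading "#" and the parenthesised type are both optional.
SUGAR_RE = re.compile(r"^#?\s*\$([A-Za-z][\w-]*)\s*(?:\(([^)]*)\))?")


def parse_variable(comment):
    """Return (name, union_members); an omitted type means ["any"]."""
    match = SUGAR_RE.match(comment)
    if match is None:
        return None
    name, type_expr = match.groups()
    if type_expr is None:
        return name, ["any"]
    return name, [part.strip() for part in type_expr.split("|")]


def accepts(members, value):
    """Check a runtime value against the members of a union."""
    for member in members:
        if member == "any":
            return True
        if member == "String" and isinstance(value, str):
            return True
        if (member == "Number" and isinstance(value, (int, float))
                and not isinstance(value, bool)):
            return True
        # A quoted member is a string literal type, e.g. "male".
        if member.startswith('"') and member.endswith('"'):
            if value == member[1:-1]:
                return True
    return False
```

Under this reading, `(String | Number)`, `(any)` and an omitted type all accept both strings and numbers, while a union of string literals accepts only the listed values.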
@spookylukey: Hey Luke! After a period of pondering the idea of compiling Fluent to statically typed modules (mainly TypeScript), I decided to look for prior art and found your project. It's been a good source of inspiration! Anyway, on topic:
> In addition, there are other ways which I'd love to be able to use Elm's static typing. For example, some variables are essentially enums e.g. a user gender might be male, female or other, and used in a select expression. Rather than passing these as strings, in elm-fluent it would make sense if we could generate or refer to an enum type. In the context of an Elm project, this would make it impossible for the developer using the messages to pass an invalid option. Perhaps semantic comments for variable types could enable this.
Could you not infer the enum types from the variants themselves? So rather than copying the variants to the semantic comment, you'd generate it based on the variants in the selector. Then the type in the comment could either be Enum, or even let that be inferred from the fact that it's used as a selector rather than a placeable.
The set of variants can vary between languages, but I imagine you could just take the set of all languages' enum options when exposing the top-level module.
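A minimal sketch of that merging step, assuming each locale's FTL source is available as a string and that select expressions close with a brace on its own line. It uses regular expressions instead of a real Fluent parser, and `selector_variants` and `merged_variants` are hypothetical names.

```python
import re

VARIANT_KEY_RE = re.compile(r"\[\s*([^\]\s]+)\s*\]")


def selector_variants(ftl_source, variable):
    """Collect variant keys of every select expression on `variable`."""
    # Assumption: the select expression ends with "}" on its own line.
    pattern = re.compile(
        r"\{\s*" + re.escape("$" + variable) + r"\s*->(.*?)\n\s*\}",
        re.DOTALL,
    )
    keys = set()
    for body in pattern.findall(ftl_source):
        keys.update(VARIANT_KEY_RE.findall(body))
    return keys


def merged_variants(sources_by_locale, variable):
    """Union of the variant keys used for `variable` across all locales."""
    merged = set()
    for source in sources_by_locale.values():
        merged |= selector_variants(source, variable)
    return sorted(merged)
```

The merged set only becomes complete once every locale's file exists, so the generated type depends on the state of the translations.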
Am I missing something? Are you thinking more in terms of communication from developers to localizers, while I'm thinking more in terms of communication from localizers to developers?
@miniyou Parsing that info from the FTL would be very complicated because not all languages will have the same variants. You would need to parse and combine results from every one of your languages for it to be meaningful, and even then it would be an awkward workflow to require finished translations before your app understands its own variable types. I'm not sure semantic comments make the workflow less awkward, but this isn't an easy problem.
I think this discussion would be a great fit for https://discourse.mozilla.org/c/fluent. Mostly, because it's actually a discussion, and also because I am confident that variables are just a small part of the problem you're facing, and I'd rather not derail this issue more.