
The offset selection is a lot clunkier than the MF1 one

Open mihnita opened this issue 3 months ago • 12 comments

These are some of the tests in the test/tests/functions/offset.json file

    {
      "src": ".local $x = {1 :offset add=1} .match $x 1 {{=1}} 2 {{=2}} * {{other}}",
      "exp": "=2"
    },
    {
      "src": ".local $x = {10 :integer} .local $y = {$x :offset subtract=6} .match $y 10 {{=10}} 4 {{=4}} * {{other}}",
      "exp": "=4"
    }

Let's take the first one and wrap it for readability:

.local $x = {1 :offset add=1}
.match $x
  1 {{=1}}
  2 {{=2}}
  * {{other}}

In MF1 the exact selection is done on the value without an offset applied; only the keyword comparisons apply the offset.

So the result would be "=1", not "=2".
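To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is not the ICU or MF2 implementation: the function names are hypothetical, the plural rule is a simplified English one for integers, and real MF2 selection involves more than a dictionary-style lookup.

```python
def plural_category_en(n: int) -> str:
    """Simplified English cardinal rule for integers (assumption for this sketch)."""
    return "one" if n == 1 else "other"


def select_mf1(value: int, offset: int, keys: list) -> str:
    """MF1-style plural with offset (hypothetical helper).

    '=N' keys are compared against the raw value;
    keyword keys are compared against the category of (value - offset).
    """
    exact = f"={value}"
    if exact in keys:
        return exact
    category = plural_category_en(value - offset)
    return category if category in keys else "other"


def select_after_offset(value: int, add: int, keys: list) -> str:
    """Behavior of the quoted tests (hypothetical helper).

    The offset is applied first, and both exact keys and keywords
    are compared against the shifted value.
    """
    shifted = value + add
    if str(shifted) in keys:
        return str(shifted)
    category = plural_category_en(shifted)
    return category if category in keys else "*"


# First test case: the input value is 1 and the keys are 1, 2, *.
print(select_mf1(1, 1, ["=1", "=2", "other"]))      # exact match on the raw value
print(select_after_offset(1, 1, ["1", "2", "*"]))   # exact match on the shifted value
```

Under these assumptions the MF1-style helper picks the `=1` variant, while the offset-first helper picks the `2` variant, which is the "=2" that the test expects.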

Now, one might argue that the test matches the :offset selection as currently described in "spec/functions/number.md".

I know I have explained more than once what the MF1 behavior is. I can try to track down whether this was ultimately decided by a vote in which I was overruled, or whether it was just a mistake or misunderstanding.

This is the example listed in the spec/functions/number.md file, "The :offset function" section:

.input {$like_count :integer}
.local $others_count = {$like_count :offset subtract=1}
.match $like_count $others_count
0 *   {{Your post has no likes.}}
1 *   {{{$name} liked your post.}}
* one {{{$name} and {$others_count} other user liked your post.}}
* *   {{{$name} and {$others_count} other users liked your post.}}

It works with the current spec, but it is more verbose and clunkier than MF1, which requires a single selector:

{like_count, plural, offset:1
  =0    {Your post has no likes.}
  =1    {{name} liked your post.}
  one   {{name} and # other user liked your post.}
  other {{name} and # other users liked your post.}
}

mihnita avatar Sep 16 '25 21:09 mihnita

I think the test should match exactly what the example in the spec is.

macchiati avatar Sep 16 '25 23:09 macchiati

It would be somewhat less clunky if the $others_count could be embedded and .input {$like_count :integer} omitted, eg

.match $like_count {$like_count :offset subtract=1}
0 *   {{Your post has no likes.}}
1 *   {{{$name} liked your post.}}
* one {{{$name} and {$others_count} other user liked your post.}}
* *   {{{$name} and {$others_count} other users liked your post.}}

macchiati avatar Sep 16 '25 23:09 macchiati

@mihnita Could you clarify if there's something you think we should be changing, and if so, what? As is, I'm having a hard time figuring out anything actionable in this issue.

It would be somewhat less clunky if the $others_count could be embedded and .input {$like_count :integer} omitted, eg

.match $like_count {$like_count :offset subtract=1}
0 *   {{Your post has no likes.}}
1 *   {{{$name} liked your post.}}
* one {{{$name} and {$others_count} other user liked your post.}}
* *   {{{$name} and {$others_count} other users liked your post.}}

@macchiati Note that the last two variants here use $others_count as a placeholder, which isn't defined in your example.

eemeli avatar Sep 17 '25 05:09 eemeli

Good point; there would be no easy way to de-clunkify. (And I remember there were long discussions about this; sorry for the red herring.)

macchiati avatar Sep 17 '25 23:09 macchiati

@mihnita Could you clarify if there's something you think we should be changing, and if so, what? As is, I'm having a hard time figuring out anything actionable in this issue.

I am proposing to make :offset do selection the exact same way plural-with-offset works in MF1.


This is not only about the current construct being "clunky", which can be in the eyes of the beholder.

But think what this means for localization. Most (all) localization tools and translators will have to "expand" both the :number value and :offset to all the CLDR plural keywords.

:offset needs expansion here because "{$others_count} other user(s)" must have it for grammatical correctness. And :number needs expansion in general, because it is the equivalent of a "regular plural".
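As a rough illustration of the cost, the following Python sketch counts the variants a tool would produce if it naively expanded every plural selector to all CLDR plural keywords. The `expand` helper is hypothetical, and the sketch assumes a locale such as Arabic that uses all six categories; it ignores exact-value keys such as 0 and 1, which would add further rows.

```python
from itertools import product

# The six CLDR plural keywords; Arabic cardinals use all of them.
CLDR_PLURAL_KEYWORDS = ["zero", "one", "two", "few", "many", "other"]


def expand(selector_count: int, categories: list) -> list:
    """Return every key combination for `selector_count` fully expanded selectors."""
    return list(product(categories, repeat=selector_count))


single = expand(1, CLDR_PLURAL_KEYWORDS)  # one selector, as in MF1 plural with offset
double = expand(2, CLDR_PLURAL_KEYWORDS)  # :number plus :offset, both expanded

print(len(single), len(double))
```

With one selector the translator sees 6 variants; with two fully expanded selectors that grows to 36, even though most of the combinations cannot occur when one value is derived from the other.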

We can propose some kind of complicated algorithm to detect a combination :number + :offset in the same message AND that the :offset is derived from the same value as :number to reduce the number of combinations (not fully expand :number).

But even now devs and l10n tools have problems with multiple plurals / selectors, and have problems differentiating between =1 and one (in MF1). Some will do:

zero *   {{Your post has no likes.}}
one  *   {{{$name} liked your post.}}
*    one {{{$name} and {$others_count} other user liked your post.}}
*    *   {{{$name} and {$others_count} other users liked your post.}}
*    *

Double-plurals are "pricey" enough that we banned them for MF1. And now we design something that makes them mandatory.

mihnita avatar Sep 20 '25 18:09 mihnita

Note: I am not arguing to make this change for LDML 48 (in case it was not obvious)

mihnita avatar Sep 20 '25 18:09 mihnita

Double-plurals are "pricey" enough that we banned them for MF1.

Actually, MF1 didn't. MF1 (when written correctly) uses nesting, which is difficult to write. The thing that you're probably thinking of is Android's getQuantityString.

My experience is that multiple plurals in messages are pretty common. They are, naturally, an order of magnitude less common than single-plural messages. But they are not uncommon. We had a long discussion of this when designing the selection model.

Note: I am not arguing to make this change for LDML 48 (in case it was not obvious)

We have a stability policy which probably precludes a number of the changes you're suggesting after LDML48 as well. Only things that are still draft can change. :offset is not draft. It's therefore stable. We can deprecate it and replace it with a different function or set of functions. But we can't make wholesale changes to it.

aphillips avatar Sep 21 '25 17:09 aphillips

Actually, MF1 didn't. MF1 (when written correctly) uses nesting, which is difficult to write.

I know that MF1 didn't. "We" Google did. Our lint forbids strings with "partial selection" ("...{foo,plural...} ... {bar,select,...}"), all selection "nested branches" must contain the full message, which makes them 100% equivalent to the multiple selection we have in MF2. And the lint also forbids double plurals.

The thing that you're probably thinking of is Android's getQuantityString.

No, I'm not thinking about that.

My experience is that multiple plurals in messages are pretty common. ... But they are not uncommon.

I agree that there are valid use cases for them. But they are still forbidden in our linter. And rare or not, now we make all offset messages, which used to be one selection, into double selection.

mihnita avatar Sep 21 '25 17:09 mihnita

:offset is not draft. It's therefore stable.

ACK. I tracked how that happened, but that's not the point here.

So that makes my proposal ("I am proposing to make :offset do selection the exact same way plural-with-offset works in MF1.") invalid.

But this is an issue. Not all issues must come with a proposed solution.

mihnita avatar Sep 21 '25 17:09 mihnita

Maybe a way to handle this is to change the recommended pattern:

.input {$like_count :integer}
.local $others_count = {$like_count :offset subtract=1}
.match $others_count
-1   {{Your post has no likes.}}
 0   {{{$name} liked your post.}}
 one {{{$name} and {$others_count} other user liked your post.}}
 *   {{{$name} and {$others_count} other users liked your post.}}

Would that work?

It is un-intuitive, and one would have to read the documentation. But the double selection + offset construct is also not intuitive and one needs to read the doc.

This would not change the spec on how :offset works. And the -1 form is not mandatory, one can use the double selection form if they want.
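A small Python sketch of how the single-selector pattern would resolve, assuming English plural rules and a simplified lookup in place of real MF2 matching (the `format_likes` function is hypothetical):

```python
def plural_category_en(n: int) -> str:
    """Simplified English cardinal rule for integers (assumption for this sketch)."""
    return "one" if n == 1 else "other"


def format_likes(like_count: int, name: str) -> str:
    """Select on $others_count only, using -1 and 0 as exact keys."""
    others_count = like_count - 1  # stands in for {$like_count :offset subtract=1}
    variants = {
        "-1": "Your post has no likes.",
        "0": f"{name} liked your post.",
        "one": f"{name} and {others_count} other user liked your post.",
        "*": f"{name} and {others_count} other users liked your post.",
    }
    # Exact keys win over plural keywords; '*' is the fallback.
    exact_key = str(others_count)
    if exact_key in variants:
        return variants[exact_key]
    return variants.get(plural_category_en(others_count), variants["*"])


for count in (0, 1, 2, 5):
    print(count, "->", format_likes(count, "Ana"))
```

The four rows cover the same cases as the two-selector message, at the price of the exact keys being written in terms of the offset value rather than the original count.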

Would such a change count as "changing a stable function in the spec"?

Because this example is in spec/functions/number.md under "The :offset function" section. But it is introduced with "For example, it can be used in a message such as this ..."

mihnita avatar Sep 21 '25 17:09 mihnita

My experience is that multiple plurals in messages are pretty common. ... But they are not uncommon.

I agree that there are valid use cases for them. But they are still forbidden in our linter. And rare or not, now we make all offset messages, which used to be one selection, into double selection.

One way of looking at this is that plural offset messages have always featured selection on two different numerical values, num and num-offset. If this should be considered ok while selection on multiple plurals in general is not, could not your linter be adjusted to allow for this special case, much like MF1 encoded this as a special case?

Maybe a way to handle this is to change the recommended pattern:

.input {$like_count :integer}
.local $others_count = {$like_count :offset subtract=1}
.match $others_count
-1   {{Your post has no likes.}}
 0   {{{$name} liked your post.}}
 one {{{$name} and {$others_count} other user liked your post.}}
 *   {{{$name} and {$others_count} other users liked your post.}}

Would that work?

While that would technically work, I would not recommend that approach. We do support multiple selectors, and this is a case where multiple selectors are warranted: The selection is being done on two different numbers.

One clarification we should make is adding a select=exact on the $like_count to help ensure that tools won't try to fill it out with plural categories:

.input {$like_count :integer select=exact}
.local $others_count = {$like_count :offset subtract=1}
.match $like_count $others_count
0 *   {{Your post has no likes.}}
1 *   {{{$name} liked your post.}}
* one {{{$name} and {$others_count} other user liked your post.}}
* *   {{{$name} and {$others_count} other users liked your post.}}

@mihnita Would that change also generalise decently well to your linter rules? You could continue to ban messages with multiple plural selectors, but the above message now only has one plural selector.

Another alternative here would be for ICU to define :icu:number or :icu:plural to behave with the exact same semantics as MF1 used, which would allow for using a single selector rather than two.

eemeli avatar Sep 22 '25 07:09 eemeli

plural offset messages have always featured selection on two different numerical values, num and num-offset

Well, but they didn't. I didn't have to write two nested selections, it was just one.

And I don't look at it that way. The selection was on one value.

For example if I design a library with an API like this:

someSelector(value1, value2)

But I tell my users "you must always, absolutely definitely, use this API like this":

value2 = offset(value1, delta)
someSelector(value1, value2)

Then it is on me to give them a helper method like this:

someSelector(value1, delta)
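In Python the analogy might look like the sketch below. All names are hypothetical and follow the pseudocode above; since Python has no overloading, the helper gets its own name, and the two-value selector just returns the pair of keys it would match on (with a simplified English plural rule).

```python
def plural_category_en(n: int) -> str:
    """Simplified English cardinal rule for integers (assumption for this sketch)."""
    return "one" if n == 1 else "other"


def offset(value: int, delta: int) -> int:
    """Derive the second value from the first."""
    return value - delta


def some_selector(value1: int, value2: int) -> tuple:
    """Two-value API: exact key from value1, plural keyword from value2."""
    return (str(value1), plural_category_en(value2))


def some_selector_with_offset(value1: int, delta: int) -> tuple:
    """Helper that performs the step every caller would otherwise repeat."""
    return some_selector(value1, offset(value1, delta))


print(some_selector_with_offset(2, 1))
```

The helper adds no new capability; it only removes the boilerplate that every caller of the two-value form would have to write.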

could not your linter be adjusted to allow for this special case, much like MF1 encoded this as a special case?

For that to work the linter would have to detect that the two variables we select on are "connected" by an offset.

And it is not only about the linter, all the l10n tools that expand plurals would need to understand that, and only expand the plural forms of the local variable with offset.

So it is added complexity in several tools. And the same complexity would be needed for all tools processing such messages, not only Google tools. We are just somewhat ahead of the curve because we supported such messages for a long time.
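To give a sense of what that detection involves, here is a deliberately naive Python sketch. It uses regular expressions instead of a real MF2 parser, assumes variable names match `\w+`, and only looks at `.local` declarations of the form `{$var :offset ...}`; the function name is hypothetical.

```python
import re

# .local $derived = {$source :offset ...}
LOCAL_OFFSET = re.compile(r"\.local\s+\$(\w+)\s*=\s*\{\s*\$(\w+)\s+:offset\b[^}]*\}")
# .match $a $b ...
MATCH_LINE = re.compile(r"\.match((?:\s+\$\w+)+)")


def offset_linked_selectors(message: str) -> set:
    """Return (source, derived) pairs where both variables are selectors
    and `derived` is declared as an :offset of `source`."""
    derived_from = {m.group(1): m.group(2) for m in LOCAL_OFFSET.finditer(message)}
    match = MATCH_LINE.search(message)
    if not match:
        return set()
    selectors = re.findall(r"\$(\w+)", match.group(1))
    return {
        (derived_from[s], s)
        for s in selectors
        if s in derived_from and derived_from[s] in selectors
    }


spec_example = """
.input {$like_count :integer}
.local $others_count = {$like_count :offset subtract=1}
.match $like_count $others_count
0 *   {{Your post has no likes.}}
* *   {{{$name} and {$others_count} other users liked your post.}}
"""

print(offset_linked_selectors(spec_example))
```

Even this toy version has to track declarations, selectors, and the link between them, and every linter and l10n tool that wants to avoid the full expansion would need an equivalent (and more robust) check.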


I did not open this ticket to complain about how to look at it. When you are forced to do something clunky, extra explanations don't make it friendlier to use.

In general my philosophy for libraries is: "if all my users of feature X have to jump through this convoluted step, it is my duty to make that step easier, even if that means writing some syntactic sugar APIs"

mihnita avatar Sep 22 '25 16:09 mihnita