Merge PluralRules into NumberFormat (formatSelect)

Open sffc opened this issue 5 years ago • 17 comments

Time and time again, programmers are confused about how to use Intl.PluralRules, especially in ways that relate to rendered digits, like how to take the plural form of 1 versus 1.00 versus 1K.

In the ICU implementation, to solve this problem, we allow users to pass a FormattedNumber, the output of NumberFormatter, into PluralRules.

Here's a draft of how this could look in ECMAScript:

const fmt = new Intl.NumberFormat("fr-FR", {
    notation: "compact"
});
const { string, pluralForm } = fmt.formatSelect(2.5e6);
console.log(string, pluralForm);
// "2,5 M" many

Intl.PluralRules would still be useful for the case where you don't care about the rendered output, but the new API on Intl.NumberFormat would help clarify how to get the effective plural form for a formatted number.

The new APIs:

  • formatSelect returns { string, pluralForm }
  • formatToPartsSelect (or formatSelectToParts) returns { parts, pluralForm }

Thoughts?

@zbraniecki @echeran @longlho

sffc avatar Jan 10 '20 03:01 sffc

Could we instead offer an ability to pass a NumberFormat instance to PluralRules.select?

const nf = new Intl.NumberFormat("fr-FR", {
    notation: "compact"
});
const pr = new Intl.PluralRules("fr-FR");
pr.select(2.5e6, nf); // select using the number formatted with `nf` options?

zbraniecki avatar Jan 10 '20 03:01 zbraniecki

The current pattern for doing this is to pass the NumberFormat options into the PluralRules constructor:

const nf = new Intl.NumberFormat("fr-FR", {
    notation: "compact"
});
const pr = new Intl.PluralRules("fr-FR", nf.resolvedOptions());

I was thinking that putting sugar methods on NumberFormat might make it easier to use. It would also make it more efficient because implementation-wise, you only format the number once, and then you compute both the string and the plural form at the same time.
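
A minimal sketch of that pattern with today's API, using en-US and one fixed fraction digit (both chosen here for illustration). Only the fraction-digit settings are copied across, on the assumption that those are what affect the result in this case:

```javascript
// Sketch: reuse the formatter's resolved digit settings for plural selection.
const nf = new Intl.NumberFormat("en-US", {
  minimumFractionDigits: 1,
  maximumFractionDigits: 1,
});
const { minimumFractionDigits, maximumFractionDigits } = nf.resolvedOptions();

// Plural rules created without the digit settings...
const plainRules = new Intl.PluralRules("en-US");
// ...and plural rules that mirror what the formatter will render.
const matchedRules = new Intl.PluralRules("en-US", {
  minimumFractionDigits,
  maximumFractionDigits,
});

console.log(nf.format(1));           // "1.0"
console.log(plainRules.select(1));   // "one"
console.log(matchedRules.select(1)); // "other"
```

The two PluralRules instances disagree on the same input because only the second one knows that the rendered number has a visible fraction digit.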

sffc avatar Jan 10 '20 03:01 sffc

In ICU, we first get back a FormattedNumber from NumberFormat as an intermediate output, then we get its string representation and/or use it to select the plural rule. Do we not have that in ES (only have a string output), and thus want to consolidate the cognitive overhead of the APIs?

If so, I think the idea makes sense. We don't seem to really create custom plural rules -- we take whatever comes by default from CLDR, which means selecting a plural rule has the same input data as what it takes to format a number. And I assume that this proposal just solves the case where you want both; otherwise, you can reuse existing APIs.

On closer look at the current way to create plural rules, it does seem a little wonky when compared to the ICU way of doing things. But I think that matters right now to the extent that we have large use cases of plural rules selection only (w/o formatting) vs. formatting (w/o plural rules selection) or both.

echeran avatar Jan 10 '20 04:01 echeran

Clarifying the issue for potential readers... There are two problems:

  • In some languages, the plural form may vary depending on whether the number is treated as an integer or a decimal. For example, in Macedonian (mk) "1" takes the one plural form, but "1.0" takes the other plural form. [1]
  • In all languages, the formatted/displayed number may differ from the actual number (because of rounding and notation), so the plural form may change. For example, suppose the actual number is 1.0005 (other plural form in English) but it is formatted with zero fraction digits as "1" (one plural form in English). For compact notation, suppose the actual number is 1000 and the displayed number is 1K. @sffc, I can't think of an example where the plural form would differ between the two (I am thinking of "1000 likes" vs "1K likes"). Do you have a handy example we can use to illustrate? Thanks

[1]:

new Intl.PluralRules("mk").select(1)
// > "one"
new Intl.PluralRules("mk", {minimumFractionDigits: 1}).select(1)
// > "other"
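
The second problem (rounding) can be reproduced with the existing API as well. A small sketch, using 1.2 in en-US as an illustrative value that is not taken from the thread:

```javascript
// Sketch: the plural category of the raw value vs. the rounded, displayed value.
const value = 1.2;

// A formatter that rounds to whole numbers, so 1.2 is displayed as "1".
const wholeNumbers = new Intl.NumberFormat("en-US", { maximumFractionDigits: 0 });
console.log(wholeNumbers.format(value)); // "1"

// Selecting on the raw value ignores how the number will be displayed.
const rawCategory = new Intl.PluralRules("en-US").select(value);
console.log(rawCategory); // "other"

// Passing the same rounding option makes selection match the displayed "1".
const displayedCategory = new Intl.PluralRules("en-US", {
  maximumFractionDigits: 0,
}).select(value);
console.log(displayedCategory); // "one"
```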

rxaviers avatar Jan 10 '20 13:01 rxaviers

2.3e6 in fr-FR: "2 300 000 vues" (plural form "other")

But when compact notation is used: "2,3 millions de vues" (plural form "many")

sffc avatar Jan 10 '20 14:01 sffc

I'm trying to figure out the use case for this and so far off the top of my head it'd be useful for debugging. What are your anticipated use cases?

I think right now the confusion, at least for me, primarily comes from the implicit fraction/significant digits resolution within NumberFormat, e.g. ILD currency digits info that changes the default fraction digits.

The other thing to consider is plural within ICU MessageFormat as well, e.g.

{count, plural, one{# book} other{# books}}

With this API, the signal seems to be that you should use NumberFormat.formatSelect so the plural form is consistent with the rendered output in #. But then if we have

{count, plural, one{book} other{books}}

(no #, so no rendered number), then what should we do in that scenario?

longlho avatar Jan 11 '20 01:01 longlho

formatSelect does not add any new functionality; it just makes the existing functionality more discoverable, understandable, and efficient. The use cases are the same as those for Intl.PluralRules.

sffc avatar Jan 13 '20 13:01 sffc

I plan to address this as part of my new proposal Intl.NumberFormat V3.

https://github.com/sffc/proposal-intl-numberformat-v3

sffc avatar Mar 21 '20 01:03 sffc

@sffc I still don't see any references to libraries or software that would need this feature. It seems quite insufficiently justified so far. Can you provide sources of why and who would need that?

zbraniecki avatar Mar 21 '20 04:03 zbraniecki

This isn't a feature; it's a refactoring of an existing feature. You can refer back to the PluralRules proposal for the full list of use cases.

In message formatting, you generally want both the number and the plural form of the number. Right now you have to use two different Intl classes, which is unintuitive, clunky, and inefficient. (Do you need justification on those three adjectives?) This proposal means you can get both the formatted number and the plural form in one function call, which I claim is more ergonomic and efficient.

sffc avatar Mar 21 '20 05:03 sffc

Do you need justification on those three adjectives?

I would like to see an example of a library or software where this problem is exemplified.

I am a co-author of a localization system that uses both Intl.PluralRules and Intl.NumberFormat and I have not observed that problem nor do I see how it would apply to my system.

Therefore I'm curious what other cases exist which exemplify the problem you're addressing. Saying "very often engineers encounter..." or "time and time again users are confused..." is only valuable if you can point at examples of where they're confused or where they encountered it.

My issue is that I have not seen anything that would validate those claims.

zbraniecki avatar Mar 21 '20 07:03 zbraniecki

Unintuitive: Previous discussions regarding confusion over Intl.PluralRules behavior: #373, #365, https://github.com/tc39/proposal-unified-intl-numberformat/issues/86. I have also seen users simply unaware that fraction digit settings need to be passed to Intl.PluralRules in order to get correct behavior (which led to issues such as ICU-20617). For example, the following code is incorrect, even in English, but to most non-i18n experts, it looks perfectly plausible:

function howManyStars(locale, count, strings) {
  const nf = new Intl.NumberFormat(locale, {
    minimumFractionDigits: 1,
    maximumFractionDigits: 1,
  });
  const pr = new Intl.PluralRules(locale);
  return `${nf.format(count)} ${strings[pr.select(count)]}`;
}

howManyStars("en-US", 2, { one: "star", other: "stars" });
// Correct: "2.0 stars"

howManyStars("en-US", 1, { one: "star", other: "stars" })
// Incorrect: "1.0 star"

The following doesn't work either, because .select() converts its argument to a number, which drops the trailing zeros:

const pr = new Intl.PluralRules("en-US");
pr.select("1.0");  // "one", but should be "other"

Clunky and Inefficient: The above function could be re-implemented in a safer, more efficient way by using formatSelect, as follows:

function howManyStars(locale, count, strings) {
  const nf = new Intl.NumberFormat(locale, {
    minimumFractionDigits: 1,
    maximumFractionDigits: 1,
  });
  const result = nf.formatSelect(count);
  return `${result.string} ${strings[result.pluralForm]}`;
}

I see the plural form as being fundamentally tied to the formatted string. In my opinion, as an i18n engineer who has worked with clients trying to implement plural selection correctly, the model of plural selection having its own class that neither accepts nor produces a formatted string is simply wrong, and it leads to bugs such as the ones listed above.
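
For comparison, the proposed behavior can be approximated with today's API. This is only a sketch: the free function formatSelect(nf, value) below is a hypothetical helper, not part of ECMA-402 or of the proposal text. It mirrors only the fraction-digit settings, so it does not cover notation or significant digits, and it still formats and selects in two separate steps.

```javascript
// Hypothetical helper emulating the proposed method; an assumption for
// illustration, not an existing Intl API.
function formatSelect(nf, value) {
  // Reuse the formatter's resolved locale and fraction-digit settings.
  const { locale, minimumFractionDigits, maximumFractionDigits } =
    nf.resolvedOptions();
  const pr = new Intl.PluralRules(locale, {
    minimumFractionDigits,
    maximumFractionDigits,
  });
  return { string: nf.format(value), pluralForm: pr.select(value) };
}

// howManyStars rewritten on top of the helper.
function howManyStars(locale, count, strings) {
  const starFormat = new Intl.NumberFormat(locale, {
    minimumFractionDigits: 1,
    maximumFractionDigits: 1,
  });
  const result = formatSelect(starFormat, count);
  return `${result.string} ${strings[result.pluralForm]}`;
}

const oneStar = howManyStars("en-US", 1, { one: "star", other: "stars" });
console.log(oneStar); // "1.0 stars"
```

A built-in method could compute both values from a single formatting pass, which is the efficiency argument made above; the helper only demonstrates the result shape.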


All that said, I appreciate the criticism from other i18n experts in this thread. It could be that my mental model of plural selection isn't correct. I am fine pulling formatSelect from my NumberFormat v3 proposal if we don't have consensus on it.

sffc avatar Mar 23 '20 21:03 sffc

I agree w/ @zbraniecki. I'm not sure this is needed as a top-level API; it may be enough to have PluralRules & NumberFormat share more underlying abstract operations.

longlho avatar Mar 24 '20 21:03 longlho

@sffc what would you say to a selectPluralCategory method on NumberFormat instead? This way the only surface increase is that NF may be used to get the plural category, just like PluralRules can be.

zbraniecki avatar Mar 25 '20 02:03 zbraniecki

Bikeshed:

  1. Intl.NumberFormat.prototype.formatSelect returning { string, pluralForm }
    • Pros: All features in one place; easy to use correctly; works nicely with formatRange
    • Cons: Doubles number of terminal methods, from 4 to 8 (including formatToParts and formatRange); return value should be a value type, but Records are still only Stage 1
  2. Intl.NumberFormat.prototype.selectPluralCategory returning a string pluralForm
    • Pros: Simple, straightforward addition
    • Cons: Two function calls, reducing potential performance benefit of a single call
  3. Intl.NumberFormat.prototype.getPluralRules returning an Intl.PluralRules
    • Pros: Clean separation of functionality; Intl.PluralRules remains a first-class construction
    • Cons: No performance benefit over the status quo
  4. Intl.PluralRules.from taking an Intl.NumberFormat as an argument
    • Pros/Cons: Same as above

sffc avatar Apr 02 '20 07:04 sffc

We decided in the 2020-04-23 meeting to table this issue, because none of the proposed options solve the problem completely. We will still require documentation, even if we add new methods. I filed a ticket to follow up on the documentation:

https://github.com/tc39/ecma402-mdn/issues/13

sffc avatar May 22 '20 07:05 sffc

@eemeli has a proposal that fixes this for string inputs: https://github.com/eemeli/proposal-intl-keep-trailing-zeros

sffc avatar May 09 '25 00:05 sffc