
[FEEDBACK] Isolating quoted patterns on the outside adds a lookahead to the syntax

Open eemeli opened this issue 1 year ago • 3 comments

An observation from implementing bidi isolation as proposed in #781, but which also applies to the currently proposed design for bidi usability:

Isolating quoted patterns on the outside adds LRI, RLI & FSI to the set of characters (currently { and .) that could start a quoted message with no declarations, as in \u2066{{hello}}\u2069.

This doesn't make the syntax ambiguous as the {{ isn't valid in a simple-message, but it does add a lookahead of one token to the parser.

The same lookahead is also required in variant, to determine whether a \u2066 starts a quoted key, or a quoted-pattern.
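For the message-start case, the extra lookahead can be sketched roughly as follows. This is a simplified illustration, not the working group's parser: the function name `starts_complex_message` is hypothetical, whitespace handling is omitted, and only the isolate-then-`{{` case named above is covered.

```python
OPEN_ISOLATES = {"\u2066", "\u2067", "\u2068"}  # LRI, RLI, FSI


def starts_complex_message(src: str) -> bool:
    """Decide whether `src` begins a complex message rather than a simple one.

    Assumes isolates are placed outside the pattern quotes and that there is
    no leading whitespace (both are simplifications for illustration).
    """
    # Without isolates, the first character(s) settle the question.
    if src.startswith("{{") or src.startswith("."):
        return True
    if src[:1] in OPEN_ISOLATES:
        # The isolate alone is not decisive: it may begin ordinary text in a
        # simple message, so the parser has to peek at what follows it.
        return src.startswith("{{", 1)
    return False
```

Here `starts_complex_message("\u2066{{hello}}\u2069")` is true while `starts_complex_message("\u2066hello\u2069")` is false, and the two inputs only diverge after the isolate has been read.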

The simplest change to avoid this lookahead would probably be to place the open-isolate and close-isolate between the braces, as in {\u2066{hello}\u2069}. In this position, it would also match what's proposed for expression and markup.

eemeli avatar May 13 '24 08:05 eemeli

Putting the isolate between the pattern quotes would mean that there are two sequences for opening/closing. And it is harder for tools to insert (or remove) the isolates. It's a cognitive burden on everyone, although admittedly it's clever.

Note that the isolates (unless inside of a literal) are ignorable and can be stripped from the message.

aphillips avatar May 13 '24 15:05 aphillips

> Putting the isolate between the pattern quotes would mean that there are two sequences for opening/closing.

This is also the case with isolates outside the quotes. The current proposal has:

  • {{, \u2066{{, \u2067{{, or \u2068{{ for opening and
  • }} or }}\u2069 for closing the pattern;

I'm suggesting that we instead use

  • {{, {\u2066{, {\u2067{, or {\u2068{ for opening and
  • }} or }\u2069} for closing the pattern.
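To make the two sets of delimiters concrete, they can be written as regular expressions. This is only a sketch for comparing the sequences listed above; the constant names are made up for illustration, and the expressions are not meant as a way to parse messages.

```python
import re

# Current proposal: the isolate sits outside the braces.
OUTSIDE_OPEN = re.compile(r"[\u2066-\u2068]?\{\{")   # {{ or LRI/RLI/FSI + {{
OUTSIDE_CLOSE = re.compile(r"\}\}\u2069?")           # }} or }} + PDI

# Suggested alternative: the isolate sits between the braces.
INSIDE_OPEN = re.compile(r"\{[\u2066-\u2068]?\{")    # {{ or { + LRI/RLI/FSI + {
INSIDE_CLOSE = re.compile(r"\}\u2069?\}")            # }} or } + PDI + }
```

In both designs the plain `{{` and `}}` remain valid, and each design has one optional isolate slot per delimiter. The difference is only whether the first character consumed is the isolate or the brace.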

> And it is harder for tools to insert (or remove) the isolates.

Both solutions are just as easy or hard to deal with. As MF2 may include e.g. |{{}}| as a valid quoted literal, a proper MF2 parser is required to apply any such changes.
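A small illustration of why plain text substitution is not enough, regardless of where the isolates go. The helper `naive_isolate` is hypothetical and deliberately wrong; it stands in for any tool that edits the source without parsing it.

```python
LRI, PDI = "\u2066", "\u2069"


def naive_isolate(src: str) -> str:
    """Wrap every `{{` ... `}}` in LRI/PDI by blind text substitution."""
    return src.replace("{{", LRI + "{{").replace("}}", "}}" + PDI)


# A message whose declaration contains the quoted literal |{{}}|.
message = ".local $x = {|{{}}|}\n{{Hello}}"
result = naive_isolate(message)

# The quoted pattern is wrapped as intended...
wrapped_pattern_ok = result.endswith(LRI + "{{Hello}}" + PDI)
# ...but the literal's value has been changed as well.
literal_corrupted = ("|" + LRI + "{{}}" + PDI + "|") in result
```

Both flags end up true: the pattern is isolated, but the literal now carries isolate characters that were never part of its value. Avoiding that requires knowing which `{{` is a pattern quote, which is what a real parser provides.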

eemeli avatar May 13 '24 16:05 eemeli

I think the difference is (especially if we make the pairing optional!) that the open and close isolates can just be ignored in the current design. With optional pairing, we can push the isolate characters back into the s production. Anyway, let's discuss.
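One possible reading of "push the isolate characters back into the s production" is that unpaired isolates are skipped in the same places whitespace is. The sketch below assumes that reading; the character sets and function names are illustrative and not taken from the grammar.

```python
# Assumed for illustration: characters treated as skippable between tokens.
WHITESPACE = {" ", "\t", "\r", "\n", "\u3000"}
ISOLATES = {"\u2066", "\u2067", "\u2068", "\u2069"}  # LRI, RLI, FSI, PDI
SKIPPABLE = WHITESPACE | ISOLATES


def skip_ignorable(src: str, pos: int = 0) -> int:
    """Return the index of the first character at or after `pos` that is
    neither whitespace nor an isolate."""
    while pos < len(src) and src[pos] in SKIPPABLE:
        pos += 1
    return pos


def starts_quoted_pattern(src: str) -> bool:
    """Check for `{{` after skipping any whitespace and isolates."""
    return src.startswith("{{", skip_ignorable(src))
```

Under this assumption the isolates need no pairing logic of their own: `starts_quoted_pattern("\u2066{{hello}}\u2069")` is true because the leading isolate is consumed like whitespace before the `{{` check.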

aphillips avatar May 13 '24 16:05 aphillips

In #854, we accepted having this lookahead in the syntax.

eemeli avatar Sep 03 '24 16:09 eemeli