message-format-wg

Disallow whitespace as the first character of a reserved-body in a reserved-statement.

Open bhaible opened this issue 1 year ago • 1 comment

The 'reserved-statement' nonterminal is ambiguous when more than one whitespace character separates the 'reserved-keyword' from the first non-whitespace character of the 'reserved-body': each of these whitespace characters can be parsed either as part of the 's' nonterminal or as part of the 'reserved-body' nonterminal.

According to the principles explained in #725 and the proposed resolution of #721, a 'reserved-body' should not start with a whitespace character; such whitespace is meant to be interpreted as part of the preceding 's' nonterminal.

Test case:

.regex   /foo/{xyz}{{hello}}

This patch removes the ambiguity by disallowing whitespace as the first character of a 'reserved-body' in a 'reserved-statement'.

It thus fixes the first part of #721.
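To make the ambiguity in the test case concrete, here is a minimal Python sketch. It is not the working group's parser or grammar: the function name `candidate_splits`, the flag `body_may_start_with_ws`, and the whitespace set are assumptions made for illustration, and the text after the whitespace run is treated as one opaque remainder rather than being parsed into a 'reserved-body' and expressions.

```python
# Assumption: these characters approximate what the 's' nonterminal matches.
WHITESPACE = " \t\r\n\u3000"


def candidate_splits(statement, keyword, body_may_start_with_ws):
    """List every (s, remainder) split of the text that follows the keyword.

    's' must be at least one whitespace character. When
    body_may_start_with_ws is False, splits that leave whitespace at the
    start of the remainder are rejected.
    """
    if not statement.startswith(keyword):
        raise ValueError("statement does not start with the keyword")
    rest = statement[len(keyword):]
    # Length of the whitespace run directly after the keyword.
    ws_run = len(rest) - len(rest.lstrip(WHITESPACE))
    splits = []
    for n in range(1, ws_run + 1):
        s, remainder = rest[:n], rest[n:]
        starts_with_ws = remainder != "" and remainder[0] in WHITESPACE
        if starts_with_ws and not body_may_start_with_ws:
            continue
        splits.append((s, remainder))
    return splits


test_case = ".regex   /foo/{xyz}{{hello}}"
# Leading whitespace allowed in the body: three ways to divide the run.
ambiguous = candidate_splits(test_case, ".regex", True)
# Leading whitespace disallowed: only one division survives.
unambiguous = candidate_splits(test_case, ".regex", False)
```

With three spaces after `.regex`, the permissive variant yields three candidate splits, while the restricted variant yields exactly one, in which all three spaces belong to 's'.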

Details:

  • In the other occurrences of 'reserved-body' (in a 'reserved-annotation' or 'private-use-annotation'), the leading whitespace is split off as well. This does not change the set of inputs that the 'reserved-annotation' and 'private-use-annotation' nonterminals can match, but it highlights that the parser should trim this leading whitespace in these places before entering the 'reserved-body' into the data model.
  • Two nonterminals, 'reserved-body-start' and 'reserved-body-part', are introduced, each referenced once. The purpose is clarity and consistency with the common *-start / *-part idiom.
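The trimming mentioned in the first bullet could look roughly like the following Python sketch. The helper `split_annotation`, its return shape, and the whitespace set are hypothetical; they only assume that an annotation's source text begins with a single sigil character followed by optional whitespace and the body.

```python
# Assumption: these characters approximate what the 's' nonterminal matches.
WHITESPACE = " \t\r\n\u3000"


def split_annotation(source):
    """Split a hypothetical annotation source into (sigil, body).

    Leading whitespace after the sigil is dropped, so the body stored in
    the data model never starts with whitespace. Trailing whitespace is
    left untouched.
    """
    if not source:
        raise ValueError("empty annotation")
    sigil, rest = source[0], source[1:]
    return sigil, rest.lstrip(WHITESPACE)


annotation = split_annotation("^   foo bar")
```

Here the three spaces after the sigil are discarded and the body is stored as `foo bar`.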

bhaible avatar Mar 15 '24 16:03 bhaible

I think the group is settled on this one over #730. Let's close #730 and concentrate on finishing this one off.

Adding a note to syntax.md is probably useful.

aphillips avatar Mar 18 '24 23:03 aphillips

The simplification suggested by @eemeli is now included.

bhaible avatar Mar 25 '24 15:03 bhaible