
Add spec text for RegExp Modifiers

Open rbuckton opened this issue 2 years ago • 6 comments

This adds the specification text for the Stage 3 RegExp Modifiers proposal.

Test262 tests can be found at https://github.com/tc39/test262/pull/3960.

rbuckton avatar Nov 15 '23 21:11 rbuckton

> Is there a reason we use the RegularExpressionFlags production and restrict it with an early error instead of introducing a new, more restricted production?

Is there a reason not to? Modifiers are a strict subset of RegularExpressionFlags, and RegularExpressionFlags itself is not restricted in the grammar, only via semantics. If I were to create a RegularExpressionModifiers production, it would have the same definition as RegularExpressionFlags.

rbuckton avatar Dec 15 '23 22:12 rbuckton

I believe RegularExpressionFlags is not restricted in the grammar because RegularExpressionLiteral needs to not change when we add new flags. This is due to the overlap with division. We don't have that same constraint with the modifiers grammar though, so it should be able to be done with grammar restrictions and not early errors.
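
An illustrative aside, not part of the PR: the sketch below shows the overlap with division mentioned above. The same characters `/x/g` read as two divisions in one context and as a regular expression literal in another, and an unknown flag is rejected as a SyntaxError after the flags have been scanned rather than by the lexical grammar. The helper name `evaluate` and the sample values are assumptions made for this example.

```javascript
// Hypothetical helper: evaluate a source snippet with `x` and `g` in scope.
function evaluate(source) {
  return new Function("x", "g", `return (${source});`)(2, 4);
}

// After an operand, `/x/g` is two divisions: (16 / x) / g.
const asDivision = evaluate("16 /x/g");

// At the start of an expression, the same characters form a RegExp literal.
const asRegExp = evaluate("/x/g");

// An unsupported flag still scans as part of the literal; it is rejected
// afterwards as a SyntaxError.
let unknownFlagError = null;
try {
  evaluate("/x/q");
} catch (err) {
  unknownFlagError = err;
}

console.log(asDivision);                              // 2
console.log(asRegExp instanceof RegExp);              // true
console.log(unknownFlagError instanceof SyntaxError); // true
```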

michaelficarra avatar Dec 16 '23 00:12 michaelficarra

> I believe RegularExpressionFlags is not restricted in the grammar because RegularExpressionLiteral needs to not change when we add new flags. This is due to the overlap with division. We don't have that same constraint with the modifiers grammar though, so it should be able to be done with grammar restrictions and not early errors.

How would you propose it be written so as not to require early errors? Modifiers can appear in any order, cannot be duplicated within one or both of the modifier locations, and I do plan to extend the set of allowed modifiers over time with things like x-mode, so it needs to be fairly flexible. Even if we limit it to a subset of characters like i, m, and s for now, we either need an early error for duplicates, or a complex grammar like:

RegularExpressionModifiers:
  RegularExpressionModifierChars[+IgnoreCase, +Multiline, +DotAll]

RegularExpressionModifierChars[IgnoreCase, Multiline, DotAll]:
  [empty]
  [+IgnoreCase] `i` RegularExpressionModifierChars[~IgnoreCase, ?Multiline, ?DotAll]?
  [+Multiline] `m` RegularExpressionModifierChars[?IgnoreCase, ~Multiline, ?DotAll]?
  [+DotAll] `s` RegularExpressionModifierChars[?IgnoreCase, ?Multiline, ~DotAll]?

The more modifiers we add, the more production parameters we need, and the more complex the production becomes. Plus, this doesn't avoid the need for an early error to forbid (?i-i:).
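
As a rough illustration of the point above (an editorial sketch, not spec text or part of the PR), the parameterized grammar can be mirrored by a small recursive recognizer. The set `allowed` stands in for the production parameters, and all function names here are hypothetical. Because each modifier location is matched on its own, the recognizer accepts `i` and `i` as a pair, so a separate cross-location check is still needed for `(?i-i:)`.

```javascript
// Mirrors RegularExpressionModifierChars[IgnoreCase, Multiline, DotAll].
// A character may be consumed only while its "parameter" is still present.
function matchModifierChars(text, allowed = new Set(["i", "m", "s"])) {
  if (text === "") return true;          // the [empty] alternative
  const ch = text[0];
  if (!allowed.has(ch)) return false;    // a guard such as [+IgnoreCase] fails
  const rest = new Set(allowed);
  rest.delete(ch);                       // e.g. ~IgnoreCase for the tail
  return matchModifierChars(text.slice(1), rest);
}

// Each location in (?add-remove:...) is matched independently.
function matchesGrammar(add, remove) {
  return matchModifierChars(add) && matchModifierChars(remove);
}

// The cross-location rule cannot be expressed by the grammar above.
function hasCrossLocationDuplicate(add, remove) {
  return [...add].some((ch) => remove.includes(ch));
}

console.log(matchModifierChars("smi"));          // true: any order
console.log(matchModifierChars("ii"));           // false: duplicate in one location
console.log(matchesGrammar("i", "i"));           // true: the grammar alone accepts it
console.log(hasCrossLocationDuplicate("i", "i")); // true: caught by the extra check
```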

rbuckton avatar Dec 16 '23 01:12 rbuckton

@rbuckton I'm not saying we can't use any early errors at all on the production, just that we don't need to use early errors to enforce the allowed flag characters when an alternative grammar could do the job. I'm happy to continue using early errors to check for duplicates.

michaelficarra avatar Dec 19 '23 18:12 michaelficarra

Now that both V8 and SpiderMonkey are shipping regex modifiers, I'm hoping to request Stage 4 at the October plenary. As such, I'm working on updating this PR so that it is in a mergeable state.

@michaelficarra do you still want me to refactor modifiers to something other than RegularExpressionFlags?

rbuckton avatar Sep 26 '24 21:09 rbuckton

@michaelficarra I went ahead and added a RegularExpressionModifiers production that just restricts the allowed characters to ims, and continues to use an early error to check for duplicates.
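
A minimal sketch of that division of labor, offered as an editorial illustration and not as the PR's actual spec text: a character-class test stands in for the restricted production, and a separate pass stands in for the early error on duplicates. The names `MODIFIER_CHARS` and `checkModifiers`, and the shape of the result object, are assumptions made for this example.

```javascript
// Stand-in for the restricted production: only i, m and s are allowed.
const MODIFIER_CHARS = /^[ims]*$/;

// `add` and `remove` are the two modifier locations in (?add-remove:...).
function checkModifiers(add, remove = "") {
  // Step 1: grammar-level restriction on the allowed characters.
  if (!MODIFIER_CHARS.test(add) || !MODIFIER_CHARS.test(remove)) {
    return { ok: false, reason: "grammar" };
  }
  // Step 2: early-error-style check for duplicates within or across locations.
  const seen = new Set();
  for (const ch of add + remove) {
    if (seen.has(ch)) return { ok: false, reason: "early-error" };
    seen.add(ch);
  }
  return { ok: true, reason: null };
}

console.log(checkModifiers("im", "s")); // { ok: true, reason: null }
console.log(checkModifiers("g"));       // { ok: false, reason: 'grammar' }
console.log(checkModifiers("i", "i"));  // { ok: false, reason: 'early-error' }
```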

rbuckton avatar Sep 26 '24 23:09 rbuckton

Outside of updating from main, is there anything preventing this from being merged now that it has Stage 4?

rbuckton avatar Jan 08 '25 19:01 rbuckton

@rbuckton It's waiting on editorial review from @syg.

michaelficarra avatar Jan 08 '25 19:01 michaelficarra