test262 icon indicating copy to clipboard operation
test262 copied to clipboard

Fix Regex Character Class Escape Tests

Open Aurele-Barriere opened this issue 1 year ago • 9 comments

Character class escape tests for \d, \D, \s, \S, \w, \W are not testing what they should.

Consider the escape \d. We should test two things:

  • That it matches ALL digits (positive cases)
  • That it matches NONE of the other code points (negative cases).

This is not what the auto-generated tests are currently doing. character-class-digit-class-escape.js only checks that \d finds a match in "0123456789". And character-class-digit-class-escape-plus-quantifier.js only checks that \d+ finds a match in "0123456789".

First, these two tests are equivalent. But also, they do not check that \d matches all digits, only one. And they never check that non-digits are not matched. It seems that an implementation where \d matches only "3" and "z" would pass all the current tests.

I suggest the following:

  • check positive cases by testing ^\d+$ (with anchors) on the string "0123456789".
  • check negative cases by testing the absence of any match of \d on the string that contains all code points except digits.
  • do the same for each character class escape.

Thanks to Noé De Santo @Ef55 for realizing that these tests were probably not what they were intended to be, while working on the Warblre project.

Aurele-Barriere avatar Aug 14 '24 12:08 Aurele-Barriere

cc @mathiasbynens, you might be interested in this.

ptomato avatar Aug 14 '24 15:08 ptomato

CC @leobalter as the original author of these tests.

Thanks for kicking this off! Since these are generated tests, could you please update the source script that generates these files rather than editing them directly? It would be easier to review that patch.

mathiasbynens avatar Aug 15 '24 03:08 mathiasbynens

I'll give it a try in the next few days!

Aurele-Barriere avatar Aug 15 '24 12:08 Aurele-Barriere

I no longer have the code on any of my machines, the diff information points out to https://github.com/bocoup/test262-regexp-generator. I hope that helps.

leobalter avatar Aug 16 '24 01:08 leobalter

Hi, It seems like the tests currently in Test262 are not the same as the tests of the generator. Currently, the tests in the generator check that, for instance, str.replace(/\d/g,'') === str.replace(/|0-9]/g,'') where str is the string of all code points. I think these generator tests are fine: if \d matched too many things, then the string on the left side of the equality would be shorter. And if \d didn't match enough characters, it would be longer.

So I believe these generator tests correctly check positive and negative cases! I still think that the tests for the plus quantifier are redundant, but one could argue that it's just an extra sanity check.

It seems to me like the reason for the differences between the two sets of tests is due to this commit, optimizing the tests by removing this str.replace comparison.

I see two ways forward:

  • We could simply revert the optimization commit. The tests would be then agree with the generator. But probably this is too slow.
  • I could modify the generator to produce the tests that I submitted in this PR. These tests do not do any long string comparison nor use str.replace, so maybe this is fast enough?

What do you think?

Aurele-Barriere avatar Aug 21 '24 09:08 Aurele-Barriere

I could modify the generator to produce the tests that I submitted in this PR. These tests do not do any long string comparison nor use str.replace, so maybe this is fast enough?

IMHO this is the way to go.

mathiasbynens avatar Aug 21 '24 10:08 mathiasbynens

Hello, I've created a pull request here: https://github.com/bocoup/test262-regexp-generator/pull/2

To generate the correct strings, I've used the regenerate.js file from this repo: https://github.com/mathiasbynens/unicode-property-escapes-tests/

I've also added a configuration with the v flag.

Best, Aurèle

Aurele-Barriere avatar Sep 10 '24 16:09 Aurele-Barriere

We are considering copying bocoup/test262-regexp-generator into this repo. I'm not sure Bocoup is still maintaining it.

ptomato avatar Sep 25 '24 17:09 ptomato

Makes sense! Let me know and I can update this PR when the generator is copied here.

Aurele-Barriere avatar Sep 27 '24 08:09 Aurele-Barriere

I've copied and updated the generator script in https://github.com/tc39/test262/pull/4303. I had to make some modifications to it, to implement some changes that had been made directly to the test files and weren't reflected in the script. So any extra review is appreciated!

ptomato avatar Nov 02 '24 01:11 ptomato

@Aurele-Barriere The generator script has landed in test262 now, so you can bring the improvements to the generator script into this PR. I'm afraid it will need a certain amount of rebasing, but at least now you can be certain that the script output matches the tests in the repo.

ptomato avatar Nov 12 '24 01:11 ptomato

@Aurele-Barriere The generator script has landed in test262 now, so you can bring the improvements to the generator script into this PR. I'm afraid it will need a certain amount of rebasing, but at least now you can be certain that the script output matches the tests in the repo.

Great news thanks, I'll give it a try soon!

Aurele-Barriere avatar Nov 12 '24 15:11 Aurele-Barriere

Hi @ptomato I ported my changes to the test generator (from https://github.com/bocoup/test262-regexp-generator/pull/2) to the one now in Test262 I didn't realize sycing my repo would close this PR automatically. So I opened a new one here https://github.com/tc39/test262/pull/4364

Aurele-Barriere avatar Dec 22 '24 19:12 Aurele-Barriere