eslint-plugin-regexp icon indicating copy to clipboard operation
eslint-plugin-regexp copied to clipboard

`regexp/strict` and `regexp/match-any` conflict with `unicorn/better-regex`

Open JounQin opened this issue 1 year ago • 12 comments

Information:

  • ESLint version: v8.19.0
  • eslint-plugin-regexp version: 1.7.0

Description

/\[([^[\]]+)]/g

error from this plugin

Unescaped source character ']'

After fixed:

/\[([^[\]]+)\]/g

error from unicorn/better-regex

/\[([^[\]]+)\]/g can be optimized to /\[([^[\]]+)]/g

JounQin avatar Jul 04 '22 04:07 JounQin

/{{([\S\s]+?)}}/g

regexp/match-any is reported

/{{([\s\S]+?)}}/g

unicorn/better-regex is reported

JounQin avatar Jul 04 '22 04:07 JounQin

I think we should adapt https://github.com/SamVerschueren/clean-regexp and https://github.com/DmitrySoshnikov/regexp-tree like eslint-plugin-unicorn.

JounQin avatar Jul 04 '22 04:07 JounQin

I don't know what the unicorn/better-regex says is better, but I think theregexp/strict rule is working correctly.

/]/ is allowed by Annex B, but if we don't use Annex B (We describe it as strict), I think it's a regex that is strictly a syntax error.

https://ota-meshi.github.io/eslint-plugin-regexp/rules/strict.html

Also, the reported regex will result in a syntax error when trying to use the u flag.

ota-meshi avatar Jul 04 '22 05:07 ota-meshi

The regexp/match-any rule allows you to use options to choose your preferred notation.

https://ota-meshi.github.io/eslint-plugin-regexp/rules/match-any.html

ota-meshi avatar Jul 04 '22 05:07 ota-meshi

Raised https://github.com/sindresorhus/eslint-plugin-unicorn/issues/1852 to discuss.

I hope these two plugins can work better together by default, some rules can be extracted from unicorn to regexp instead?

JounQin avatar Jul 04 '22 05:07 JounQin

In my opinion, if you want to use unicorn/better-regex, recommend not to use eslint-plugin-regexp. If you want to use eslint-plugin-regexp, recommended not to use unicorn/better-regex.

unicorn/better-regex can be used without complicated options. However, detailed user preferences cannot be set. eslint-plugin-regexp has many options, so you can configure it to your preferences. I think they have different purposes.

ota-meshi avatar Jul 04 '22 05:07 ota-meshi

I think the default options should be best practices for most projects, so although these are two different plugins, but they can share why they choose different style, and what's the benefits and downsides, and then we can choose whether or not change the default behaviors.

JounQin avatar Jul 04 '22 05:07 JounQin

I don't know what the idea of unicorn/better-regex is to say "better". So I don't know if that makes sense. If we're not sure if that regexp is "better", it doesn't make sense to change the default behavior. Do you know about the "better" of unicorn/better-regex?

ota-meshi avatar Jul 04 '22 05:07 ota-meshi

Raised sindresorhus/eslint-plugin-unicorn#1852 to discuss.

I hope these two plugins can work better together by default, some rules can be extracted from unicorn to regexp instead?

That's why I raised the related issue for discussion with maintainers from eslint-plugin-unicorn, we can wait for a while for them?

JounQin avatar Jul 04 '22 05:07 JounQin

While I do not follow the development of eslint-plugin-unicorn, I dug up a bit of history regarding the better-regex rule.

The better-regex rule used to be called regex-shorthand until it was decided in https://github.com/sindresorhus/eslint-plugin-unicorn/issues/469 that better-regex is a better name. This was done with the intention of one day integrating the no-unsafe-regex rule into better-regex, although that has yet to happen.


I think we should adapt https://github.com/SamVerschueren/clean-regexp and https://github.com/DmitrySoshnikov/regexp-tree like eslint-plugin-unicorn.

Looking at how painfully simple clean-regex is implemented, I think we should exclude it from this discussion. The substitutions it does are already done (albeit in a more general way) by eslint-plugin-regexp anyway.

regexp-tree's optimizer is more interesting. The discussion should focus on that.


Going into the implementation details of regexp-tree's optimizer that cause the reported differences:

  1. /\[([^[\]]+)]/g: Ignoring special cases, the charEscapeUnescape transformation only allows these characters to be escaped.
  2. /{{([\S\s]+?)}}/g: This one is caused by character class sorting. More specifically, this line of code. Basically, meta classes (e.g. \s, \S, \d) are sorted by string-comparing their source code in byte order. Since ASCII "S" comes before ASCII "s", it will yield \S\s.

My thoughts on this:

  1. I want to point out that regexp-tree does not handle Unicode regexes all that well (https://github.com/DmitrySoshnikov/regexp-tree/issues/162). So the conflict with regexp/strict could simply be regexp-tree not considering Unicode regexes.
  2. Given that most people use \s\S, this might be a bug in the comparison function, as it goes against the optimizer's goal of using "idiomatic patterns."

RunDevelopment avatar Jul 04 '22 12:07 RunDevelopment

better-regex was dropped from unicorn

yakov116 avatar May 18 '23 12:05 yakov116

no... that's no-unsafe-regex

fisker avatar May 18 '23 12:05 fisker