gitlint icon indicating copy to clipboard operation
gitlint copied to clipboard

Regex match vs search

Open scop opened this issue 3 years ago • 3 comments

Some rules' regular expressions are used with .search, some with .match, and the only source which one it is appears to be the gitlint source.

search: title-match-regex, body-match-regex match: author-valid-email, ignore-by-title, ignore-by-body, ignore-body-lines, ignore-by-author-name

I think it would be good to standardize on just one of these for all regex matching. I suggest going with .search, because .match is a Python oddity not present in any other language I know of. (It's kind of ironic and unfortunate that the rules that have match in their name are the ones for which we do not invoke match, but search.)

Then again, standardizing on one would be a breaking change. If that's not acceptable or will take some time, I think we should document which method is used with each rule. I'm available to help with that or making the search/match change (if we're going with search ;)), let me know.

scop avatar Dec 23 '21 12:12 scop

I think we can alternatively be smart for backwards compat until we can break it and issue deprecation log warnings until that expected release while giving folks an additional option to allow them to try the new behaviour

sigmavirus24 avatar Dec 23 '21 14:12 sigmavirus24

Good catch, I agree with the suggestion to standardize on search.

Wrt deprecation strategy:

  • I like the idea of a global feature flag to enable the new regex behavior (e.g. enable_regex_style_search=true, let me know if you have a better suggestion)
  • We don't have any case today where gitlint logs warnings. Should be easy enough though. To state perhaps the obvious, this should only be logged for users who are actually using these rules and the warning should disappear when the enable_regex_style_search flag is set.

jorisroovers avatar Feb 10 '22 20:02 jorisroovers

I wanted to open a new issue, but it seems this thread is the right one.

Today, I am a bit frustrated with "title-must-not-contain-word".

What I want to achieve:

  • discard commit messages like "Apply \d+ suggestion(s) to + file(s)". This is the default message when applying changes in gitlab merge request.

I don't know what is the best way to achieve this:

  • extend title-must-not-contain-word to support regex
  • add a new rule

I do not want to play with negative "title-match-regex" for this "simple" discard.


Edit: reading more carefully the documentation, it seems that I may achieve this with a named "title-match-regex".

jreybert avatar May 20 '22 09:05 jreybert

This was implemented in #345 which will go out with 0.18.0 soonish.

Implemented behavior:

  • Rules this applies to: author-valid-email, ignore-by-title, ignore-by-body, ignore-body-lines, ignore-by-author-name

  • If users haven’t explicitly set a custom regex for these rules, no regex matching is done and no warning is logged.

  • The only exception is author-valid-email which by default verifies the validity of the specified email address. It now uses re.search when the default regex is used, and re.match when a custom regex is used.

  • If users have set a custom regex, we will warn once for each rule that uses a custom regex and still use re.match semantics. Notice that this means that if users lint multiple commits at once using --commits we will still only print the warning for each rule only once.

  • If users specify general.regex-style-search=true, we switch to using re.search semantics and no longer print a warning.

    [general]
    regex-style-search=true
    
  • This is the printed warning:

    M1 - author-valid-email: gitlint will be switching from using Python regex 'match' (match beginning) to 'search' (match anywhere) semantics. Please review your author-valid-email.regex option accordingly. To remove this warning, set general.regex-style-search=True. More details: https://jorisroovers.github.io/gitlint/configuration/#regex-style-search
    
  • Docs have been updated accordingly (will be published with the release)

The general.regex-style-search flag will be enabled by default in a future gitlint release.

jorisroovers avatar Oct 24 '22 09:10 jorisroovers

Released as part of 0.18.0, closing!

jorisroovers avatar Nov 17 '22 10:11 jorisroovers