Multiple --not-match-f override each other, instead of being applied additively

Open includesec-erik opened this issue 1 year ago • 5 comments

Describe the bug When running the following command (or any similar command with more than one --not-match-f), the expectation is that all files whose basename ends with _test.go or _proto.go are excluded from cloc's counts. In practice, cloc seems to honor only a single --not-match-f (I believe the last one stated) as a filter instead of all of them.

cloc . --not-match-f=".*\_proto.go" --not-match-f=".*\_test.go"

cloc; OS; OS version

  • cloc version: 1.96
  • Perl version: v5.14.2
  • Ubuntu Linux

To Reproduce See the comment on this YouTube video for a repro: https://www.youtube.com/watch?v=eRLTkDMsCqs

Expected result All --not-match-f filters are applied by cloc, instead of only one.

Thanks for considering this, Al. Perhaps cloc's default behavior for this command line option could be changed to apply the filters additively, instead of respecting only a single filter? Apparently this unexpected behavior has been around a while!

BTW, does this also apply to the --match-f, --match-d, and --not-match-d command line options?

includesec-erik avatar May 09 '23 03:05 includesec-erik

None of the --match-* or --not-match-* switches may be repeated. I didn't see the need since a single regex can handle multiple cases. Your two --not-match-f cases can be condensed to

cloc . --not-match-f=".*\_(proto|test).go"
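To illustrate why the single pattern covers both cases, here is a minimal Python sketch. It is not cloc's Perl code: it assumes each pattern is tested against the file's basename with an unanchored search, the basenames are made up, and the `\_` from the commands above is written as a plain `_` (both match a literal underscore).

```python
import re

# Hypothetical basenames, chosen only for illustration.
basenames = ["server.go", "server_test.go", "api_proto.go", "main.go", "util_test.go"]

# The two patterns from the original command, and the condensed alternation.
separate = [r".*_proto.go", r".*_test.go"]
combined = r".*_(proto|test).go"


def kept_additive(names, patterns):
    """Keep names that match none of the patterns (additive exclusion)."""
    return [n for n in names if not any(re.search(p, n) for p in patterns)]


def kept_single(names, pattern):
    """Keep names that do not match the single pattern."""
    return [n for n in names if not re.search(pattern, n)]


print(kept_additive(basenames, separate))
print(kept_single(basenames, combined))
```

Both calls keep the same files for this sample. Note that the unescaped `.` before `go` matches any character; a stricter variant such as `_(proto|test)\.go$` would pin the match to the literal extension at the end of the name.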

I'm sure I'm overlooking situations where multiple copies of --not-match-f really are necessary. If you can describe such a use case I'll update the code to accommodate it.

AlDanial avatar May 11 '23 02:05 AlDanial

Hi @AlDanial, thanks for the reply! Given your info, I'd categorize this as an enhancement request issue, not a bug.

You're correct in stating that all possible matches can be thought of and specified in a single regex, thanks for pointing that out.

I would say, though, that for users who are less experienced with regex, or when I'm trying to explain to another party how to use cloc over email or a phone call, it is tremendously simpler to use multiple parameters to build a list of filters. From what I've seen working with other tech professionals who use other command line tools, an additive list of filters is a commonly expected pattern that works in other tools (Tokei, for instance).
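As a rough illustration of the additive pattern described here, the sketch below collects a repeated option into a list and excludes a file if any pattern matches. It uses Python's argparse purely as an example; the tool name `count-sketch` and the helper `is_excluded` are hypothetical, and this is not how cloc (a Perl program) implements its options.

```python
import argparse
import re


def build_parser():
    # "count-sketch" is a hypothetical tool name used only for this example.
    parser = argparse.ArgumentParser(prog="count-sketch")
    # action="append" gathers every occurrence of the option into one list,
    # instead of letting the last occurrence override the earlier ones.
    parser.add_argument(
        "--not-match-f",
        action="append",
        default=[],
        metavar="REGEX",
        help="exclude files whose basename matches REGEX (repeatable)",
    )
    return parser


def is_excluded(basename, patterns):
    """A file is excluded if ANY of the collected patterns matches it."""
    return any(re.search(p, basename) for p in patterns)


# Simulate a command line with the option given twice.
args = build_parser().parse_args(
    ["--not-match-f", r"_proto\.go$", "--not-match-f", r"_test\.go$"]
)

for name in ["main.go", "api_proto.go", "server_test.go"]:
    status = "excluded" if is_excluded(name, args.not_match_f) else "counted"
    print(f"{name}: {status}")
```

With this design, each additional `--not-match-f` only narrows the set of counted files, so a user can build the filter list one simple pattern at a time.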

I totally understand if implementing this behavior change is a big ask and you'd rather decline this enhancement request, but if it is a smaller ask, please consider it! Thank you.

includesec-erik avatar May 11 '23 03:05 includesec-erik

It's not a big ask, and I'm familiar with additive options (cloc's --force-lang and --script-lang can be specified multiple times). Still, the request will need to go on the back burner until I finish #722 (which will take me some time to implement cleanly).

AlDanial avatar May 11 '23 04:05 AlDanial

Sounds good, @AlDanial. Fight the good fight against Text::Glob!

includesec-erik avatar May 11 '23 05:05 includesec-erik

I've begun work on this; try the latest commit to kick the tires on additive --not-match-f and --not-match-d

AlDanial avatar May 27 '23 02:05 AlDanial

@AlDanial I think you implemented this and released it in 2023, right? Should we close the issue, since --not-match-f and --not-match-d are now additive?

includesec-erik avatar Jul 25 '24 06:07 includesec-erik

An oversight! Yes, the fix was made more than a year ago. Always happy to close an issue.

AlDanial avatar Jul 26 '24 04:07 AlDanial