ack3 icon indicating copy to clipboard operation
ack3 copied to clipboard

Problem with '--color-match' when using '*' or '?' (asterisk or question mark)

Open biburepo opened this issue 4 years ago • 5 comments

Take this simple file "test.txt" :

1 file found?
2 files found?
24 files found?
No files found.

And this simple query :

ack 's ' --color-match='bold rgb500 on_rgb550' 'test.txt'

which gives me line 2/3/4 from "test.txt" with "s " nicely colored bold-red on light-yellow.

Next query :

ack '(s )+' --color-match='bold rgb500 on_rgb550' 'test.txt'

the same, OK.

But :

ack '(s )*' --color-match='bold rgb500 on_rgb550' 'test.txt'

gives me 4 lines (as expected) BUT NO COLORS

Same with :

ack '(s )?' --color-match='bold rgb500 on_rgb550' 'test.txt'

NOTE With egrep all queries work as expected. (setting : export GREP_COLORS='ms=1;38;5;196;48;5;227')

egrep 's ' test.txt
egrep '(s )+' test.txt
egrep '(s )*' test.txt
egrep '(s )?' test.txt

BTW RipGrep (rg) works as expected. same as egrep. However Silver Searcher (ag) has even more complications, but this is not the place to dive deeper in that area.

biburepo avatar Jul 04 '21 15:07 biburepo

I get the same results, including on branch dev HEAD.

Additionally, ack '(?:s ){0,1}' --color-match='bold rgb500 on_rgb550' 'test.txt' does not colorize but ack '(?:s ){1,1}' --color-match='bold rgb500 on_rgb550' 'test.txt' does colorize, which i would infer means its the fact of optionality and not the syntax of the optionality nor the capturing of the parens that is interfering with highlighting/colorizing.

I'd have expected Greedy-optional ?+ or *+ to have colorized/highlighted, but no it doesn't.

Also, it's not --color-match specific; normal default highlighting fails with pattern that can match 0 chars even when it matches more. Compare ack '(?:s)? ' --color-match='bold rgb500 on_rgb550' 'test.txt' # colors ack '(?:s)?' --color-match='bold rgb500 on_rgb550' 'test.txt'# no colors

Hence, I infer (hypothesize) that what is killing the color/highlight is that the pattern can match 0 characters, so 0 are what are colorized?

Perhaps the highlighting code's use of the Pattern needs to be forced greedy if it ends with an optional quantifier that is not explicitly greedy or non-greedy quantifier-modifier. Ugh, that'd probably require walking the compiled RE AST to do correctly.

n1vux avatar Jul 04 '21 16:07 n1vux

This is arguably (almost) correct behavior. an RE that can match any line by matching 0 or more characters arguably should not only display the entirety of every file but highlight exactly 0 characters on each.

What would be more correct is if did highlight the longest match of an explicitly greedy RE that could still match 0 chars. Failing to do that is still a bug even if the original example is correct behavior.

n1vux avatar Jul 04 '21 17:07 n1vux

alternative hypothesis refuted -

It is not the case that a null sequence has been (almost correctly) VT100 highlighted; there is not an empty highlighted string ^[[30;43m^[[0m sequence in the output.

n1vux avatar Jul 04 '21 17:07 n1vux

Let's simplify this a bit:

$ ack 's*' test.txt
1 file found?
2 files found?
24 files found?
No files found.

None of the results lines are colored. We would expect that on the first line where there is no s to match. The others are more puzzling.

petdance avatar Jul 06 '21 13:07 petdance

Is it possible the zero-width first match is triggering last if $match_length <= 0;

$ ack 's*' test.txt /dev/null --column
test.txt
1:1:1 file found?
2:1:2 files found?
3:1:24 files found?
4:1:No files found.

jakebman avatar Mar 28 '23 23:03 jakebman