"Reference to nonexistent or unclosed group in regex" with \g{-1}

Open rsFalse opened this issue 1 year ago • 4 comments

Description

m/(?:\g{-1}|(ab?))+/ fails to compile: perl dies with "Reference to nonexistent or unclosed group in regex".

Steps to Reproduce \g{1} vs \g{-1}:

perl -wle ' print "\$&:[$&],\$1:[$1]" if "aab" =~ m/(?:\g{1}|(ab?))+/ '
perl-5.38.2
==========
$&:[aa],$1:[a]
perl -wle ' print "\$&:[$&],\$1:[$1]" if "aab" =~ m/(?:\g{-1}|(ab?))+/ '
perl-5.38.2
==========
Reference to nonexistent or unclosed group in regex; marked by <-- HERE in m/(?:\g{- <-- HERE 1}|(ab?))+/ at -e line 1.
Command terminated with non-zero status.

Here is an example of a working regex that contains \g{-1} inside its own group (modified from https://github.com/Perl/perl5/issues/10073 ):

perlbrew exec perl -wle 'print "not ok" if "xa=xaaa" =~ /^(xa|=?\g{-1}a){2}$/'
perl-5.38.2
==========

perl-5.36.0
==========
not ok

(...same output as with \g{1})

Expected behavior

m/(?:\g{-1}|(ab?))+/ - should it work the same as with \g{1}?

Jan 07 '24 00:01 rsFalse

I don't understand this report. Why do you expect \g{-1} to work the same as \g{1}?

Jan 07 '24 08:01 mauke

> I don't understand this report. Why do you expect \g{-1} to work the same as \g{1}?

I didn't read the documentation carefully, and thought that \g{-1} could refer to a group that comes after it, just as it can refer to its own group from inside, while that group is not yet closed. But the perlre documentation states:

> ... \g-1 and \g{-1} both refer to the immediately preceding capture group

Anyway, when \g{-1} is inside the group's own parentheses, as in m/(\g{-1})/, it works fine, even though the group does not fully precede it; only the opening parenthesis does. Is that intended?

perl -wle 'print "[$&][$1]" if "xxaab" =~ m/(x)(\g{-1}|ab?)+/'
perl-5.38.2
==========
[xaa][x]

... someone may think that \g{-1} should repeat (x), because it's the last group that fully precedes it, i.e. both its opening and closing parentheses come before \g{-1}. Anyway, the documentation demonstrates an example with \g{-3} in which even the second-to-last group isn't closed yet (only its opening parenthesis precedes). And it's great that absolute (non-relative) backreferences can refer to groups that appear later in the regex.

Jan 07 '24 12:01 rsFalse

Either I'm missing something subtle here, or this is expected behaviour. \g{-1} refers to a previous group, but in the example m/(?:\g{-1}|...)/ there is no preceding group (there is in fact no subgroup at all), so there is nothing for \g{-1} to refer to. So I'm not sure where the alleged bug is.

Mar 14 '24 17:03 leonerd

> So I'm not sure where the alleged bug is

Yah, this is expected behavior to me. Relative captures were added before we supported named captures, and are intended as a workaround so you can compose snippets of pattern without knowing how many capture buffers there might be in a pattern.

Mar 14 '24 17:03 demerphq