perl5
"Reference to nonexistent or unclosed group in regex" with \g{-1}
Description
m/(?:\g{-1}|(ab?))+/ fails to compile and terminates with "Reference to nonexistent or unclosed group in regex".
Steps to Reproduce
\g{1} vs \g{-1}:
perl -wle ' print "\$&:[$&],\$1:[$1]" if "aab" =~ m/(?:\g{1}|(ab?))+/ '
perl-5.38.2
==========
$&:[aa],$1:[a]
perl -wle ' print "\$&:[$&],\$1:[$1]" if "aab" =~ m/(?:\g{-1}|(ab?))+/ '
perl-5.38.2
==========
Reference to nonexistent or unclosed group in regex; marked by <-- HERE in m/(?:\g{- <-- HERE 1}|(ab?))+/ at -e line 1.
Command terminated with non-zero status.
Here is an example of a working regex that contains \g{-1} inside its own group (modified from https://github.com/Perl/perl5/issues/10073 ):
perlbrew exec perl -wle 'print "not ok" if "xa=xaaa" =~ /^(xa|=?\g{-1}a){2}$/'
perl-5.38.2
==========
perl-5.36.0
==========
not ok
(...same output as with \g{1})
Expected behavior
m/(?:\g{-1}|(ab?))+/ - should it work the same as with \g{1}?
I don't understand this report. Why do you expect \g{-1} to work the same as \g{1}?
I didn't read the documentation carefully, and thought that \g{-1} could refer to a group that comes after it, just as it can refer to its own group from inside it, while that group is not yet closed.
But perlre documentation states:
...
\g-1 and \g{-1} both refer to the immediately preceding capture group
Still, when \g{-1} sits inside its group's parentheses, as in m/(\g{-1})/, it works fine, even though the group does not fully precede it; only the opening parenthesis does.
Fine?:
perl -wle 'print "[$&][$1]" if "xxaab" =~ m/(x)(\g{-1}|ab?)+/'
perl-5.38.2
==========
[xaa][x]
... someone may think that \g{-1} should repeat (x), because it's the last group which fully precedes, i.e. both its opening and closing parentheses come before \g{-1}.
Anyway, the documentation demonstrates an example with \g{-3}, where even the second-to-last group isn't closed yet (only its opening parenthesis precedes).
And it's great that non-relative (absolute) backreferences can refer to groups that appear later in the regex.
Either I'm missing something subtle here, or this is expected behaviour. \g{-1} refers to a previous group, but in the example m/(?:\g{-1}|...)/ there is no preceding group (there in fact is no subgroup at all), so there is nothing for \g{-1} to refer to. So I'm not sure where the alleged bug is.
Yah, this is expected behavior to me. Relative captures were added before we supported named captures, and are intended as a workaround so you can compose snippets of pattern without knowing how many capture buffers there might be in a pattern.