perl5 pp_subst + COW leaks memory

Description Since v5.19.1, perl has leaked memory when built with COW and a successful match occurs.

Steps to Reproduce The following is based upon https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/regexredux-perl-1.html and takes the same input of FASTA string. Comments show the approximate memory usage at each step.

# approx 50 MB input file.

my $seq = do { local $/; <STDIN> };
# 5.16.3         approx 102 MB RAM used at this point
# 5.18.0         approx 102 MB RAM used at this point
# 5.19.1 & later approx 55 MB RAM used at this point

$seq =~ s/>.*\n//g;
# 5.16.3         102 MB used
# 5.18.4:        102 MB used
# 5.19.1 & later 102 MB used

$seq =~ s/t/<4>/g;
# 5.16.3         175 MB used
# 5.18.4         175 MB used
# 5.19.1 & later 176 MB used

$seq =~ s/DS/<3>/g;
# 5.16.3         198 MB used
# 5.18.4         198 MB used
# 5.19.1 & later 248 MB used

$seq =~ s/NS/<2>/g;
# 5.16.3         197 MB used
# 5.18.4         197 MB used
# 5.19.1 & later 321 MB used

$seq =~ s/2/1/g;
# 5.16.3         197 MB used
# 5.18.4         197 MB used
# 5.19.1 & later 395 MB used

$seq =~ s/3/1/g;
# etc etc

Expected behavior Memory use reaches a steady level (e.g. 100 - 200 MB here), rather than going up by approx the size of the input string with each successful match.

Perl configuration Any perl since v5.19.1 with COW enabled.

Aug 31 '22 20:08 richardleach

Stepping through pp_subst with the script above:

TARG is never SvIsCOW at the start of pp_subst (was_cow == 0)
force_on_match == 0
r_flags == 1
CALLREGEXEC(rx, orig, strend, orig, 0, TARG, NULL, r_flags) returns true. This function also sets SvIsCOW_on(TARG), and the cow ref count ==1
dstr does not get upgraded
Some of the subst patterns start down the in-place branch, but immediately jump to the got_a_cow label in the not-in-place branch, so basically the same path is followed for all patterns in the script above.
The do blocks occur, doing the actual substitutions (I think, didn't look too closely at that bit)
if (rpm->op_pmflags & PMf_NONDESTRUCT) { is not true, so we get to the block:

#ifdef PERL_ANY_COW
            /* The match may make the string COW. If so, brilliant, because
               that's just saved us one malloc, copy and free - the regexp has
               donated the old buffer, and we malloc an entirely new one, rather
               than the regexp malloc()ing a buffer and copying our original,
               only for us to throw it away here during the substitution.  */
            if (SvIsCOW(TARG)) {
                sv_force_normal_flags(TARG, SV_COW_DROP_PV);
            } else
#endif
            {
                SvPV_free(TARG);
            }
            SvPV_set(TARG, SvPVX(dstr));
            SvCUR_set(TARG, SvCUR(dstr));
            SvLEN_set(TARG, SvLEN(dstr));
            SvFLAGS(TARG) |= SvUTF8(dstr);
            SvPV_set(dstr, NULL);

sv_force_normal_flags() calls S_sv_uncow:

static void
S_sv_uncow(pTHX_ SV * const sv, const U32 flags)
{
    assert(SvIsCOW(sv));
    {
        const char * const pvx = SvPVX_const(sv);
        const STRLEN len = SvLEN(sv);
        const STRLEN cur = SvCUR(sv);
        const bool was_shared_hek = SvIsCOW_shared_hash(sv);
        SvIsCOW_off(sv);
        if (len) {
            /* Must do this first, since the CowREFCNT uses SvPVX and
            we need to write to CowREFCNT, or de-RO the whole buffer if we are
            the only owner left of the buffer. */
            sv_buf_to_rw(sv); /* NOOP if RO-ing not supported */
            {
                U8 cowrefcnt = CowREFCNT(sv);
                if(cowrefcnt != 0) {
                    cowrefcnt--;
                    CowREFCNT(sv) = cowrefcnt;
                    sv_buf_to_ro(sv);
                    goto copy_over;
                }
            }
            /* Else we are the only owner of the buffer. */
        }
        else
        {
            /* This SV doesn't own the buffer, so need to Newx() a new one:  */
            copy_over:
            SvPV_set(sv, NULL);
            SvCUR_set(sv, 0);
            SvLEN_set(sv, 0);
            if (flags & SV_COW_DROP_PV) {
                /* OK, so we don't need to copy our buffer.  */
                SvPOK_off(sv);
            } else {
                SvGROW(sv, cur + 1);
                Move(pvx,SvPVX(sv),cur,char);
                SvCUR_set(sv, cur);
                *SvEND(sv) = '\0';
            }
            if (was_shared_hek) {
                        unshare_hek(SvSHARED_HEK_FROM_PV(pvx));
            }
        }
    }
}

Speculation: Something that might be happening is that TARG is the only actual thing pointing at its PV buffer - at least in the examples in the test script, but when S_sv_uncow does SvPV_set(sv, NULL);, then that buffer gets set adrift. (I don't know what the regex engine does with the buffer and quite why it sets cowrefcount = 1, so this speculation might be wrong.)

Aug 31 '22 20:08 richardleach

Hi. I'll try to pick this up in the next week or so.

Yves

On Wed, 31 Aug 2022, 21:44 Richard Leach, @.***> wrote:

Stepping through pp_subst with the script above:

TARG is never SvIsCOW at the start of pp_subst (was_cow == 0)

force_on_match == 0

r_flags == 1

CALLREGEXEC(rx, orig, strend, orig, 0, TARG, NULL, r_flags) returns true. This function also sets SvIsCOW_on(TARG), and the cow ref count ==1

dstr does not get upgraded

Some of the subst patterns start down the in-place branch, but immediately jump to the got_a_cow label in the not-in-place branch, so basically the same path is followed for all patterns in the script above.

The do blocks occur, doing the actual substitutions (I think, didn't look too closely at that bit)

if (rpm->op_pmflags & PMf_NONDESTRUCT) { is not true, so we get to the block:

#ifdef PERL_ANY_COW /* The match may make the string COW. If so, brilliant, because that's just saved us one malloc, copy and free - the regexp has donated the old buffer, and we malloc an entirely new one, rather than the regexp malloc()ing a buffer and copying our original, only for us to throw it away here during the substitution. */ if (SvIsCOW(TARG)) { sv_force_normal_flags(TARG, SV_COW_DROP_PV); } else #endif { SvPV_free(TARG); } SvPV_set(TARG, SvPVX(dstr)); SvCUR_set(TARG, SvCUR(dstr)); SvLEN_set(TARG, SvLEN(dstr)); SvFLAGS(TARG) |= SvUTF8(dstr); SvPV_set(dstr, NULL);

sv_force_normal_flags() calls S_sv_uncow:

static void S_sv_uncow(pTHX_ SV * const sv, const U32 flags) { assert(SvIsCOW(sv)); { const char * const pvx = SvPVX_const(sv); const STRLEN len = SvLEN(sv); const STRLEN cur = SvCUR(sv); const bool was_shared_hek = SvIsCOW_shared_hash(sv); SvIsCOW_off(sv); if (len) { /* Must do this first, since the CowREFCNT uses SvPVX and we need to write to CowREFCNT, or de-RO the whole buffer if we are the only owner left of the buffer. / sv_buf_to_rw(sv); / NOOP if RO-ing not supported / { U8 cowrefcnt = CowREFCNT(sv); if(cowrefcnt != 0) { cowrefcnt--; CowREFCNT(sv) = cowrefcnt; sv_buf_to_ro(sv); goto copy_over; } } / Else we are the only owner of the buffer. / } else { / This SV doesn't own the buffer, so need to Newx() a new one: / copy_over: SvPV_set(sv, NULL); SvCUR_set(sv, 0); SvLEN_set(sv, 0); if (flags & SV_COW_DROP_PV) { / OK, so we don't need to copy our buffer. */ SvPOK_off(sv); } else { SvGROW(sv, cur + 1); Move(pvx,SvPVX(sv),cur,char); SvCUR_set(sv, cur); *SvEND(sv) = '\0'; } if (was_shared_hek) { unshare_hek(SvSHARED_HEK_FROM_PV(pvx)); } } } }

Speculation: Something that might be happening is that TARG is the only actual thing pointing at its PV buffer - at least in the examples in the test script, but when S_sv_uncow does SvPV_set(sv, NULL);, then that buffer gets set adrift. (I don't know what the regex engine does with the buffer and quite why it sets cowrefcount = 1, so this speculation might be wrong.)

— Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/20206#issuecomment-1233402714, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R63LUYAPPY33VLDZTTV36723ANCNFSM6AAAAAAQBW74IA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

Oct 11 '22 08:10 demerphq

Sorry it has take me a lot longer than "next week".

When I check into this it appears that this is fixed.

cat t.pl 
use strict;
use warnings;
use strict;
 
sub get_smaps {
    my $proc_id= shift || "self";
    my $smaps_file= "/proc/$proc_id/smaps";
    open my $fh, "<", $smaps_file
        or do {
            my $errnum= 0+$!; # numify
            my $errmsg= "$!"; # stringify
            my $msg= "In get_smaps_summary, failed to read '$smaps_file': [$errnum] $errmsg";
 
            die $msg;
        };
    my %sum;
    while (<$fh>) {
        next unless substr($_,-3) eq "kB\n";
        my ($field, $value)= split /:/,$_;
        no warnings 'numeric';
        $sum{$field}+=$value if $value;
    }
    close $fh;
    return \%sum;
}

my @smaps;
push @smaps, get_smaps();

my $seq = do { local $/; <STDIN> };
# 5.16.3         approx 102 MB RAM used at this point
# 5.18.0         approx 102 MB RAM used at this point
# 5.19.1 & later approx 55 MB RAM used at this point

push @smaps, get_smaps();
print "read\n";

$seq =~ s/>.*\n//g;
# 5.16.3         102 MB used
# 5.18.4:        102 MB used
# 5.19.1 & later 102 MB used

push @smaps, get_smaps();
print "repl 1\n";

#$seq =~ s/t/4/g;
$seq =~ s/t/<4>/g;
# 5.16.3         175 MB used
# 5.18.4         175 MB used
# 5.19.1 & later 176 MB used

push @smaps, get_smaps();
print "repl 2\n";

$seq =~ s/DS/<3>/g;
# 5.16.3         198 MB used
# 5.18.4         198 MB used
# 5.19.1 & later 248 MB used

push @smaps, get_smaps();
print "repl 3\n";

$seq =~ s/NS/<2>/g;
# 5.16.3         197 MB used
# 5.18.4         197 MB used
# 5.19.1 & later 321 MB used

push @smaps, get_smaps();
print "repl 4\n";

$seq =~ s/2/1/g;
# 5.16.3         197 MB used
# 5.18.4         197 MB used
# 5.19.1 & later 395 MB used

push @smaps, get_smaps();
print "repl 5\n";

$seq =~ s/3/1/g;
push @smaps, get_smaps();

for my $i (1..$#smaps) {
    my $before= $smaps[$i-1];
    my $after= $smaps[$i];
    foreach my $key (sort keys %$after) {
        next unless $after->{$key};
        my $diff= $after->{$key} - $before->{$key};
        next unless $diff;
        print "Step $i: $key diff = $diff\n";
    }
    print "\n";
}
print "done\n";

produces this output:

Step 1: Anonymous diff = 49644
Step 1: KernelPageSize diff = 4
Step 1: MMUPageSize diff = 4
Step 1: Private_Dirty diff = 49644
Step 1: Pss diff = 49644
Step 1: Referenced diff = 49644
Step 1: Rss diff = 49644
Step 1: Size diff = 49644

Step 2: Anonymous diff = 49644
Step 2: KernelPageSize diff = 4
Step 2: MMUPageSize diff = 4
Step 2: Private_Dirty diff = 49644
Step 2: Pss diff = 49644
Step 2: Referenced diff = 49644
Step 2: Rss diff = 49644
Step 2: Size diff = 49644

Step 3: Anonymous diff = 72276
Step 3: KernelPageSize diff = 4
Step 3: MMUPageSize diff = 4
Step 3: Private_Dirty diff = 72276
Step 3: Pss diff = 72276
Step 3: Referenced diff = 72276
Step 3: Rss diff = 72276
Step 3: Size diff = 73976

Step 4: Anonymous diff = 72284
Step 4: KernelPageSize diff = 4
Step 4: MMUPageSize diff = 4
Step 4: Private_Dirty diff = 72284
Step 4: Pss diff = 72284
Step 4: Referenced diff = 72284
Step 4: Rss diff = 72284
Step 4: Size diff = 72280

Step 5: Anonymous diff = 72288
Step 5: KernelPageSize diff = 4
Step 5: MMUPageSize diff = 4
Step 5: Private_Dirty diff = 72288
Step 5: Pss diff = 72288
Step 5: Referenced diff = 72288
Step 5: Rss diff = 72288
Step 5: Size diff = 72288

Step 6: Anonymous diff = 72288
Step 6: KernelPageSize diff = 4
Step 6: MMUPageSize diff = 4
Step 6: Private_Dirty diff = 72288
Step 6: Pss diff = 72288
Step 6: Referenced diff = 72288
Step 6: Rss diff = 72288
Step 6: Size diff = 72288

Step 7: Anonymous diff = 72288
Step 7: KernelPageSize diff = 4
Step 7: MMUPageSize diff = 4
Step 7: Private_Dirty diff = 72288
Step 7: Pss diff = 72288
Step 7: Referenced diff = 72288
Step 7: Rss diff = 72288
Step 7: Size diff = 72288

Richard, can you confirm?

Jan 21 '23 17:01 demerphq

I'm confused by this. Doesn't this output show that the memory usage is still increasing, as per the original post?

Jan 21 '23 22:01 richardleach

It's possible It's me who is confused, but it looks to me like it stabilizes at 72288 pages. There is no change in steps 5-7. Isn't that what you expected? The earlier regexps make the string longer. Your data showed growth at each step didn't it?

Jan 22 '23 01:01 demerphq

Doesn't each step show the difference from the previous step's numbers? i.e. isn't it the increase in memory usage per step, not the memory usage ceiling, that stabilizes?

Jan 22 '23 18:01 richardleach

Doh. blush. Now I feel dumb. Of course you are right. Can't believe I forgot I had added the diff. Thanks for setting me straight.

Jan 23 '23 03:01 demerphq

Ok, so this doesn't happen when perl is built with -DSAWAMPERSAND. With the following output:

Step 1: Anonymous diff = 49652
Step 1: KernelPageSize diff = 4
Step 1: MMUPageSize diff = 4
Step 1: Private_Dirty diff = 49652
Step 1: Pss diff = 49652
Step 1: Referenced diff = 49652
Step 1: Rss diff = 49652
Step 1: Size diff = 49644

Step 2: Anonymous diff = 12
Step 2: Private_Dirty diff = 12
Step 2: Pss diff = 12
Step 2: Referenced diff = 12
Step 2: Rss diff = 12

Step 3: Anonymous diff = 22632
Step 3: Private_Dirty diff = 22632
Step 3: Pss diff = 22632
Step 3: Referenced diff = 22632
Step 3: Rss diff = 22632
Step 3: Size diff = 24332

Step 4: Anonymous diff = 4
Step 4: Private_Dirty diff = 4
Step 4: Pss diff = 4
Step 4: Referenced diff = 4
Step 4: Rss diff = 4
Step 4: Size diff = -1696

Step 5: Anonymous diff = 12
Step 5: Private_Dirty diff = 12
Step 5: Pss diff = 12
Step 5: Referenced diff = 12
Step 5: Rss diff = 12
Step 5: Size diff = 8

It isn't strictly speaking a leak. It is a side-effect of using COW in the regex engine. We keep a copy of the string attached to the regex data structure. It won't be deallocated until there is another match using the same regex.

It is possible we could invalidate this once the pattern ceases to PL_curpm, but making that happen sensibly would likely be difficult as far as I can tell. For instance in

$seq =~ s/2/1/g;
$seq =~ s/4/2/g;

We might be able to get away with dropping the saved copy when the second substitituion replaces the first as PL_curpm. But consider this:

perl -le'$_="abcdef"; $x="foo"; s/(c)/C/g; print $1; { $x=~s/(foo)/bar/; print $1;  } print $1; s/(b)/C/g; print $1'
c
foo
c
b

IE, in this case we will replace PL_curpm, but we expect to restore it after, so we cant just free it. Maybe there is a clever way to fix this, but I will have to think about it and I welcome ideas if anyone has them.

I have to say reviewing all the issues related to COW and the regex engine I am starting to think that the regex engine shouldnt use COW, and we should revert the patches to the regex that make it do so. This would introduce another class of bugs, but those bugs are solvable using /p should someone encounter them. I will likely file a proposal to this effect. I am starting to view COW in the regex engine as a failed experiement we should revert. It fixes some edge case bugs, but IMO those bugs really didnt cause that much trouble in the past as we documented how to avoid them. On the other there really isn't anything a dev can do to work around the problems that COW causes. Quadratic performance while (//g), excess memory use, slower code, etc.

I think we should keep this open a while longer while we think about this, but as I understand this now it isnt a leak, its an unfortunate confluence of the way the regex engine works, and the way that COW works, and the fact that they actually don't work well together in practice.

Jan 23 '23 14:01 demerphq

perl5 perl5 copied to clipboard

pp_subst + COW leaks memory

perl5
perl5 copied to clipboard