perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

S_fold_constants: remove early SvREADONLY(sv) to allow SvIsCOW(sv)

Open richardleach opened this issue 3 years ago • 15 comments

Standard CONST PVs have the IsCOW flag set, meaning that COW can be used when assigning the CONST to a variable, rather than making a copy of the buffer. CONST PVs arising from constant folding have been lacking this flag, leading to unnecessary copying of PV buffers.

This seems to have occurred because a common branch in S_fold_constants marks SVs as READONLY before the new CONST OP is created. When the OP is created, the Perl_ck_svconst() check function is called - this is the same as when a standard CONST OP is created. If the SV is not already marked as READONLY, the check function will try to set IsCOW if it is safe to do so, then in either case will make sure that the READONLY flag is set.

This commit therefore removes the SvREADONLY(sv) statement from S_fold_constants(), allowing Perl_ck_svconst() to set the IsCOW and READONLY flags itself. Minor test updates are also included.

This is a possible method for addressing part of #20586. (This PR does not address the point about not constant folding when the resulting constant would be too large.)

richardleach avatar Dec 09 '22 23:12 richardleach

IM a touch concerned we should know why we stopped doing this in 5.20. Are we walking into a trap we should know about?

Yes, I don't know whether it was a deliberate omission or a bug that then was codified in Peek.t. Am hoping that @iabyn might know the answer to this. Will not merge until he has a chance to weigh in.

richardleach avatar Dec 10 '22 23:12 richardleach

On Sat, Dec 10, 2022 at 03:47:15PM -0800, Richard Leach wrote:

IM a touch concerned we should know why we stopped doing this in 5.20. Are we walking into a trap we should know about?

Yes, I don't know whether it was a deliberate omission or a bug that then was codified in Peek.t. Am hoping that @iabyn might know the answer to this. Will not merge until he has a chance to weigh in.

The change looks intentional and was introduced by the commit shown below.

It looks like swiping the string buffer from PADTMPs is still preferable to COWing in some cases, so those two expressions suddenly becoming COW might indicate pad-swiping is inadvertently being disabled. Or it might be that the disabling happened by accident (or design) some time after 5.20 and no one noticed, since the Peek test doesn't distinguish between not COWing and not pad swiping.

Here's some output on some old versions of perl:

$ for i in perl5180 perl5190 perl5191 perl5192 perl5193 perl5194 perl5195 perl5196 perl5197 perl5200 perl5360; do echo $i; $i -MDevel::Peek -e'Dump chr(256).chr(0).chr(512)' 2>&1 | grep COW; done perl5180 perl5190 perl5191 perl5192 perl5193 FLAGS = (PADTMP,POK,READONLY,IsCOW,pPOK,UTF8) COW_REFCNT = 0 perl5194 FLAGS = (PADTMP,POK,READONLY,IsCOW,pPOK,UTF8) COW_REFCNT = 0 perl5195 FLAGS = (PADTMP,POK,READONLY,IsCOW,pPOK,UTF8) COW_REFCNT = 0 perl5196 FLAGS = (PADTMP,POK,READONLY,IsCOW,pPOK,UTF8) COW_REFCNT = 0 perl5197 perl5200 perl5360

commit 9ffd39ab75dd662df22fcdafbf7f740838acc898 Author: Father Chrysostomos @.> AuthorDate: Wed Nov 13 18:10:49 2013 -0800 Commit: Father Chrysostomos @.> CommitDate: Sat Nov 30 05:53:16 2013 -0800

Allow PADTMPs’ strings to be swiped

While copy-on-write does speed things up, it is not perfect.  Take
this snippet for example:

$a = "$b$c";
$a .= $d;

The concatenation operator on the rhs of the first line has its own
scalar that it reuses every time that operator is called (its target).
When the assignment happens, $a and that target share the same string
buffer, which is good, because we didn’t have to copy it.  But because
it is shared between two scalars, the concatenation on the second line
forces it to be copied.

While copy-on-write may be fast, string swiping surpasses it, because
it has no later bookkeeping overhead.  If we allow stealing targets’
strings, then $a = "$b$c" no longer causes $a to share the same string
buffer as the target; rather, $a steals that buffer and leaves the tar-
get undefined.  The result is that neither ‘$a =’ nor ‘$a .= $d’ needs
to copy any strings.  Only the "$b$c" will copy strings (unavoidably).

This commit only applies that to long strings, however.  This is why:

Simply swiping the string from any swipable TARG (which I tried at
first) resulted in a significant slowdown.  By swiping the string from
a TARG that is going to be reused (as opposed to a TEMP about to be
freed, which is where swipe was already happening), we force it to
allocate another string next time, greatly increasing the number
of malloc calls.  malloc overhead exceeds the overhead of copying
short strings.

I tried swiping TARGs for short strings only when the buffer on the
lhs was not big enough for a copy (or there wasn’t one), but simple
benchmarks with mktables show that even checking SvLEN(dstr) is enough
to slow things down, since the speed-up this provides is minimal where
short strings are involved.

Then I tried checking just the string length, and saw a consistent
speed increase.  So that’s what this patch uses.  Programs using short
strings will not benefit.  Programs using long strings may see a 1.5%
increase in speed, due to fewer string copies.

-- A power surge on the Bridge is rapidly and correctly diagnosed as a faulty capacitor by the highly-trained and competent engineering staff. -- Things That Never Happen in "Star Trek" #9

iabyn avatar Dec 12 '22 13:12 iabyn

It looks like swiping the string buffer from PADTMPs is still preferable to COWing in some cases, so those two expressions suddenly becoming COW might indicate pad-swiping is inadvertently being disabled. Or it might be that the disabling happened by accident (or design) some time after 5.20 and no one noticed, since the Peek test doesn't distinguish between not COWing and not pad swiping.

I can definitely envisage buffer swiping being preferable to COW in some circumstances, such as the example given at the top of FC's commit.

However:

  • There seems to be no reason why simple CONSTs are IsCOW and folded CONSTs are not. Surely what's best for one is best for the other? Unless I'm just not seeing the reason, still feels like there's a bug.
  • [In blead] In $a = "$b$c" the RHS flags are (PADTMP,POK,pPOK) and pass the swiping test in Perl_sv_setsv_flags(). In my $x = "A"x10 the flags are (PADTMP,POK,READONLY,PROTECT,pPOK) and fail the swiping test, so a full buffer copy occurs.

I can't see how the PV of a constant can ever be swiped, so was the effect of FC's commit on folded constants unintentional, or is there still an argument for leaving things as-is? (Assignment of a folded constant to a PADTMP, via buffer copy, and then in subsequent operations it's beneficial to be able to swipe the PV buffer??)

richardleach avatar Dec 12 '22 16:12 richardleach

On Mon, Dec 12, 2022 at 08:55:46AM -0800, Richard Leach wrote:

I can't see how the PV of a constant can ever be swiped, so was the effect of FC's commit on folded constants unintentional, or is there still an argument for leaving things as-is? (Assignment of a folded constant to a PADTMP, via buffer copy, and then in subsequent operations it's beneficial to be able to swipe the PV buffer??)

I haven't looked closely, so I don't know whether FC's change had unintended side-effects. After constant folding there really aught to be no PADTMP involved, and involving one would likely be an error.

A PADTMP SV is just a private destination for intermediate ops, so in $a = $b + $c*$d, the multiply op will have a PADTMP reserved for it to store the result of $c*$d, so it doesn't have to allocate and mortalise an SV for that purpose. Similarly, the add op has a PADTMP to store the result of the addition, which is then copied to $a by the assign op. (And if $a is lexical, the assign is optimised away, and the add op uses the lexical variable $a as the target to write its results to directly.)

When an expression like "a" . "b" is constant folded by executing it at compile time, the result will be in a PADTMP SV, which aught to be removed from the concat op, have its PADTMP flag turned off, and attached to a const op.

Update: having had a quick play, it looks like the PADTMP flag isn't turned off on folded constants, but since the READONLY flag is turned on, sv_setsv_flags() (rightly) still thinks the string buffer is not stealable.

I suspect this is a bug and that the PADTMP flag aught to be turned off on newly-minted folded constants.

-- I thought I was wrong once, but I was mistaken.

iabyn avatar Dec 13 '22 08:12 iabyn

s/SvREADME/SvREADONLY/ ;-)

Leont avatar Dec 13 '22 11:12 Leont

@Leont - thanks for spotting the typo. @iabyn - I've added a commit to turn the PADTMP flag off for folded SVs. Such SVs will still have the READONLY flag set, of course, and won't be eligible for buffer swiping.

IOW the set of flags on a const folded SV with this PR would now be:

  FLAGS = (POK,IsCOW,READONLY,PROTECT,pPOK)

richardleach avatar Dec 13 '22 12:12 richardleach

I forgot to check tests. :man_facepalming: Will look tonight.

richardleach avatar Dec 13 '22 13:12 richardleach

The test failures when PADTMP is removed seem to relate to folding of references to the consts.

In t/op/join.t, there's for(1,2) { push @_, \join "x", 1 }, which in blead gives:

e                 <@> push[t4] vK/2 ->f
                     ...
d                    <1> srefgen sK/1 ->e
-                       <1> ex-list lKRM ->d
c                          <$> const[PV "1"] sRM/FOLD ->d

but without PADTMP yields:

                     <@> push[t3] vK/2 ->e
                     ...
c                    <$> const(IV \"1") s/FOLD ->d

Folding of references into literal constants also seems a bit funky. Have not had enough time to dig more.

richardleach avatar Dec 13 '22 23:12 richardleach

@richardleach, this pull request has languished since last month. In addition, it failed tests in the GH CI "sanity check" run, so it hasn't been through most of the CI runs. If you want to pursue this p.r., could you address those failures? Thank you very much.

jkeenan avatar Jan 17 '23 00:01 jkeenan

@jkeenan - This has languished because of a lack of time on my part and uncertainty as to whether there were unintentional side-effects in https://github.com/Perl/perl5/commit/9ffd39ab75dd662df22fcdafbf7f740838acc898 (as discussed above.)

The logical next step is to figure out what function is producing (what seem to be) invalid OP trees when the PADTMP flag is removed. I'll try to work on this in the coming month or so.

richardleach avatar Jan 18 '23 18:01 richardleach

Tests that fail when the PADTMP flag is removed seem to do so because of the assumption that the result of folding is as mutable or non-static as the result of the unfolded OPs would have been. That is to say, there is an expectation that folded CONSTs behave differently to non-folded CONSTs.

For example, the failing test in t/comp/old.t, boils down to the difference between: $ perl -MDevel::Peek -e 'for (1+2) { Dump($_) }' and $ perl -MDevel::Peek -e 'for (3) { Dump($_) }' Because of the PADTMP flag, the former outputs the following in blead:

SV = IV(0x5613452834c8) at 0x5613452834d8
  REFCNT = 1
  FLAGS = (IOK,pIOK)
  IV = 3

and the latter gives:

SV = IV(0x5605e40b8118) at 0x5605e40b8128
  REFCNT = 2
  FLAGS = (IOK,READONLY,PROTECT,pIOK)
  IV = 3

With the PADTMP flag removed, both give the latter result and the test fails because $_ cannot be modified.

And as already mentioned a couple of messages above, t/op/join.t has a failing test which boils down to for(1,2) { push @_, \join "x", 1 } and the PADTMP flag controls whether the resulting OP tree looks like this (blead):

d                    <1> srefgen sK/1 ->e
-                       <1> ex-list lKRM ->d
c                          <$> const[PV "1"] sRM/FOLD ->d

or this (patched):

c                    <$> const(IV \"1") s/FOLD ->d

Specifically, this test in S_fold_constants is affected:

    case OP_SREFGEN:
        if (cUNOPx(cUNOPo->op_first)->op_first->op_type != OP_CONST
         || SvPADTMP(cSVOPx_sv(cUNOPx(cUNOPo->op_first)->op_first)))
            goto nope;

With a literal constant, e.g. for(1,2) { push @_, \1 }, the result is always a single CONST OP where the SV contains a RV to a SV. The test fails because while it seems to be expected that a reference to a literal constant always returns the exact same RV, the expectation is that a reference to a folded constant gives a different RV each time.

Coming back to iabyn's comment from earlier:

A PADTMP SV is just a private destination for intermediate ops, so in
$a = $b + $c*$d, the multiply op will have a PADTMP reserved for it to
store the result of $c*$d, so it doesn't have to allocate and mortalise an
SV for that purpose. Similarly, the add op has a PADTMP to store the
result of the addition, which is then copied to $a by the assign op.

It seems like the expected behaviour for constant-folded SVs is that they behave like intermediate SVs that would have otherwise been produced by OPs. The way that PADTMP and READONLY are both handled within the assign OPs is what allows that to happen.

So I'm suggesting that:

  1. Having PADTMP on folded constants isn't a bug after all. The second commit in the PR should be removed.
  2. https://github.com/Perl/perl5/commit/9ffd39ab75dd662df22fcdafbf7f740838acc898 was all about the efficiency of pad-swiping rather than COW, the buffer from a folded CONST SV will never be swipable because of the READONLY flag. In blead, my $x = <some_folded_pv> is a string copy operation because of the absence of IsCOW. I think the first commit in the PR is still useful and doesn't fly in the face of what FC was doing.

@iabyn or anyone else, thoughts?

richardleach avatar Feb 06 '23 00:02 richardleach

Looking back at this after nearly a year, I can't do anything with it without a steer on whether (1) PADTMP is undesirable on folded constants and therefore the existing core tests noted above are actually wrong (2) PADTMP is desirable.

Either way, I still hold with the note that 9ffd39a was all about the efficiency of pad-swiping rather than COW, and the buffer from a folded CONST SV will never be swipable because of the READONLY flag. In blead, my $x = <some_folded_pv> is a string copy operation because of the absence of IsCOW. Should we decide that PADTMP should be on folded constants, that doesn't fly in the face of what FC was doing in 9ffd39a.

richardleach avatar Feb 11 '24 22:02 richardleach

On Sun, Feb 11, 2024 at 02:39:07PM -0800, Richard Leach wrote:

Looking back at this after nearly a year, I can't do anything with it without a steer on whether (1) PADTMP is undesirable on folded constants and therefore the existing core tests noted above are actually wrong (2) PADTMP is desirable.

I think the 'allow SvIsCOW(sv)' commit should be merged (but perhaps not until after 5.40.0 - its only an optimisation and we're nearing code freeze).

I think the 'turn PADTMP flag off' commit should be rejected. Keeping the flag set seems to me to be wrong conceptually, but in practice we have enough subtle behaviour related to constant folding and places that make copies of PADTMPS (like for()), as shown by a few test failures, that the existing behaviour should be kept.

-- No matter how many dust sheets you use, you will get paint on the carpet.

iabyn avatar Feb 12 '24 11:02 iabyn

Thanks @iabyn. I'll update the PR tonight/soon and mark it defer-next-dev.

richardleach avatar Feb 12 '24 11:02 richardleach

Changes made as discussed. The Devel/Peek tests deliberately won't pass until we start the next development cycle.

richardleach avatar Feb 12 '24 23:02 richardleach