perl5 OP_EMPTYAVHV - optimized empty ANONLIST/ANONHASH

This commit introduces a new OP to replace cases of OP_ANONLIST and OP_ANONHASH where there are zero elements, which is very common in Perl code.

As an example, my $x = {} is currently implemented like this:

...
6     <2> sassign vKS/2 ->7
4        <@> anonhash sK* ->5
3           <0> pushmark s ->4
5        <0> padsv[$x:1,2] sRM*/LVINTRO ->6

The pushmark serves no meaningful purpose when there are zero elements and the anonhash, besides undoing the pushmark, performs work that is unnecessary for this special case.

The peephole optimizer transforms this example into:

...
5     <2> sassign vKS/2 ->6
3        <@> anonavhv_mt_rv sK*/ANONHASH ->4
-           <0> ex-pushmark s ->3
4        <0> padsv[$x:1,2] sRM*/LVINTRO ->5

The following toy code shows the op-related speedup: for (0 .. 100_000_000) { {} }

With blead:

          3,485.86 msec task-clock                #    0.996 CPUs utilized          
                10      context-switches          #    0.003 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               199      page-faults               #    0.057 K/sec                  
    15,571,616,464      cycles                    #    4.467 GHz                    
        22,684,235      stalled-cycles-frontend   #    0.15% frontend cycles idle   
             5,861      stalled-cycles-backend    #    0.00% backend cycles idle    
    59,605,663,770      instructions              #    3.83  insn per cycle         
                                                  #    0.00  stalled cycles per insn
    13,601,121,374      branches                  # 3901.795 M/sec

Patched:

          3,031.49 msec task-clock                #    1.000 CPUs utilized          
                 9      context-switches          #    0.003 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               198      page-faults               #    0.065 K/sec                  
    13,499,894,113      cycles                    #    4.453 GHz                    
         8,209,347      stalled-cycles-frontend   #    0.06% frontend cycles idle   
             5,279      stalled-cycles-backend    #    0.00% backend cycles idle    
    50,605,323,929      instructions              #    3.75  insn per cycle         
                                                  #    0.00  stalled cycles per insn
    11,601,055,120      branches                  # 3826.844 M/sec

Aug 03 '22 22:08 richardleach

Comments:

I know the new op name is crufty. Suggestions welcome.
I wasn't sure how often the non-OPf_SPECIAL case of ANONHASH and ANONLIST comes up (B::Deparse certainly doesn't expect to see it), so just ignored it for this optimization. If that's unhelpful, please let me know (with code examples, please).

Aug 03 '22 22:08 richardleach

On Wed, Aug 03, 2022 at 03:20:49PM -0700, Richard Leach wrote:

This commit introduces a new OP to replace cases of OP_ANONLIST and OP_ANONHASH where there are zero elements, which is very common in Perl code.

I like this idea. I also think it can be taken further. Here are my observations.

Name: how about OP_EMPTYAVHV.

(I don't like 'MT' - had no idea it was short for 'empty').

A further optimisation might be, for the case of

[my] $x = {};

to skip the padmy and sassign, and set op_targ to be the reference directly. I.e. something along the lines of this pseudocode:

SV *rv;
SV *sv = newSV_type(SVt_PVAV // SVt_PVHV);

if (OPpTARGET_MY) {
    rv = PL_curpad[op_targ];
    /* coerce rv to be RV-capable, then ...*/
    SvRV_set(rv, sv);
    if (OPpLVAL_INTRO) /* is 'my' declaration */
        save_clearsv(op_targ);

    if (G_VOID)
        return; /* skip extending and pushing */
}
else {
    rv = newSV_type_mortal(SVt_IV);
    /* .... */
    SvRV_set(rv, sv);
}
XPUSHs(rv);

Since the op_targ is very likely to be an undef SVt_IV from a previous iteration, converting it to a live RV can typically be special-cased.

OPf_SPECIAL: an easy way to see if/where the OPf_SPECIAL flag isn't used in pp_anaonlist/hash() is to add a temporary assert at the start of both functions, like

assert(PL_op->op_flags & OPf_SPECIAL);

then run it and see what breaks. In this case, it's used (at least) by the constant-folder to fold a constant list, as in e.g. @a[1..5] where the (1..5) gets converted into the constant array (1,2,3,4,5) and the AV is stored directly in the CONST op, rather than a ref to it.

But in this case, only optimising the OPf_SPECIAL case looks right.

The OPpANONTYPE flag: for flags associated with just one op, they're typically named as OPp[name of op]_[description of flag] So for this, I'd suggest OPpEMPTYAVHV_IS_HV. I suggest _IS_HV rather than _TYPE, since then whether the flag being on indicates AV or HV is built into the name.

Don't test the opcode structure in ext/B/t/optree_samples.t; that tends to require fixing up any time there's a minor change in an unrelated opcode. It's easier and more robust to add something to t/perf/opcount.t - this test file checks for certain opcodes being present (or absent) in the right quantity. For example with $a[0] it checks that there are zero alem's and one aelemfast, confirming that the optimisation is present.

-- Modern art: "That's easy, I could have done that!" "Ah, but you didn't!"

Aug 04 '22 11:08 iabyn

On Thu, Aug 04, 2022 at 04:41:13AM -0700, iabyn wrote:

On Wed, Aug 03, 2022 at 03:20:49PM -0700, Richard Leach wrote:

This commit introduces a new OP to replace cases of OP_ANONLIST and OP_ANONHASH where there are zero elements, which is very common in Perl code.

And one last thing. Add a few entries to t/perf/benchmarks: at a minimum,

code => '$x = []'
setup => 'my $x', code => '$x = []'
code => 'my $x = []'

and ditto for {}.

Then run Porting/bench.pl on pre- and post- perls.

-- More than any other time in history, mankind faces a crossroads. One path leads to despair and utter hopelessness. The other, to total extinction. Let us pray we have the wisdom to choose correctly. -- Woody Allen

Aug 04 '22 11:08 iabyn

Thanks for the helpful comments, @iabyn. Think I've actioned all of them, apart from running bench.pl, which I'll do once the pp_ function is finalised.

Besides a general re-review, two specific new things I'd appreciate you taking a look at

I wasn't sure how concise "coerce rv to be RV-capable" can safely be. The current code passes all tests, but...that doesn't mean it's okay.
I haven't figured out how to do without the mortal RV (refsv in pp_emptyavhv) when doing a lexical assignment, since leaving it out causes a leakage in t/op/svleak.t.

Aug 05 '22 20:08 richardleach

On Fri, Aug 05, 2022 at 01:07:12PM -0700, Richard Leach wrote:

Thanks for the helpful comments, @iabyn. Think I've actioned all of them, apart from running bench.pl, which I'll do once the pp_ function is finalised.

Personally I add tests to benchmarks at the start of optimisation work, then run Porting/bench.pl (on just the selected tests) frequently during development. Since its very fast (takes just a couple of minutes with a suitable -j N arg) and 100% reproducible - it's running a cachgrind simulation of the CPU - it can quickly spot mistakes - e.g. "why has the number of branch executions just gone up by 1 after I tweaked pp_foo()?" It can save the results of a run in a data file; then later you can compare several such files to see how things are going (up and down). E.g. do a run and have a file for every significant commit.

I wasn't sure how concise "coerce rv to be RV-capable" can safely be. The current code passes all tests, but...that doesn't mean it's okay.

I think you've unrolled too much there - it should be just a quick check for a simple common special-case and if not, fallback to the slow SvSetMagicSV() or whatever. No need to unroll SvSetMagicSV() itself.

Thinking about it further, the 3 common cases to expect are:

1) first iteration using the pad var:

    condition: SVt_NULL 

2) second+ iteration if the variable goes out of scope between
iterations; e.g.:

    for (...) { my $x = {}; ... }
    sub f { my $x; ...; $x = {}; .... }

    condition: SVt_IV && !SvOK

3) second+ iteration if the variable doesn't go out scope; e.g.:

    my $x; for (...) { ...; $x = {}; ... }

    condition: SVt_IV && SvROK

Any other combo and the programmer is doing something weird - e.g. using a variable initially as a string, then as a reference.

I think it's reasonable to short-cut just for case 2: The container is already set up to hold an RV, it can't have any magic weirdness, and it doesn't have an old RV value which needs freeing first.

NB: in your code, you're testing the SV type against OP_NULL rather than SVt_NULL.

I haven't figured out how to do without the mortal RV (refsv in pp_emptyavhv) when doing a lexical assignment, since leaving it out causes a leakage in t/op/svleak.t. ` Well, creating a mortal in the assignment case is completely wrong and defeats most of the point of the OPpTARGET_MY optimisation! It's difficult to comment much more without seeing the original failing code. But, in principle, doing

sv = newSV_type(SVt_PVAV);

creates an orphaned SV with a reference count of 1. If the code exits or dies at that moment, it will leak. So it needs to be attached somewhere. Turning the lexical var pointed to by op_targ into an RV and pointing it at sv is an example of attaching it. Normally when creating an RV you increment the ref count of the thing being pointed at, but in this case there's no need to do so as sv already has a refcount that's "one too high".

It's possible that you had a false positive in your t/op/svleak.t test. The number of SVs should go up by 1 because an anon array/hash is being created. The important thing is that it doesn't go up by 10+ on 10 iterations.

In t/perf/opcount.t it would be better to have separate tests for each of the {}, $x={}, my $x; $x={}, my $x={} etc.

In creating the array/hash, wouldn't it be slightly more efficient to select just the type rather than having two newSV_type() calls? i.e.

SV * const sv = MUTABLE_SV(newSV_type( (op->op_private & OPpEMPTYAVHV_IS_HV) ? SVt_PVHV : SVt_PVAV ));

I haven't reviewed your code in rpeep() nor in B::Deparse.

-- Any [programming] language that doesn't occasionally surprise the novice will pay for it by continually surprising the expert. -- Larry Wall

Aug 08 '22 11:08 iabyn

Thanks for the updated comments. I'll try to do an updated push later this week. FWIW, the test in t/op/svleak.t which fails if the refsv mortal isn't created is this one:

{ # broken by 304474c3, fixed by cefd5c7c, but didn't seem to cause
  # any other test failures
  # base test case from ribasushi (Peter Rabbitson)
  no warnings 'experimental::builtin';
  use builtin 'weaken';
  my $weak;
  {
    $weak = my $in = {};
    weaken($weak);
    my $out = { in => $in, in => undef }
  }
  ok(!$weak, "hash referenced weakened SV released");
}

Aug 08 '22 23:08 richardleach

On Mon, Aug 08, 2022 at 04:04:21PM -0700, Richard Leach wrote:

FWIW, the test in t/op/svleak.t which fails if the refsv mortal isn't created is this one:

{ # broken by 304474c3, fixed by cefd5c7c, but didn't seem to cause
  # any other test failures
  # base test case from ribasushi (Peter Rabbitson)
  no warnings 'experimental::builtin';
  use builtin 'weaken';
  my $weak;
  {
    $weak = my $in = {};
    weaken($weak);
    my $out = { in => $in, in => undef }
  }
  ok(!$weak, "hash referenced weakened SV released");
}

That's mainly testing that assigning two values to the same hash key in sequence decrements the ref count of the first value. It's not obvious to me why that's the only test that broke.

-- Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.

Aug 09 '22 09:08 iabyn

Ok, think I sorted out the reference counting.

t/perf/benchmark differences are as follows:


expr::sassign::anonlist
$x = []

        blead emptyavhv
       ------ ---------
    Ir 100.00    120.31
    Dr 100.00    120.16
    Dw 100.00    119.27
  COND 100.00    111.16
   IND 100.00    125.00

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

expr::sassign::anonlist_lex
$x = []

        blead emptyavhv
       ------ ---------
    Ir 100.00    120.18
    Dr 100.00    120.14
    Dw 100.00    119.35
  COND 100.00    111.11
   IND 100.00    125.00

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

expr::sassign::my_anonlist_lex
my $x = []

        blead emptyavhv
       ------ ---------
    Ir 100.00    114.49
    Dr 100.00    115.47
    Dw 100.00    114.88
  COND 100.00    108.60
   IND 100.00    120.00

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

expr::sassign::anonhash
$x = {}

        blead emptyavhv
       ------ ---------
    Ir 100.00    116.33
    Dr 100.00    115.99
    Dw 100.00    114.36
  COND 100.00    110.38
   IND 100.00    125.00

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

expr::sassign::anonhash_lex
$x = {}

        blead emptyavhv
       ------ ---------
    Ir 100.00    116.25
    Dr 100.00    115.98
    Dw 100.00    114.41
  COND 100.00    110.34
   IND 100.00    125.00

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

expr::sassign::my_anonhash_lex
my $x = {}

        blead emptyavhv
       ------ ---------
    Ir 100.00    112.28
    Dr 100.00    112.80
    Dw 100.00    111.51
  COND 100.00    108.33
   IND 100.00    120.00

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

AVERAGE

        blead emptyavhv
       ------ ---------
    Ir 100.00    116.57
    Dr 100.00    116.70
    Dw 100.00    115.56
  COND 100.00    109.98
   IND 100.00    123.29

COND_m 100.00    100.00
 IND_m 100.00    125.00

 Ir_m1 100.00    100.00
 Dr_m1 100.00    100.00
 Dw_m1 100.00    100.00

 Ir_mm 100.00    100.00
 Dr_mm 100.00    100.00
 Dw_mm 100.00    100.00

Aug 10 '22 10:08 richardleach

@iabyn / @demerphq - any more comments on this one? (Deparse.pm will need a version bump, but I'm saving that until it's definitely ready to go.)

Aug 17 '22 19:08 richardleach

Rebased after #20077 merge.

Aug 25 '22 23:08 richardleach

Rebased after https://github.com/Perl/perl5/commit/aafefcb90183e1d6ef62d9e1ccc1fae7fcdf9c8e merge.

Sep 08 '22 20:09 richardleach

@iabyn / @demerphq - I'll merge this within a few days unless you'd prefer me to hold off for further review.

Sep 17 '22 22:09 richardleach

I can't test anything till Thursday. I'm comfortable with whatever you choose. Sorry about that.

Yves

On Sat, 17 Sept 2022, 23:15 Richard Leach, @.***> wrote:

@iabyn https://github.com/iabyn / @demerphq https://github.com/demerphq - I'll merge this within a few days unless you'd prefer me to hold off for further review.

— Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/pull/20036#issuecomment-1250148403, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R4IT6OLVF5ZD3H2HUTV6Y7H5ANCNFSM55QNVBWA . You are receiving this because you were mentioned.Message ID: @.***>

Sep 18 '22 18:09 demerphq

Unless anyone has objections, I'd like to merge this before the next dev release in just over a week's time.

Oct 11 '22 23:10 richardleach

You need a version bump. It looks ok to me.

Oct 23 '22 16:10 demerphq

Thanks. lib/B/op_private.pm seems to need a bump as part of the v.5.37.5 release, so I'll hold off until that has happened.

Oct 23 '22 21:10 richardleach

@iabyn / @demerphq - I'll merge this within a few days unless you'd prefer me to hold off for further review.

Apologies @richardleach please rebase and force push to your branch. This should correct the issue.

Oct 24 '22 17:10 toddr

Thanks @toddr. Merging.

Oct 24 '22 22:10 richardleach

On Sat, Sep 17, 2022 at 03:15:26PM -0700, Richard Leach wrote:

@iabyn / @demerphq - I'll merge this within a few days unless you'd prefer me to hold off for further review.

Sorry, I didn't get round to doing this. In short, lools good, apart from these nits:

the op has OPf_SPECIAL flag set - this is a hangover from when the op started out as an OP_ANONLIST/HASH. It aught really to be cleared when converted into an OP_EMPTYAVHV

The special-case code for when the SV is SVt_IV just assumes all other flags can be blown away. This is probably ok at the moment, but might change. Best to put in an assert to see if any unexpected flags are set:

    if (SvTYPE(rv) == SVt_IV && !SvOK(rv)) {
        assert((SvFLAGS(rv) & ~(SVf_IVisUV|SVs_PADSTALE)) == SVt_IV);
        ...

SVs_PADSTALE mat be set but will get blown away anyway by save_clearsv() in the OPpLVAL_INTRO case, and otherwise its a likely a bug if its set - but harmless to unset.

It might be worth adding a couple of tests for state $r = {}; such as checking that the second time round a loop the contents of %$r haven't been lost.

Might be worth adding in a few deparse tests for the various /my/state $x = .... permutations

-- The optimist believes that he lives in the best of all possible worlds. As does the pessimist.

Nov 07 '22 10:11 iabyn

Sorry, I didn't get round to doing this. In short, lools good, apart from these nits:

Thanks, I've started working on them.

Nov 20 '22 00:11 richardleach

perl5 perl5 copied to clipboard

OP_EMPTYAVHV - optimized empty ANONLIST/ANONHASH

perl5
perl5 copied to clipboard