perl5 Implement OP_PADSV_STORE - combined sassign/padsv OP

This commit introduces a new OP to replace simple cases of OP_SASSIGN and OP_PADSV.

For example, my $x = 1 is currently implemented as:

1  <;> nextstate(main 1 -e:1) v:{
2  <$> const(IV 1) s
3  <0> padsv[$x:1,2] sRM*/LVINTRO
4  <2> sassign vKS/2

But now will be turned into:

1  <;> nextstate(main 1 -e:1) v:{
2  <$> const(IV 1) s
3  <1> padsv_store[$x:1,2] vKMS/LVINTRO

This intended to be a transparent performance optimization. It should be applicable for RHS optrees of varying complexity.

Notes:

This optimization was suggested by @leonerd as a good project for getting feet wet.
This was my first foray into Deparse.pm; no idea if this was the most appropriate way of supporting this new op.
Ideas for separate optimizations of simple assignments of CONSTs, PADSVs, and UNDEF into single ops are already in the works.

Performance: The following unrealistic code shows the op-related speedup:

my ($l0, $l1, $l2, $l3, $l3, $l5, $l6, $l7, $l7, $l9);
for (0..10_000_000) {
    $l0 = $l1 = $l2 = $l3 = $l4 = $l5 = $l6 = $l7 = $l8 = $l9 = 1 
}

Blead:

            870.42 msec task-clock                #    0.985 CPUs utilized          
                 1      context-switches          #    0.001 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               202      page-faults               #    0.232 K/sec                  
     3,800,528,668      cycles                    #    4.366 GHz                    
        79,206,368      stalled-cycles-frontend   #    2.08% frontend cycles idle   
           217,927      stalled-cycles-backend    #    0.01% backend cycles idle    
    15,523,885,258      instructions              #    4.08  insn per cycle         
                                                  #    0.01  stalled cycles per insn
     3,030,769,392      branches                  # 3481.972 M/sec                  
                 0      branch-misses             #    0.00% of all branches        

       0.883580599 seconds time elapsed

Patched:

            721.04 msec task-clock                #    0.983 CPUs utilized          
                 1      context-switches          #    0.001 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               197      page-faults               #    0.273 K/sec                  
     3,163,626,846      cycles                    #    4.388 GHz                    
        51,457,402      stalled-cycles-frontend   #    1.63% frontend cycles idle   
                30      stalled-cycles-backend    #    0.00% backend cycles idle    
    12,563,719,698      instructions              #    3.97  insn per cycle         
                                                  #    0.00  stalled cycles per insn
     2,470,736,436      branches                  # 3426.638 M/sec                  
                 0      branch-misses             #    0.00% of all branches        

       0.733857229 seconds time elapsed

The extent to which users will notice a benefit in real code will depend on things like how often the optimized op is used and what path is taken through Perl_sv_setsv.

Jul 11 '22 21:07 richardleach

Any further comments on this one, or does it look good to merge?

Jul 31 '22 11:07 richardleach

Thanks for spotting that. I'll take a look later today. Might not be straightforward - regen complains that: addbits(): bit 4 of padsv_store already specified (OPpTARGET_MY) at ./regen/op_private line 417.

(I can't remember offhand if B::Deparse needs the OPpTARGET_MY flag.)

Aug 01 '22 16:08 richardleach

Assuming all CI tests pass, OP_PADSV_CONST doesn't seem to need the OPpTARGET_MY flag, so I've taken that out and added in the OPpDEREF flag bits.

Aug 01 '22 18:08 richardleach

After working on #20036 and looking a bit more at this PR, I realised that the basic optimization will not need to support OPpDEREF and have removed that branch from pp_padsv_store.

Note: It also seemed simple enough to add support for assigning simple padsv-padsv-sassign optrees, which I originally did. However, this involved turning OP_PADSV_STORE into an UNOP_aux and stashing the second padoffset in the op_aux slot. That worked fine, but there was no way to make Deparse.pm work (short of hacking around in B.xs, which definitely seems like a bad move.) So I dropped that commit.

Hopefully, UNOPs (and possibly other op types) will soon get an op_src member, at which point the padsv-padsv-sassign commit can be tweaked to use it and submitted as a separate PR.

Aug 06 '22 15:08 richardleach

perl5 perl5 copied to clipboard

Implement OP_PADSV_STORE - combined sassign/padsv OP

perl5
perl5 copied to clipboard