Add OPpTARGET_MY optimization to OP_UNDEF

Open richardleach opened this issue 3 years ago • 0 comments

This allows the existing undef OP to act on a pad SV. The following two cases are optimized:

undef my $x, currently implemented as:

    4     <1> undef vK/1 ->5
    3        <0> padsv[$x:1,2] sRM/LVINTRO ->4

my $x = undef, currently implemented as:

    5     <2> sassign vKS/2 ->6
    3        <0> undef s ->4
    4        <0> padsv[$x:1,2] sRM*/LVINTRO ->5

These are now just represented as:

    3     <1> undef[$x:1,2] vK/LVINTRO,TARGMY ->4
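Optree dumps like the ones above can be regenerated with the core B::Concise module. The following is a minimal sketch, not part of the patch: the variable names are illustrative, and the exact flags, sequence numbers and op layout printed will depend on the perl build it is run under.

```perl
use strict;
use warnings;
use B::Concise ();

# Capture the rendering in a string instead of printing straight to STDOUT.
B::Concise::walk_output(\my $optree);

# One anonymous sub per construct discussed above.
my $walker = B::Concise::compile(
    sub { undef my $x },
    sub { my $x = undef },
);
$walker->();

print $optree;
```

On a perl without this change, the first sub should show an undef op with a padsv child and the second an sassign; on a patched perl both are expected to collapse to a single undef op carrying the pad target.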

The undef $x case gets a slight performance boost, as shown in this toy example:

    my $x; for (0..10_000_000) { undef $x; undef $x; undef $x; undef $x; undef $x; undef $x; undef $x; undef $x; undef $x; undef $x }

Blead:

            714.52 msec task-clock                #    0.956 CPUs utilized
                 1      context-switches          #    0.001 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               198      page-faults               #    0.277 K/sec
     3,170,725,221      cycles                    #    4.438 GHz
         5,948,953      stalled-cycles-frontend   #    0.19% frontend cycles idle
           231,911      stalled-cycles-backend    #    0.01% backend cycles idle
    10,843,683,798      instructions              #    3.42  insn per cycle
                                                  #    0.00  stalled cycles per insn
     2,260,727,959      branches                  # 3163.998 M/sec

Patched:

            602.19 msec task-clock                #    0.974 CPUs utilized
                 1      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               199      page-faults               #    0.330 K/sec
     2,724,503,940      cycles                    #    4.524 GHz
           313,679      stalled-cycles-frontend   #    0.01% frontend cycles idle
           191,871      stalled-cycles-backend    #    0.01% backend cycles idle
     8,943,649,796      instructions              #    3.28  insn per cycle
                                                  #    0.00  stalled cycles per insn
     1,760,721,729      branches                  # 2923.875 M/sec

The $x = undef case, in which more optimization is achieved, performs much better, as shown in this toy example:

    my $x; for (0..10_000_000) { $x = undef; $x = undef; $x = undef; $x = undef; $x = undef; $x = undef; $x = undef; $x = undef; $x = undef; $x = undef }

Blead:

          1,224.48 msec task-clock                #    0.990 CPUs utilized
                 2      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               205      page-faults               #    0.167 K/sec
     5,663,539,065      cycles                    #    4.625 GHz
         4,336,701      stalled-cycles-frontend   #    0.08% frontend cycles idle
             4,385      stalled-cycles-backend    #    0.00% backend cycles idle
    19,644,168,889      instructions              #    3.47  insn per cycle
                                                  #    0.00  stalled cycles per insn
     3,760,824,142      branches                  # 3071.356 M/sec

Patched:

            628.51 msec task-clock                #    0.974 CPUs utilized
                 1      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
               200      page-faults               #    0.318 K/sec
     2,716,312,069      cycles                    #    4.322 GHz
           432,725      stalled-cycles-frontend   #    0.02% frontend cycles idle
           224,782      stalled-cycles-backend    #    0.01% backend cycles idle
     8,943,691,044      instructions              #    3.29  insn per cycle
                                                  #    0.00  stalled cycles per insn
     1,760,729,375      branches                  # 2801.443 M/sec
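
To put the two perf stat comparisons side by side, here is a small sketch that computes how much lower the patched figures are than blead. The numbers are copied from the output above; the hash layout and the reduction_pct helper are illustrative names, not anything from the patch.

```perl
use strict;
use warnings;

# Figures copied from the perf stat output above: [blead, patched].
my %runs = (
    'undef $x' => {
        'task-clock (msec)' => [ 714.52, 602.19 ],
        'instructions'      => [ 10_843_683_798, 8_943_649_796 ],
    },
    '$x = undef' => {
        'task-clock (msec)' => [ 1224.48, 628.51 ],
        'instructions'      => [ 19_644_168_889, 8_943_691_044 ],
    },
);

# Percentage by which the patched value is lower than the blead value.
sub reduction_pct {
    my ($blead, $patched) = @_;
    return 100 * ($blead - $patched) / $blead;
}

for my $case (sort keys %runs) {
    for my $metric (sort keys %{ $runs{$case} }) {
        printf "%-12s %-18s %.1f%% lower\n",
            $case, $metric, reduction_pct(@{ $runs{$case}{$metric} });
    }
}
```

For these single runs that works out to about 15.7% less task-clock time and 17.5% fewer instructions for undef $x, against about 48.7% and 54.5% for $x = undef. Both patched loops end up at roughly 8.94 billion instructions, consistent with the two forms compiling to the same op.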

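Also some bench.pl comparisons (the undef column is the patched build; figures are relative to blead at 100.00, and higher is better):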

expr::sassign::undef_lex
$x = undef

        blead  undef
       ------ ------
    Ir 100.00 318.00
    Dr 100.00 317.65
    Dw 100.00 412.50
  COND 100.00 300.00
   IND 100.00 150.00

COND_m 100.00 100.00
 IND_m 100.00 200.00

 Ir_m1 100.00 100.00
 Dr_m1 100.00 100.00
 Dw_m1 100.00 100.00

 Ir_mm 100.00 100.00
 Dr_mm 100.00 100.00
 Dw_mm 100.00 100.00

expr::sassign::undef_lex_direc
undef $x

        blead  undef
       ------ ------
    Ir 100.00 142.00
    Dr 100.00 158.82
    Dw 100.00 162.50
  COND 100.00 142.86
   IND 100.00 150.00

COND_m 100.00 100.00
 IND_m 100.00 150.00

 Ir_m1 100.00 100.00
 Dr_m1 100.00 100.00
 Dw_m1 100.00 100.00

 Ir_mm 100.00 100.00
 Dr_mm 100.00 100.00
 Dw_mm 100.00 100.00

expr::sassign::undef_my_lex
my $x = undef

        blead  undef
       ------ ------
    Ir 100.00 164.50
    Dr 100.00 180.85
    Dw 100.00 196.15
  COND 100.00 173.68
   IND 100.00 133.33

COND_m 100.00 100.00
 IND_m 100.00 200.00

 Ir_m1 100.00 100.00
 Dr_m1 100.00 100.00
 Dw_m1 100.00 100.00

 Ir_mm 100.00 100.00
 Dr_mm 100.00 100.00
 Dw_mm 100.00 100.00

expr::sassign::undef_my_lex_direc
undef my $x

        blead  undef
       ------ ------
    Ir 100.00 112.43
    Dr 100.00 123.40
    Dw 100.00 119.23
  COND 100.00 115.79
   IND 100.00 133.33

COND_m 100.00 100.00
 IND_m 100.00 150.00

 Ir_m1 100.00 100.00
 Dr_m1 100.00 100.00
 Dw_m1 100.00 100.00

 Ir_mm 100.00 100.00
 Dr_mm 100.00 100.00
 Dw_mm 100.00 100.00

richardleach, Aug 10 '22 16:08