
Perl_newSLICEOP: Optimise '(caller)[0]' into 'scalar caller'

Open richardleach opened this issue 6 months ago • 6 comments

A subroutine can obtain just the package of its caller in a couple of ways. Both seem somewhat common.

  • caller - in scalar context, as in my $x = caller;
  • (caller)[0], as in my $x = (caller)[0];

In the first, caller finds the package name, sticks it in a new SV, and puts that (or undef) on the stack:

<0> caller[t2] s

In the second, caller (a) finds the package name, filename, and line (b) creates three new SVs to hold them all (c) puts those SVs on the stack (d) does a list slice to leave just the package SV on the stack.

7        <2> lslice sK/2 ->8
-           <1> ex-list lK ->5
3              <0> pushmark s ->4
4              <$> const[IV 0] s ->5
-           <1> ex-list lK ->7
5              <0> pushmark s ->6
6              <0> caller[t2] l ->7

This commit checks for the second case inside Perl_newSLICEOP and, instead of constructing an lslice OP, returns just the caller OP with scalar context applied.
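As a rough illustration of that check, here is a minimal, self-contained toy model in C. It does not use perl's real OP structures or API: `toy_op`, `toy_new_slice` and the `TOY_*` names are all hypothetical, and the real Perl_newSLICEOP may test more conditions than this sketch does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy op kinds and contexts; these are NOT perl's real opcodes or flags. */
enum toy_type { TOY_CONST, TOY_CALLER, TOY_LSLICE };
enum toy_ctx  { TOY_SCALAR, TOY_LIST };

struct toy_op {
    enum toy_type  type;
    enum toy_ctx   ctx;
    long           iv;      /* constant value (TOY_CONST only) */
    struct toy_op *next;    /* next index op in the subscript, or NULL */
    struct toy_op *index;   /* TOY_LSLICE: first index op */
    struct toy_op *list;    /* TOY_LSLICE: op that produces the list */
};

/*
 * Hypothetical stand-in for the check described above: given the
 * subscript ops and the list op of a would-be slice, return either a
 * slice op (written into *storage) or a cheaper equivalent.
 */
struct toy_op *
toy_new_slice(struct toy_op *storage, struct toy_op *index, struct toy_op *list)
{
    assert(storage != NULL && index != NULL && list != NULL);

    /* (caller)[0]: exactly one subscript, the constant 0, over a caller op. */
    if (index->type == TOY_CONST && index->iv == 0 && index->next == NULL
        && list->type == TOY_CALLER) {
        list->ctx = TOY_SCALAR;   /* scalar caller yields just the package */
        return list;              /* no slice op is built */
    }

    /* Anything else keeps the generic list slice. */
    storage->type  = TOY_LSLICE;
    storage->ctx   = TOY_SCALAR;
    storage->iv    = 0;
    storage->next  = NULL;
    storage->index = index;
    storage->list  = list;
    return storage;
}
```

In this model the `(caller)[0]` shape returns the caller op itself, so the pushmark, const and lslice ops from the dump above never reach the tree; any other subscript falls through to the generic slice.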


  • This set of changes does not require a perldelta entry.

richardleach, Jun 12 '25 21:06

https://grep.metacpan.org/search?size=20&_bb=86025425&q=%5C%28caller%5C%29%5C%5B&qft=.pm%2C+.t&qd=&qifl=

This should have been done 30 years ago. Most PP devs think (caller())[\w] is constant folded, as if Perl were identical to C++. The truth is that list-context PP caller() is like writing a 100MB core dump to an SSD every time you execute it. caller() should never have become a Perl best practice or been cargo culted. There is nothing wrong with caller's public PP API IMO, but because of its horrible internal runtime implementation, it should have been sent to the landfill by use strict; on day 1 of use strict;, just like this Perl 5 code was sent to the landfill by use strict;

C:\Users\Owner>perl -e" push( @a, THISS); push( @a, ISS); push( @a, PERL); $, =' '; print @a;"
THISS ISS PERL
C:\Users\Owner>

Rich, would you pretty please be able to tackle adding OP tree compile time next gen G_VOID propagation to the other 12-15 list context retval indexes/retval SV*s created by pp_caller?

bulk88, Jun 12 '25 23:06

Hmmm, if I can co-opt op_private then it's likely possible to cover individual elements 1,2,3 and in-order slices like [1,2], which seem to make up the majority of usage on CPAN. I'll have a go.

richardleach, Jun 13 '25 21:06

> Hmmm, if I can co-opt op_private then it's likely possible to cover individual elements 1,2,3 and in-order slices like [1,2], which seem to make up the majority of usage on CPAN. I'll have a go.

pp_caller() has 3 return prototypes,

  • 1 SV*
  • 3 SVs
  • 10 SVs

All that is needed is room to put a U16 variable in OP_CALLER's OP struct, or 10 unused bits somewhere. 10 bits will allow a fast and easy pattern of

if(op->opaque & 0x4) {
    PUSH(sumsv);
}
if(op->opaque & 0x8) {
    PUSH(sumsv);
}
if(op->opaque & 0x10) {
    PUSH(sumsv);
}

inside pp_caller.
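To make the pattern concrete, here is a minimal sketch in plain C, outside of perl. It assumes one mask bit per return slot, with bit i selecting slot i; `toy_push_selected`, `TOY_SLOTS` and the use of plain ints in place of SVs are hypothetical, and `op->opaque` in the snippet above is likewise not an existing field.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 10   /* the "10 SVs" case from the comment above */

/*
 * Copy into out[] only the slots whose bit is set in mask, in slot
 * order, and return how many were copied. Bit i selects slots[i];
 * that bit assignment is an assumption of this sketch.
 */
size_t
toy_push_selected(unsigned mask, const int slots[TOY_SLOTS], int out[TOY_SLOTS])
{
    size_t n = 0;

    assert(slots != NULL && out != NULL);
    for (unsigned i = 0; i < TOY_SLOTS; i++) {
        if (mask & (1u << i))
            out[n++] = slots[i];
    }
    return n;
}
```

With the mask `0x4 | 0x8 | 0x10` from the snippet above, only slots 2, 3 and 4 are copied. A mask can only express which elements are wanted, always in ascending order, so a subscript like `[2,1]` is outside what this pattern can represent.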

My thought 5 years ago was to add a 2nd integer argument, allowing the end user to pick 1 of 10 elements to return, but I didn't like my proposal, since it would only help new code, and heavily policed and evangelized new code at that. And "heavily policed and evangelized new code" really means @bulk88 is making PRs to various CPAN modules and demanding those authors make a new CPAN release tar.gz for @bulk88's brand new perl 5 grammar enhancement authored by @bulk88.

My wishlist quickly changed: existing, deployed production perl code needs to be left untouched. The correct fix would come from the P5P side, from the 'yylex'/'ck_op_()' side, by analyzing the scalar- or list-context = operator and the target lvalue somehow, and eventually seeing whether the result was assigned to a $/@ lvalue or to an anonymous array, array ref, or whatever this is: my $x = (caller)[0];.

7        <2> lslice sK/2 ->8
-           <1> ex-list lK ->5
3              <0> pushmark s ->4
4              <$> const[IV 0] s ->5
-           <1> ex-list lK ->7

I don't know how the ex-list and lslice pp_foo() funcs work off the top of my head; I very rarely or never see them flash by while holding F11 or breakpointing each cycle of runops_std(). But back to my point: the wishlist is to extract that array-subscript const literal integer from the [] operator's OP and stick it into the caller() operator's OP.

bulk88, Jun 13 '25 22:06

> -10 SVs

I didn't do an exhaustive grep of CPAN on the train this morning, but it did seem like handling only the first 4 SVs would cover the vast majority of actually-encountered slice cases. Hence abuse of op_private seemed like it might be enough.

richardleach, Jun 14 '25 00:06

> > -10 SVs
>
> I didn't do an exhaustive grep of CPAN on the train this morning, but it did seem like handling only the first 4 SVs would cover the vast majority of actually-encountered slice cases. Hence abuse of op_private seemed like it might be enough.

Values 0-9, aka 10 SVs, fit in 4 bits. Picking 1, 2, 3 or more non-linear SVs out of 10 requires 10 bits of space.

IDK how to read all the lines below, but my eyes see worst case 3 bits free, at minimum case a U8 available. Getting creative: steal some bits from U8 op_flags; after that, add another type code inside PERL_BITFIELD16 op_type:9;; after that, steal PADOFFSET op_targ;. I'm not sure a caller in list context can have a TARG, since isn't TARG only for caller() in scalar context, where TARG is the lvalue on the left side of caller()? I'm guessing, without checking the source code to verify, that caller() always pulls its one and only incoming arg, an IV/U32 of how many PP frames to go backwards, from the PL stack and not from the caller() OP's TARG. Correct me if I am wrong.

caller		caller			ck_fun		t%	S?
# baseop/unop - %
#define BASEOP				\
    OP*		op_next;		\
    OP*		op_sibparent;		\
    OP*		(*op_ppaddr)(pTHX);	\
    PADOFFSET	op_targ;		\
    PERL_BITFIELD16 op_type:9;		\
    PERL_BITFIELD16 op_opt:1;		\
    PERL_BITFIELD16 op_slabbed:1;	\
    PERL_BITFIELD16 op_savefree:1;	\
    PERL_BITFIELD16 op_static:1;	\
    PERL_BITFIELD16 op_folded:1;	\
    PERL_BITFIELD16 op_moresib:1;       \
    PERL_BITFIELD16 op_spare:1;		\
    U8		op_flags;		\
    U8		op_private;
#endif
    /* CALLER     */ (OPpARG4_MASK|OPpOFFBYONE),
#define OPpARG4_MASK            0x0f
#define OPpOFFBYONE             0x80
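The bit counting in this comment can be checked with a few lines of standalone C. This is only arithmetic on the two masks quoted above, assuming op_private is 8 bits wide as in the BASEOP excerpt; the `toy_*` helpers are hypothetical names, not perl functions.

```c
#include <assert.h>

/* Values copied from the excerpt above. */
#define OPpARG4_MASK 0x0f
#define OPpOFFBYONE  0x80

/* Count the set bits in the low 8 bits of v. */
unsigned
toy_popcount8(unsigned v)
{
    unsigned n = 0;

    for (v &= 0xffu; v != 0; v >>= 1)
        n += v & 1u;
    return n;
}

/* Bits of an 8-bit op_private not covered by the two CALLER masks. */
unsigned
toy_free_private_bits(void)
{
    return ~(unsigned)(OPpARG4_MASK | OPpOFFBYONE) & 0xffu;
}

/* Smallest number of bits able to hold every value in 0 .. count-1. */
unsigned
toy_bits_for_index(unsigned count)
{
    unsigned bits = 0;

    assert(count > 0);
    while ((1u << bits) < count)
        bits++;
    return bits;
}
```

Under these assumptions the unclaimed bits are 0x70, that is 3 bits, while a single index into 10 elements needs 4 bits and an arbitrary subset of 10 elements needs one bit per element, so 10. Neither fits in the leftover op_private bits as they stand.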

There is also the sneaky solution of pp_caller() at runloop time doing a sneaky deref into a const folded/disabled but not defragmented OP*, and learning the const literal integer from a different OP* struct than its own OP* struct. Certain other pp_*() funcs do this design pattern already.

bulk88, Jun 16 '25 08:06

> Values 0-9, aka 10 SVs, fit in 4 bits. Picking 1, 2, 3 or more non-linear SVs out of 10 requires 10 bits of space.

Yeah, but (caller)[1,2] seems to crop up quite a lot, whereas I couldn't spot a (caller)[8] for example. IIRC, these were the cases I mostly saw from a CPAN grep:

  • (caller)[0]
  • (caller)[1]
  • (caller \d?)[2]
  • (caller \d)[3]
  • (caller \d?)[1,2]
  • (caller)[0,1]
  • (caller)[0,2]

We could have a new unop_aux OP that supports arbitrary element emitting in arbitrary orders, but that feels excessive.
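One way to see why a handful of bits might be enough for the cases listed: seven in-order shapes plus a "not optimised" value make eight codes, which fit in 3 bits. The sketch below is only an illustration of that counting argument, not the approach taken in any commit; `toy_slice`, `toy_known` and `toy_encode_slice` are hypothetical names, and the code numbering is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* One recognised subscript shape: n indices (at most 2), in order. */
struct toy_slice {
    size_t n;
    int    idx[2];
};

/* Code 0 means "not recognised"; codes 1-7 are the shapes listed above. */
static const struct toy_slice toy_known[] = {
    { 0, { 0, 0 } },   /* 0: fall back to the generic list slice */
    { 1, { 0, 0 } },   /* 1: [0]   */
    { 1, { 1, 0 } },   /* 2: [1]   */
    { 1, { 2, 0 } },   /* 3: [2]   */
    { 1, { 3, 0 } },   /* 4: [3]   */
    { 2, { 1, 2 } },   /* 5: [1,2] */
    { 2, { 0, 1 } },   /* 6: [0,1] */
    { 2, { 0, 2 } },   /* 7: [0,2] */
};

/* Return the 3-bit code for a subscript list, or 0 if it is not in the table. */
unsigned
toy_encode_slice(const int *idx, size_t n)
{
    assert(idx != NULL || n == 0);
    for (unsigned code = 1; code < sizeof toy_known / sizeof toy_known[0]; code++) {
        const struct toy_slice *k = &toy_known[code];
        size_t i = 0;

        if (k->n != n)
            continue;
        while (i < n && k->idx[i] == idx[i])
            i++;
        if (i == n)
            return code;
    }
    return 0;
}
```

Anything outside the table, such as `[8]` or an out-of-order `[2,1]`, maps to code 0 and would keep the existing lslice path. The trade-off against a bitmask is that every new shape needs a table entry, and the table cannot grow past eight codes without more bits.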

richardleach, Jun 16 '25 09:06