perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

Fix entersub evaluation order

Open leonerd opened this issue 2 years ago • 10 comments

Evaluate the LHS expression giving the code ref, before evaluating the RHS expression giving the args.

This change does introduce a new meaning of the OPf_SPECIAL flag to the OP_ENTERSUB op, which might confuse thirdparty optree walking modules.

An alternative idea which could be explored instead, is to evaluate the LHS into a pad temporary first; making such syntax equivalent to a rewrite:

EXPR->(ARGS)
do { my $tmp = EXPR; $tmp->(ARGS) }

This would make the optree and the pad a little bit larger though.

Fixes #19997

leonerd avatar Aug 01 '22 16:08 leonerd

For XS accessor generation modules this'd break the following use case (extensively present in DBIx::Class)

my $meth = $obj->can('method');
$meth->($arg1, $arg2);

With the current code it's possible to peek at the top of the stack to determine whether or not it's 'ours' call site (so should the optree optimisation be kept or not). As I see for normal method calls, aka $obj->method(...), it's still the case, but for cases like this the object is now at the bottom of the stack with no easy access. Furthermore, it makes such calls slower for no reason by introducing stack copying.

And the case this tries to fix has never been reported in all those years, so compared to DBIx::Class it's a bit less used.

dur-randir avatar Aug 03 '22 12:08 dur-randir

Hi @leonerd! Maybe I'm doing something wrong, but after installing your branch I am still seeing the same behavior:

tmp/perl5  [entersub-evaluation-order 
▶ ./perl -v
This is perl 5, version 37, subversion 3 (v5.37.3 (v5.34.0-RC1-2556-g09b4aff6b9)) built for darwin-2level

▶ ./perl -Ilib -E 'my $x; $x->(die q(oops!))'
oops! at -e line 1.

▶ ./perl -Ilib -E 'does_not_exist(die q(oops!))'
oops! at -e line 1.

garu avatar Aug 06 '22 05:08 garu

On Sat, 6 Aug 2022 at 07:00, Breno G. de Oliveira @.***> wrote:

Hi @leonerd https://github.com/leonerd! Maybe I'm doing something wrong, but after installing your branch I am still seeing the same behavior:

tmp/perl5 [entersub-evaluation-order

▶ ./perl -v

This is perl 5, version 37, subversion 3 (v5.37.3 (v5.34.0-RC1-2556-g09b4aff6b9)) built for darwin-2level

That doesnt look right to me. Why does it say two versions there?

Yves

demerphq avatar Aug 06 '22 09:08 demerphq

That doesnt look right to me. Why does it say two versions there?

I don't know. All I did was:

$ git clone https://github.com/leonerd/perl5.git
$ cd perl5
$ git checkout entersub-evaluation-order
$ sh Configure -de -Dusedevel
$ make
$ make test
$ ./perl -v
This is perl 5, version 37, subversion 3 (v5.37.3 (v5.34.0-RC1-2556-g09b4aff6b9)) built for darwin-2level
(...)
$ ./perl -Ilib -E 'my $x; $x->(die 42)'
42 at -e line 1.

What did I miss?

garu avatar Aug 08 '22 17:08 garu

v5.34.0-RC1-2556-g09b4aff6b9 is the git describe output for the commit being built. The name it uses is based on the closest tag available. @leonerd's fork doesn't have all of the tags. The latest it has is v5.34.0-RC1.

haarg avatar Aug 08 '22 17:08 haarg

$ ./perl -Ilib -E 'my $x; $x->(die 42)'
42 at -e line 1.

What did I miss?

I wouldn't expect that to behave any differently in my branch. It evaluates the CV (gets undef) then evaluates the args (which throws). It hasn't got as far as checking for definedness of the CV, because the evaluation already died.

My branch does affect the behaviour of code like this:

(do { die "abc" })->(die "def")

On my branch it fails "abc"; on current bleadperl it fails "def".

leonerd avatar Aug 08 '22 17:08 leonerd

On Mon, 8 Aug 2022 at 19:54, Paul Evans @.***> wrote:

$ ./perl -Ilib -E 'my $x; $x->(die 42)' 42 at -e line 1.

What did I miss?

I wouldn't expect that to behave any differently in my branch. It evaluates the CV (gets undef) then evaluates the args (which throws). It hasn't got as far as checking for definedness of the CV, because the evaluation already died.

Wasn't part of the point to ensure there is something to pass the arguments to before we evaluate the arguments?

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq avatar Aug 08 '22 19:08 demerphq

Wasn't part of the point to ensure there is something to pass the arguments to before we evaluate the arguments?

Partly. That could be a useful next step. The first step was just to get them evaluated in the right order so that kind of thing could be checked. This was also a first step towards having the optional chaining operator work properly here; ensuring it doesn't needlessly evaluate arguments if the CV invocant is undef.

leonerd avatar Aug 09 '22 14:08 leonerd

For XS accessor generation modules this'd break the following use case (extensively present in DBIx::Class)

@dur-randir does DBIx::Class (or relevant dependencies) provide a test that fails when building it with this branch? It could be used to make sure any changes only get accepted if they don't break it. Apologies in advance if I misunderstood something!

garu avatar Aug 09 '22 14:08 garu

Rebased on current blead. Should be ready for review now.

leonerd avatar Sep 16 '22 18:09 leonerd

Ping all - anyone want to review this?

leonerd avatar Sep 27 '22 15:09 leonerd

For XS accessor generation modules this'd break the following use case (extensively present in DBIx::Class) ...

New devel perl will do things that CPAN modules that poke at optrees will need to be aware of. This is always true

If CPAN modules don't get updated to cope then they won't like 5.38's optrees. But that is already true given a bunch of other changes for undef/emptyhv/padsv_store, etc... This was true in 5.36 because of defer and finally; 5.34 because of try/catch, 5.32 because of isa, ...

With the current code it's possible to peek at the top of the stack to determine whether or not it's 'ours' call site (so should the optree optimisation be kept or not). As I see for normal method calls, aka $obj->method(...), it's still the case, but for cases like this the object is now at the bottom of the stack with no easy access. Furthermore, it makes such calls slower for no reason by introducing stack copying.

It's still easy to find. You just have to consult the markstack as well in these cases - on 5.37.x if the OPf_SPECIAL is set. I'd be happy to suggest relevant changes to any CPAN module which needs updating to cope with this.

leonerd avatar Sep 27 '22 16:09 leonerd

Could you just update the top item on the mark stack (set to svp+1), instead of doing this Move? (I haven't thought this through, just musing.)

That was my first thought. I did that and it worked for the enter'ed sub itself, but then it breaks later on stack cleanup because the pushed CV itself is still on the stack. It isn't noticable in void-returning cases but it gets in the way in scalar- or list-returning ones. That situation could probably be fixed by adjusting the code in stack unwind but then it touches a bunch more places and becomes a lot more subtle in cleanup and CPAN module breakage. I found the Move() solution much simpler.

leonerd avatar Sep 27 '22 16:09 leonerd

That was my first thought. I did that and it worked for the enter'ed sub itself, but then it breaks later on stack cleanup because the pushed CV itself is still on the stack. It isn't noticable in void-returning cases but it gets in the way in scalar- or list-returning ones.

Makes sense. I'm not going to suggest adding a second PUSHMARK, though I guess it could be an option.

richardleach avatar Sep 27 '22 16:09 richardleach

On Tue, Sep 27, 2022 at 03:51:37PM +0000, Paul Evans wrote:

Ping all - anyone want to review this?

I haven't really been paying attention, but...

  1. I'm vaguely nervous that this is touching heavily in areas related to my current and next projects: making the stack reference counted, and doing more signature work (especially eliminating @_), both of which get fairly complex around pp_entersub(). So if possible I'd prefer this work was deferred.

Also, it makes me very twitchy the idea of messing with long-standing behaviour which isn't a bug, and I dislike the idea of adding any more conditionals and other work to the hot pp_entersub() code paths unless really necessary.

-- You never really learn to swear until you learn to drive.

iabyn avatar Sep 29 '22 12:09 iabyn

  1. I'm vaguely nervous that this is touching heavily in areas related to my current and next projects: making the stack reference counted, and doing more signature work (especially eliminating @_), both of which get fairly complex around pp_entersub(). So if possible I'd prefer this work was deferred.

It is a bit of a heavy hit; additionally @tonycoz's findings above on the performance impact of it suggests that this is not a good way to implement it.

Also, it makes me very twitchy the idea of messing with long-standing behaviour which isn't a bug, and I dislike the idea of adding any more conditionals and other work to the hot pp_entersub() code paths unless really necessary.

While it currently seems like a really obscure bug, it becomes quite important when we consider RFC 0021 - Optional Chaning. The idea is to have a new ?-> operator that shortcircuits if the LHS is undefined.

Part of that shortcircuiting means not evaluating any of the righthand side if the lefthand is undefined. For structure accesses such as array and hash element lookups, these can already be easily implemented:

my $x = $aref?->[ gen_index() ];
my $y = $href?->{ gen_key() };

In these cases, the function inside the brackets does not get called if the reference on the LHS is undefined; the whole thing shortcircuits nicely.

However, in function call cases, the current interpreter implementation first evaluates the arguments to the function before it evaluates the function itself; thus meaning:

my $z = $subref?->( gen_args() );

would always invoke gen_args() even if $subref is not defined. It's an important part of the design of optional chaining that it should skip this, so this won't do.

Now, while we could define an entire new op for this case (OP_ENTERSUB_OPTIONAL?) that takes its arguments in the other order, it does then lead to the "quirk" that given the following two lines of very-similar looking code, they'd actually behave so differently:

my $i = A()->( B() );
my $j = A()?->( B() );

If we don't change the evaluation order of regular entersub, that would result in perl calling the functions B(); A(); A(); B(); which seems Surprising.

This is all why I feel it is important to make entersub run in a "less surprising" manner.

leonerd avatar Sep 30 '22 15:09 leonerd

Plus I should add that all of the comments above about entersub on coderefs also apply to method calls:

my $z = $obj?->meth( gen_args() );

my $i = objA()->meth( B() );
my $j = objA()?->meth( B() );

leonerd avatar Sep 30 '22 15:09 leonerd

Is it worth trying out the alternative suggestion (reproduced below) from the first post, just to see how that performs?

An alternative idea which could be explored instead, is to evaluate the LHS into a pad temporary first; making such syntax equivalent to a rewrite:

EXPR->(ARGS)
do { my $tmp = EXPR; $tmp->(ARGS) }

richardleach avatar Sep 30 '22 15:09 richardleach

While it currently seems like a really obscure bug, it becomes quite important when we consider RFC 0021 - Optional Chaning. The idea is to have a new ?-> operator that shortcircuits if the LHS is undefined.

This could be handled by only doing the swapping for optional chaining, so only optional chaining pays the cost of the the stack rearrangement (or uses @richardleach's idea with the temp

It would be nice if we could just change the order of arguments to entersub, but I think that would break half of the XS on CPAN.

Now, while we could define an entire new op for this case (OP_ENTERSUB_OPTIONAL?) that takes its arguments in the other order, it does then lead to the "quirk" that given the following two lines of very-similar looking code, they'd actually behave so differently:

my $i = A()->( B() );
my $j = A()?->( B() );

If we don't change the evaluation order of regular entersub, that would result in perl calling the functions B(); A(); A(); B(); which seems Surprising.

I do wonder if changing that evaluation order for the existing operator will result in incompatible changes in behaviour, whether noisy (caught by a test), or silently breaking something.

I can see why consistent and left-to-right* order is desirable, but I'm not sure we can get it without breaking something.

  • perlop claims "In fact Perl has a general rule that the operands of an operator are evaluated in left-to-right order"

tonycoz avatar Oct 06 '22 04:10 tonycoz

This is in conflict and has not been touched for a while. @leonerd what are your plans here?

demerphq avatar Feb 08 '23 09:02 demerphq

This is in conflict and has not been touched for a while. @leonerd what are your plans here?

@leonerd, ^^

jkeenan avatar Jul 31 '23 16:07 jkeenan

In short: I have no idea.

In slightly longer: The underlying problem still exists, and would still get in the way of attempting to create an optional-chaining function or method call operator. I don't know whether the approach taken here would be the best solution or if it's better to attack the problem from some other direction.

leonerd avatar Sep 18 '23 21:09 leonerd

In short: I have no idea.

In slightly longer: The underlying problem still exists, and would still get in the way of attempting to create an optional-chaining function or method call operator. I don't know whether the approach taken here would be the best solution or if it's better to attack the problem from some other direction.

My take on this pull request is that it's not going to go forward. I recommend that we close it, then substitute either a discussion of the issues on the P5P list or the opening of a GH issue with a succinct discussion of the problem informed by the comments in this ticket.

@leonerd

jkeenan avatar Nov 22 '23 22:11 jkeenan