truffleruby Weird error instead of LocalJumpError

The following script (when run from a file) behaves differently in TruffleRuby than in both MRI and JRuby:

def bar
  proc { return 42 }
end

begin
  puts bar.call.to_s
rescue LocalJumpError
  puts 'HI'
end

On MRI & JRuby, this will output HI. On TruffleRuby, I get a weird error that doesn't seem to be a regular exception.

truffleruby: org.graalvm.polyglot.PolyglotException: org.truffleruby.language.control.ReturnException
Original Internal Error: 
org.truffleruby.language.control.ReturnException

truffleruby 1.0.0-rc9, like ruby 2.4.4, GraalVM CE Native [x86_64-linux] On Ubuntu 16.04, TruffleRuby installed through RVM

Nov 29 '18 20:11 MaxLap

@eregon can you please reassign to @LillianZ unless you're planning to fix immediately.

Aug 04 '20 20:08 chrisseaton

Discussion from Slack:

The current FrameOnStackMarker is tailored for break, so for (1) foo { break } (2) it's created at (1), and marked as no longer on stack at (2) by FrameOnStackNode (whether foo called the block or not). It doesn't actually track whether the block frame is on stack, it tracks whether the block is called during that call from the foo call site (since break exits that call site).

[1] pry(main)> def m; yield_self { break 42 }; end; m
=> 42
[2] pry(main)> def m; Proc.new { break 42 }.call; end; m
LocalJumpError: break from proc-closure
from (pry):2:in `block in m'
[3] pry(main)> def m; yield_self { return 42 }; end; m
=> 42
[4] pry(main)> def m; Proc.new { return 42 }.call; end; m
=> 42
[5] pry(main)> def m; b = Proc.new { return 42 }; b.call; end; m
=> 42

Doing the same logic as for break would be too restrictive: return can be used in cases where break cannot, as shown in the pry session above.

We don't have a FrameOnStackMarker for methods currently, we'd need to add that (ideally only when there is a non-local return inside). So we need to keep track in the translator of the "block depth until the first enclosing method or lambda".
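
To make the idea concrete, here is a toy model in plain Ruby of a per-method marker that a non-local return checks before jumping. It is only a sketch under stated assumptions: MethodMarker, run_method and non_local_return are hypothetical names made up for illustration, catch/throw stands in for the real control-flow exception, and none of this is TruffleRuby's actual implementation.

```ruby
# Hypothetical model only, not TruffleRuby code.
# A marker records whether the "method" that owns it is still running.
class MethodMarker
  def initialize
    @on_stack = true
  end

  def on_stack?
    @on_stack
  end

  def leave
    @on_stack = false
  end
end

# Stand-in for a method activation: creates the marker on entry and
# marks it as no longer on stack on exit, however the body exits.
def run_method
  marker = MethodMarker.new
  catch(marker) { yield marker }
ensure
  marker.leave
end

# Stand-in for `return` inside a proc: jump only if the target is still live.
def non_local_return(marker, value)
  raise LocalJumpError, "unexpected return" unless marker.on_stack?
  throw marker, value
end

# The "proc" runs while its method is still on the stack: the jump succeeds.
def demo_live
  run_method do |marker|
    non_local_return(marker, 42)
    :after # never reached
  end
end

# The "proc" escapes and runs after its method has exited: LocalJumpError.
def demo_escaped
  escaped = run_method { |marker| -> { non_local_return(marker, 42) } }
  escaped.call
rescue LocalJumpError
  :local_jump_error
end

p demo_live    # => 42
p demo_escaped # => :local_jump_error
```

In this model the check costs one field read per non-local return, which is why creating the marker only for methods that contain such a return seems attractive.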

The fact this also applies to lambdas makes everything more complicated (notably lambda{} will already have a FrameOnStackMarker for break even though I think it would never be used once it's sure it's a lambda and not a proc), so I'd suggest getting it right for methods first and then generalize to handle lambdas too (both -> {} and lambda {}).

It might be interesting to look at how CRuby handles this.

Aug 05 '20 09:08 eregon

While playing with this I noticed:

ruby -e 'b=-> { Proc.new { return 42 }; }.call; p b; p b.call'
#<Proc:0x0000563755a91c48 -e:1>
So it returns from the file?

ruby -e 'def m; -> { Proc.new { return 42 }; }.call; end; p b=m; p b.call'
#<Proc:0x00005572d5d857c8 -e:1>
Traceback (most recent call last):
	1: from -e:1:in `<main>'
-e:1:in `block (2 levels) in m': unexpected return (LocalJumpError)

And apparently return inside a block inside a lambda still exits the method, not the lambda, which simplifies the issue:

ruby -e 'def m; -> { Proc.new { return 42 } }.call.call; :after; end; p m'          
42

ruby -e 'def m; p -> { return 3 }.call; :after; end; p m'         
3
:after

So return directly inside a lambda returns from the lambda, but return nested in a block returns from the surrounding method?

Aug 05 '20 09:08 eregon

Super interesting edge-case finding, I love those :) I think that last line is not accurate, or at least misleading?

ruby -e 'def m; -> { Proc.new { return 42 }.call }.call; :after; end; p m'
:after

If the nested block is called from inside the lambda, it returns only from the lambda. Your example calls the block once it's out of the lambda. I'm quite surprised, as I would have thought there would be a LocalJumpError.

Aug 05 '20 11:08 MaxLap

Oh wow, this is madness. It seems a return inside a Proc inside a -> {} lambda returns either to the lambda or the method, depending on whether the lambda is still on the stack? A slightly clearer variant of my version above:

def m
  b = -> { Proc.new { return 42 } }.call
  p b
  b.call # returns from the method
  p :after # never reached!
end

p m
# =>
#<Proc:0x00005556d9f39000@-:2>
42

And your version:

def m
  p -> {
    Proc.new { return 42 }.call # returns from the lambda
    :after_in_lambda
  }.call # => 42
  :after_in_method
end

p m # => :after_in_method

Which makes me think we should fix the "sane" case first, that is a return inside a block inside a method (no lambda involved).

Aug 05 '20 13:08 eregon

Which means we can construct an example where a single return in the source code might return to 2 different lexical places (which seems wrong to me, AFAIK no other control flow language construct violates that rule, they always jump to a single place):

def m(call_proc)
  r = -> {
    # This single return in the source might exit the lambda or the method!
    proc = Proc.new { return :return }

    if call_proc
      proc.call
      :after_in_lambda
    else
      proc
    end
  }.call # returns here if call_proc

  if call_proc
    [:after_in_method, r]
  else
    r.call
    :never_reached
  end
end


p m(true)  # => [:after_in_method, :return]
p m(false) # => :return

Aug 05 '20 13:08 eregon

Haha, here is another one, no lambda involved! (at least not directly ;)). The proc is made in the same "stack", but called at different places.

class A
  def self.meta1
    define_method :m do
      proc { return 42 }
    end
    A.new.m.call
    :after
  end
  p meta1 #=> 42

  def self.meta2
    define_method :m do
      proc { return 43 }.call
    end
    A.new.m
    :after
  end
  p meta2 #=> :after
end

The logic I see overall is that all of the scopes accessible from the place where the proc is created are allowed to return with it (I guess it stops at the first scope it encounters that can return with it?).

Here is another fun one!

def m
  zz = Proc.new { return 45 }
  b = lambda { zz.call }
  b.call
  :after
end
p m #=> 45

The call to the proc literally ignores the lambda which is on the stack and basically "goto" out of it.

That's how I would summarize all of this:

When a proc is created, all of the scopes lexically accessible from that place which are methods or lambdas are acceptable targets to return from.
The only difference between being in a lambda and being in a method is that the lambda still has access to the scope of the method, so the proc created can return from both the lambda and the method. (There can be many nested lambdas, but always only one method.) (This is the same issue as with define_method, which keeps access to the parent's scope.)
The return can skip stack frames which the proc doesn't have access to, in order to jump to one that is further away but is accessible from the proc. (That is expected from just basic usage of blocks, when a block is forwarded deeper.)
LocalJumpErrors only happen if there is nowhere in the call stack that the proc still has access to.
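
The last two points can be checked with a small script. This is a minimal sketch: the method names (make_proc, orphan_result, call_it, outer) are made up for illustration, and the results in the comments are the CRuby behavior described in this thread.

```ruby
# No target left: make_proc has already returned when the proc is called.
def make_proc
  proc { return 42 }
end

def orphan_result
  make_proc.call
rescue LocalJumpError => e
  e.class
end

# Skipping frames: the proc is called from call_it, which it has no
# lexical access to, and the return jumps straight out of outer.
def call_it(block)
  block.call
  :unreached_in_call_it
end

def outer
  pr = proc { return :from_outer }
  call_it(pr)
  :unreached_in_outer
end

p orphan_result # => LocalJumpError
p outer         # => :from_outer
```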

Good luck!

Aug 05 '20 18:08 MaxLap

I filed https://bugs.ruby-lang.org/issues/17105 in an attempt to clarify the behavior. While it is fun to play with edge cases, in practice these very complicated semantics harm understanding of what return does, and might have significant implications on performance.

Aug 06 '20 10:08 eregon

I think this is by design in MRI:

https://www.rubyguides.com/2016/02/ruby-procs-and-lambdas/

A lambda will return normally, like a regular method. But a proc will try to return from the current context.

Procs return from the current method, while lambdas return from the lambda itself.
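
For reference, the basic rule quoted above looks like this in practice. A minimal sketch; via_lambda and via_proc are names made up for illustration.

```ruby
# return inside a lambda only exits the lambda.
def via_lambda
  l = -> { return :from_lambda }
  l.call
  :after_lambda
end

# return inside a proc exits the method that created the proc.
def via_proc
  pr = proc { return :from_proc }
  pr.call
  :after_proc # never reached
end

p via_lambda # => :after_lambda
p via_proc   # => :from_proc
```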

Aug 06 '20 14:08 Hanmac

I think we're all agreed on those basics of what return from a proc and lambda normally does - I think we're beyond that and talking about some edge-cases here which are more complex than that explanation.

Aug 06 '20 14:08 chrisseaton
