truffleruby icon indicating copy to clipboard operation
truffleruby copied to clipboard

Truffleruby loses $! after `rb_exc_raise`

Open larskanis opened this issue 3 years ago • 7 comments

While developing ruby-pg I noticed a bug in Truffleruby. I replaced the pg gem by openssl, to use a stdlib:

require "openssl"

begin
  raise StandardError, "aaa"
rescue Exception => err
  begin
    # this works since it generates an ArgumentError before calling into C-ext:
    # OpenSSL::Random.random_bytes
    # but this loses the exception context, since it raises ArgumentError from a C-ext:
    OpenSSL::Random.random_bytes(-1)
  rescue ArgumentError
  end

  # should raise StandardError "aaa" but raises RuntimeError ""
  raise
  # this works and raises the StandardError "aaa"
  # raise err
end

It looks to me like a call to rb_exc_raise or sibling in a C-extension clear the exception context, so that the second raise in the sample above acts like an ordinary raise outside of the rescue branch.

Context:

$ ruby -v
truffleruby 22.3.1, like ruby 3.0.3, GraalVM CE Native [x86_64-linux]

larskanis avatar Feb 24 '23 16:02 larskanis

Thank you for the report. I had a quick look and I observed $! isn't set just before the raise no-args. And interestingly it does work fine when using raise ArgumentError, "ruby raise" instead of OpenSSL::Random.random_bytes(-1).

I'll debug it further, I thought it's simply rb_define_method missing an ensure but that doesn't seem to fix it.

eregon avatar Mar 03 '23 18:03 eregon

This seems caused by this change: https://github.com/oracle/truffleruby/commit/19f0e6af33651e177691a5ecb5a7a8b1dc3e8409 Reverting the changes in cext_ruby.rb make the example in the description work correctly.

So it seems the semantics $! around calls to C extensions are more complicated than what we have currently.

eregon avatar Mar 03 '23 19:03 eregon

I tried looking in CRuby if/how they clear $! but I couldn't find any special handling of $! for C functions (VM_METHOD_TYPE_CFUNC, vm_call0_body->vm_call0_cfunc_with_frame).

There is some handling of errinfo in rb_vrescue2(), maybe that's related. I wonder if begin/rescue/end is supposed to behave like save $! on begin and restore it at end. Right now we do the save/restore only if an exception happens and if it's handled by a rescue clause in TruffleRuby, and so the save is done just before executing the rescue clause, not as soon as the begin. That might help (but would be some performance overhead).

I found quite a few place doing ->errinfo = Qnil or rb_set_errinfo(Qnil);, it seems usually done at the time of starting to throw an exception. TruffleRuby doesn't set $! on throwing.

The strange behavior is the behavior of this spec which passes on CRuby:

    def err
      $!
    end

    it "is cleared when entering a C method" do
      $!.should == nil
      begin
        raise StandardError
      rescue
        $!.class.should == StandardError
        err.class.should == StandardError
        @s.rb_errinfo().should == nil
      end
    end

I don't understand why rb_errinfo() is nil there and what sets it to nil, that's weird.

OTOH rb_raise()->rb_longjmp()->exc_setup_message()->errinfo_place() has some complex logic where it gets the errinfo from a Ruby frame on the stack instead of just reading ec->errinfo. Maybe that's part of the magic why the ArgumentError has the StandardError as cause in the first example. But it still doesn't explain that spec.

eregon avatar Mar 03 '23 20:03 eregon

@aardvark179 Would you remember where/how CRuby sets rb_errinfo to nil on calls to C methods?

eregon avatar Mar 08 '23 13:03 eregon

I'm looking at this a bit again, @nirvdrum reported a similar issue. With this script with more debug prints:

require "openssl"

begin
  p in_first_begin: $!
  raise StandardError, "aaa"
rescue Exception
  p in_first_rescue: $!
  begin
    # this works since it generates an ArgumentError before calling into C-ext:
    # OpenSSL::Random.random_bytes
    # but this loses the exception context, since it raises ArgumentError from a C-ext:
    OpenSSL::Random.random_bytes(-1)
    p in_second_begin: $!
  rescue ArgumentError
    p in_second_rescue: $!
  end

  p after_second_begin_rescue_end: $!

  # should raise StandardError "aaa" but raises RuntimeError ""
  raise
  # this works and raises the StandardError "aaa"
  # raise err
end

We get:

ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
{:in_first_begin=>nil}
{:in_first_rescue=>#<StandardError: aaa>}
{:in_second_rescue=>#<ArgumentError: negative string size (or size too big)>}
{:after_second_begin_rescue_end=>#<StandardError: aaa>}
cext_exc.rb:5:in `<main>': aaa (StandardError)

truffleruby 24.0.1, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
{:in_first_begin=>nil}
{:in_first_rescue=>#<StandardError: aaa>}
{:in_second_rescue=>#<ArgumentError: negative string size (or size too big)>}
{:after_second_begin_rescue_end=>nil}
cext_exc.rb:21:in `<main>': 
unhandled exception

So $! is nil at after_second_begin_rescue_end, because when exiting the nested rescue we reset $! to the value it had before entering the rescue, which is nil because https://github.com/oracle/truffleruby/blob/1ee958f355db44a95bf961aa626b7f7120d39425/lib/truffle/truffle/cext_ruby.rb#L37 If we just revert https://github.com/oracle/truffleruby/commit/19f0e6af33651e177691a5ecb5a7a8b1dc3e8409 changes in cext_ruby.rb we fail that spec though (OTOH, the rest of :cext specs passes, and jt test mri --cext too, but I think @aardvark179 did this change to fix pg or some other C ext so we should check that).

eregon avatar Apr 25 '24 12:04 eregon

I filed a CRuby issue for this because it's not clear to me if clearing $! on calling a method defined in C is intended or a CRuby bug: https://bugs.ruby-lang.org/issues/20455

eregon avatar Apr 25 '24 17:04 eregon

After thinking about this more and what I understood from the CRuby semantics and discussion with Koichi, I think the best fix for now is to use a completely separate storage for rb_errinfo/rb_set_errinfo, i.e. to have a new field in RubyFiber for that. And so $! is completely independent of rb_errinfo, which seems the case in CRuby except when there is no rescue and no ensure on the stack (which seems rare overall and something no one should rely on). @andrykonchin Could you do that, and add a spec based on the description's example?

BTW, currently we have ThreadLocalGlobals in RubyThread for both $! and $?. $? is indeed per thread and not per Fiber:

$ ruby -ve 'Fiber.new { `true`; p $? }.resume; p $?'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
#<Process::Status: pid 89794 exit 0>
#<Process::Status: pid 89794 exit 0>

But $! should be per Fiber:

$ ruby -ve 'Fiber.new { begin; raise "hi"; rescue; p $!; Fiber.yield; p :a; end }.resume; p $!'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
#<RuntimeError: hi>
nil

So $? should become a direct field of RubyThread, and $? a direct field of RubyFiber. Neither should be added to getAdjacentObjects since they are only reachable from the current thread/fiber (same as before).

eregon avatar May 06 '24 18:05 eregon

I managed to reproduce a similar error, but using yaml:

require 'yaml'

def function
  raise StandardError, "custom message"
rescue StandardError
  puts $!.message
  data = "yaml: str"
  valid_result = YAML.unsafe_load(data)
  puts "Valid yaml" if valid_result == {"yaml" => "str"}

  raise
end

begin
  function
rescue Exception => e
  puts "#{e.class}: '#{e.message}'"
  puts e.backtrace
end
ruby -v
truffleruby 24.0.1, like ruby 3.2.2, Oracle GraalVM Native [aarch64-darwin]

andrii-hrushetskyi avatar Jul 04 '24 14:07 andrii-hrushetskyi