Multiple errors (function rb_frame_method_id_and_class cannot be found; internal exception) with torch.rb

Open mtortonesi opened this issue 2 years ago • 5 comments

I am trying to get torch.rb working with TruffleRuby (truffleruby+graalvm-23.0.0, installed with rbenv and ruby-build on an M1 MacBook Pro). Unfortunately, when I run the following simple command, which creates a 1D tensor from the data in an array:

bundle exec ruby -e 'require "torch"; Torch::Tensor.new([1,2,3,4])'

I get the following error:

/Users/mauro/code/test/truffleruby/reinforce/vendor/bundle/truffleruby/3.1.3.23.0.0/gems/rice-4.1.0/include/rice/rice.hpp:4094:in `call': External LLVMFunction rb_frame_method_id_and_class cannot be found. (Polyglot::ForeignException)
	from /Users/mauro/.rbenv/versions/truffleruby+graalvm-23.0.0/graalvm/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:41:in `'
	from /Users/mauro/.rbenv/versions/truffleruby+graalvm-23.0.0/graalvm/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:41:in `Torch::TensorOptions#initialize'
	from /Users/mauro/code/test/truffleruby/reinforce/vendor/bundle/truffleruby/3.1.3.23.0.0/gems/torch-rb-0.13.2/lib/torch.rb:529:in `Torch.tensor_options'
	from /Users/mauro/code/test/truffleruby/reinforce/vendor/bundle/truffleruby/3.1.3.23.0.0/gems/torch-rb-0.13.2/lib/torch.rb:429:in `Torch.tensor'
	from /Users/mauro/code/test/truffleruby/reinforce/vendor/bundle/truffleruby/3.1.3.23.0.0/gems/torch-rb-0.13.2/lib/torch.rb:281:in `#<0x3b58>


mtortonesi, Aug 10 '23 19:08

I upgraded to truffleruby+graalvm-dev (and also applied the patch from https://github.com/oracle/truffleruby/pull/3151), but the problem persists.

mtortonesi, Aug 11 '23 13:08

Apparently, the problem is that the rice gem does not work with TruffleRuby: https://github.com/jasonroelofs/rice/issues/189

Any suggestions?

mtortonesi, Aug 16 '23 08:08

External LLVMFunction rb_frame_method_id_and_class cannot be found.

That means the C API function rb_frame_method_id_and_class is not yet implemented in TruffleRuby. It needs to be implemented to fix this error.

eregon, Aug 16 '23 09:08

Regarding the second error:

unsupported type [2 x i64] in native interop (com.oracle.truffle.llvm.runtime.NativeContextExtension.UnsupportedNativeTypeException)

I think that means a struct is passed by value as an argument or return type, and NFI does not support that yet. It would be good to confirm whether that's the case based on the backtrace and the C source code. https://github.com/oracle/truffleruby/issues/3118 might solve this struct-by-value problem for native extensions in general.
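
To make "struct by value" concrete, here is a minimal C sketch. Everything in it is hypothetical: `int_array_ref`, `sum_by_value` and `sum_by_pointer` are illustrative names and are not taken from torch.rb, Rice, or PyTorch. It only assumes a 64-bit target, where a pointer-plus-length struct occupies 16 bytes; a compiler may lower such an argument to an LLVM type like `[2 x i64]`, which would match the type named in the error. Whether the failing call actually has this shape still needs to be confirmed from the source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array view: a pointer plus a length.
   On a typical 64-bit target this is 16 bytes, i.e. two 64-bit words. */
typedef struct {
    const int64_t *data;
    size_t len;
} int_array_ref;

/* The struct itself is the argument (passed BY VALUE).
   This is the calling shape the comment above suspects is unsupported. */
int64_t sum_by_value(int_array_ref ref) {
    int64_t total = 0;
    for (size_t i = 0; i < ref.len; i++) {
        total += ref.data[i];
    }
    return total;
}

/* Same logic, but only a pointer to the struct crosses the call boundary,
   so the callee receives a single pointer-sized argument. */
int64_t sum_by_pointer(const int_array_ref *ref) {
    int64_t total = 0;
    for (size_t i = 0; i < ref->len; i++) {
        total += ref->data[i];
    }
    return total;
}
```

Both functions compute the same result; the only difference is how the argument is handed over. Looking for a signature of the first kind in the quoted source lines would be one way to check the hypothesis.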

eregon, Aug 16 '23 09:08

It would be good to confirm if that's the case based on the backtrace and C source code.

Could you quote here what these two lines contain?

/opt/homebrew/Cellar/pytorch/2.0.1/include/torch/csrc/autograd/generated/variable_factories.h:261:in `empty_symint'
	from /Users/mauro/code/test/truffleruby/reinforce/vendor/bundle/truffleruby/3.1.3.23.0.0/gems/torch-rb-0.13.2/ext/torch/torch_functions.cpp:4970:in `torch_empty'

eregon, Aug 16 '23 10:08