truffleruby
Native access required to load rubygems or use files, instead of allowIO
I'm trying out truffleruby 20.2.0 embedded with a custom filesystem and native-access disabled. Calling "open" causes a SecurityError to be thrown "native access is not allowed." Looking at Truffle::POSIX in posix.rb, it seems that it's checking platform-native option based on Env.isNativeAccessAllowed() method, when I would have expected it to check the Env.isIOAllowed() flag instead.
Is there a roadmap to support mapping posix to Truffle's Custom FileSystem with native-access disabled?
TruffleRuby currently uses native POSIX calls for IO and makes little use of TruffleFile.
That's partly because Ruby exposes file descriptors directly through IO#fileno, and some programs might actually pass them to native code.
It might be possible to use TruffleFile internally (at the expense of not implementing IO#fileno and similar methods, or returning a dummy value), or to run on GraalVM EE, where Sulong in --llvm.managed mode can intercept those POSIX calls and avoid native calls entirely.
Could you elaborate on your use-case?
My feeling is a lot of Ruby depends on real native posix calls, so with native-access disabled it's hard to run anything but trivial Ruby programs. Supporting RubyGems with a custom FileSystem seems quite challenging too.
There's an existing RubyGem I'm trying to load from a jar file for a specific internal testing use case. I'm trying out some existing Ruby scripts that have a require at the start. Without native access the networking doesn't work, but that's part of the point, as I want to restrict the networking to loopback and I could duck-type a replacement for my use case. I figured I could work around this issue like I did for a similar use case with graalpython. The existing Truffle filesystem work was done for Python test cases and, apart from a few hiccups around disk layout leakage, it works with native access disabled and allowIO enabled.
I can load the .rb files one by one, but then I have to remove the "require" from the existing scripts or otherwise override "require", and duplicate all the Truffle filesystem mapping work I did to expose the jar file in the first place.
Didn't we used to have the option for managed IO?
@chrisseaton Not as far as I remember, but we have an option to use polyglot streams for stdin/stderr/stdout.
@steventamm Thanks for sharing your use case. Actually loading files (via #require or #load) should use mostly TruffleFile and that might work with a custom filesystem, although it probably needs a few fixes. Do you have a backtrace when trying to load a file from the custom filesystem?
OTOH actually supporting Kernel#open or IO#open seems much harder.
Is there a magic option to get a better stack trace? Just run this from junit.
public void testCase() {
    Context ctx = Context.newBuilder("ruby").allowNativeAccess(false).allowIO(true).build();
    ctx.eval("ruby", "require(\"rubygems\")");
}
And you get this.
native access is not allowed (SecurityError)
at <ruby> <top (required)>(execute_anonymous_apex:1:0-17)
at <ruby> parsing-request(Unknown)
at org.graalvm.sdk/org.graalvm.polyglot.Context.eval(Context.java:345)
Not helpful, but the only reference would be from posix.rb.
Can you run with --exceptions-print-java?
public void testCase() {
    Context ctx = Context.newBuilder("ruby").allowNativeAccess(false).allowIO(true)
            .allowExperimentalOptions(true)
            .option("ruby.exceptions-print-java", "true")
            .option("ruby.backtraces-raise", "true")
            .build();
    ctx.eval("ruby", "require(\"rubygems\")");
}
raise: <internal:core> core/posix.rb:28:in `resolve': native access is not allowed (SecurityError)
from <internal:core> core/posix.rb:107:in `attach_function_eagerly'
from <internal:core> core/posix.rb:98:in `truffleposix_stat_mode'
from <internal:core> core/file.rb:383:in `directory?'
from ${jdk.home}/graalvm-ce-java11-20.2.0/Contents/Home/languages/ruby/lib/truffle/rbconfig.rb:65:in `RbConfig'
from ${jdk.home}/graalvm-ce-java11-20.2.0/Contents/Home/languages/ruby/lib/truffle/rbconfig.rb:37:in `<top (required)>'
from <internal:core> core/kernel.rb:257:in `require'
from ${jdk.home}/graalvm-ce-java11-20.2.0/Contents/Home/languages/ruby/lib/mri/rubygems.rb:9:in `<top (required)>'
from <internal:core> core/kernel.rb:257:in `require'
from ${jdk.home}/graalvm-ce-java11-20.2.0/Contents/Home/languages/ruby/lib/patches/rubygems.rb:1:in `<top (required)>'
from <internal:core> core/kernel.rb:257:in `require'
from Unnamed:1:in `<top (required)>'
Does that make the underlying problem clear? We can't load RubyGems without native IO, because methods like directory? are implemented with it. Should that method just be implemented simply in Java? Maybe, but this is the code we started with.
If it isn't supported, it isn't supported. But graaljs and graalpython intercept these calls and go back through the NIO filesystem. All Python IO goes through Truffle.
https://github.com/graalvm/graalpython/issues/123
Fundamentally, if you allow native access you pretty much have to allow all access - native code can do whatever it wants, after all. If you use Enterprise Edition, you can use the fully managed LLVM execution, and then you need neither native access nor IO permissions.
We currently don't support installing into the internal filesystem. The recommended approach is to allow IO access, but install a custom FileSystem when you create the context. All filesystem access in Python goes through the Graal Context filesystem, so that's the intercession point you should use there. This approach would also allow you to intercept the access to libSystem.B.dylib and use e.g. a whitelist to serve it.
So I'm guessing this is a feature request to use the Truffle FileSystem for file IO in Ruby instead of requiring native access.
Yes, we should implement this.
Just for history and exposition: all our IO goes through native POSIX calls because we already had a Ruby core-library implementation, written in Ruby, that used FFI to call POSIX from Rubinius. That's why it doesn't go through Java and why it doesn't respect allowIO.
I'm sure this can be fixed.
I was thinking the problem might be deeper than that if e.g. RubyGems loads a C extension.
If that was the case, the only reasonable solution would be to use Managed Sulong and GraalVM EE (C extensions need native access without Managed Sulong).
Luckily it seems the current RubyGems doesn't load any C extension for e.g. require 'ruby2_keywords' (which is a pure-Ruby gem).
For instance FileUtils loads the etc C extension (and FileUtils is used by many things). In that specific case not being able to load etc would not be a big deal though.
If it isn't supported, it isn't supported.
Yes, currently it is in general not supported (Ruby IO/File APIs use native access instead of TruffleFile) and it would be a large effort to implement it, so we would need a good motivation for it. I think many gems have a transitive dependency on a C extension, and quite a few core library methods intrinsically need native access (or running in Managed Sulong), so it's probably quite limited what can be run without native access anyway.
For the specific case of loading a gem/Ruby files from a custom file system, it's probably best to avoid RubyGems entirely, because that loads a lot of code that might need native access in some way.
Adding the jar to $LOAD_PATH and using a custom filesystem might work with require.
Currently, the "path exists" checks for require are done using java.io.File, but that should be relatively straightforward to switch to TruffleFile.
We could give that a try; in any case, TruffleFile should be used instead of java.io.File for those checks.
If you could share your custom FileSystem and a reproducible example, that would be helpful.
The rubygem I'm using (restforce) doesn't appear to include any C extensions in itself or its dependencies. It does require native access for httpclient calls from the faraday gem, but I was planning on overriding that post load. But I didn't get that far due to this issue.
Since this seems like a large feature request replacing calls to POSIX from file.rb to calls to TruffleFile, I kind of shelved most of this work.
The custom filesystem I'm using is rather complex, with ZipInputStream -> SeekableByteChannel converters, but it has a RestrictedFileSystem that allows access to the native filesystem through FileSystem.newDefaultFileSystem for specific folders on disk. This is needed to handle graalpython's current inability to handle filesystem mapping for Python internals.
See initializeHomeAndPrefix paths
https://github.com/graalvm/graalpython/blob/master/graalpython/com.oracle.graal.python/src/com/oracle/graal/python/runtime/PythonContext.java
This is also how I was planning on handling access to the gems directory and loading of shared libraries in LanguageHome, but it's a fairly straightforward delegate with a "validatePath" call for each IO access.
public void testCase() {
    Context ctx = Context.newBuilder("ruby")
            .allowNativeAccess(false).allowIO(true)
            .allowExperimentalOptions(true)
            .fileSystem(FileSystem.newDefaultFileSystem())
            .option("ruby.exceptions-print-java", "true")
            .option("ruby.backtraces-raise", "true")
            .build();
    ctx.eval("ruby", "require(\"rubygems\")");
}
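For readers wondering what the "validatePath" check in such a delegating filesystem might look like, here is a minimal sketch. It is not the actual RestrictedFileSystem from this thread: the class name PathValidator, its methods, and the allowed-roots design are all assumptions made for illustration, and it uses only java.nio so it runs without GraalVM. Note that normalize() is purely lexical and does not resolve symlinks, so a real implementation would likely need more.

```java
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: none of these names come from the thread or from GraalVM.
final class PathValidator {
    private final List<Path> allowedRoots;

    PathValidator(Path... roots) {
        // Normalize the roots once so later comparisons are purely lexical.
        this.allowedRoots = Arrays.stream(roots)
                .map(r -> r.toAbsolutePath().normalize())
                .collect(Collectors.toList());
    }

    // True if the path, after collapsing "." and "..", lies under an allowed root.
    boolean isAllowed(Path path) {
        Path normalized = path.toAbsolutePath().normalize();
        // Path.startsWith compares whole name elements, so /opt/gems-evil
        // does not count as being under /opt/gems.
        return allowedRoots.stream().anyMatch(normalized::startsWith);
    }

    // Returns the path unchanged, or throws if it is outside the allowed roots.
    Path validatePath(Path path) {
        if (!isAllowed(path)) {
            throw new SecurityException("access denied: " + path);
        }
        return path;
    }
}
```

A delegating FileSystem would presumably call validatePath(path) at the top of each method (checkAccess, newByteChannel, and so on) before forwarding to the filesystem returned by FileSystem.newDefaultFileSystem().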
We tried this, but using TruffleFile causes a significant performance overhead and the coverage is only partial anyway. It seems too difficult to use TruffleFile for regular Ruby IO/File objects, so I'm closing this as not planned.