truffleruby
truffleruby copied to clipboard
`rb_str_resize(Qfalse, 0)` with NKF and without C extensions lock
Found by @mohamedhafez and reported on Slack:
class: TypeError
TruffleRuby doesn't have a case for the org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen node with values of type java.lang.Boolean=false java.lang.Integer=0
from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.executeAndSpecialize(CExtNodesFactory.java:4234)
from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.execute(CExtNodesFactory.java:4212)
from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
string.c:197:in `rb_str_resize': TruffleRuby doesn't have a case for the org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen node with values of type java.lang.Boolean=false java.lang.Integer=0 (TypeError)
from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.executeAndSpecialize(CExtNodesFactory.java:4234)
from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.execute(CExtNodesFactory.java:4212)
from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
from string.c:197:in `rb_str_resize'
from /Users/mohamed/.rbenv/versions/truffleruby+graalvm-24.0.1/lib/truffle/truffle/cext_ruby.rb:40:in `guess'
from /Users/mohamed/rubyprojects/sa-mechanize/lib/mechanize/util.rb:58:in `detect_charset'
from /Users/mohamed/rubyprojects/substitutealert/lib/ext/mechanize_additions.rb:22:in `initialize'
So this must be a call to rb_str_resize(str, 0) with str=0 but of course it shouldn't be 0.
Looking at NKF sources, there is a single call to rb_str_resize:
https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L47
result is a global variable :fearful:
https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L40
https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L164-L168
So that is very likely the issue. One solution is to use a native thread-local variable instead. Or pass it as an extra parameter but that would require touching more of the upstream nkf code.
Or, always use the C extensions lock/a lock for NKF.
It seems to me all the global variables could lead to such issues: https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L32-L38
So all these variables probably should be passed everywhere (as a struct?). They are used only in this file so it doesn't seem a big effort to fix. Does it makes sense to do it?
The problem is e.g. rb_nkf_putchar() is a callback called as putchar() from src/main/c/nkf/nkf-utf8/nkf.c.
So I think it's too messy to pass a struct there.
Let's try using thread_local (https://en.cppreference.com/w/c/thread/thread_local) or __thread on these variables, that should require minimal changes.
Created a PR https://github.com/ruby/nkf/pull/19