truffleruby icon indicating copy to clipboard operation
truffleruby copied to clipboard

`rb_str_resize(Qfalse, 0)` with NKF and without C extensions lock

Open eregon opened this issue 1 year ago • 3 comments

Found by @mohamedhafez and reported on Slack:

class: TypeError
TruffleRuby doesn't have a case for the org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen node with values of type java.lang.Boolean=false java.lang.Integer=0
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.executeAndSpecialize(CExtNodesFactory.java:4234)
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.execute(CExtNodesFactory.java:4212)
	from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
string.c:197:in `rb_str_resize': TruffleRuby doesn't have a case for the org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen node with values of type java.lang.Boolean=false java.lang.Integer=0 (TypeError)
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.executeAndSpecialize(CExtNodesFactory.java:4234)
	from org.truffleruby.cext.CExtNodesFactory$RbStrResizeNodeFactory$RbStrResizeNodeGen.execute(CExtNodesFactory.java:4212)
	from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
	from string.c:197:in `rb_str_resize'
	from /Users/mohamed/.rbenv/versions/truffleruby+graalvm-24.0.1/lib/truffle/truffle/cext_ruby.rb:40:in `guess'
	from /Users/mohamed/rubyprojects/sa-mechanize/lib/mechanize/util.rb:58:in `detect_charset'
	from /Users/mohamed/rubyprojects/substitutealert/lib/ext/mechanize_additions.rb:22:in `initialize'

So this must be a call to rb_str_resize(str, 0) with str=0 but of course it shouldn't be 0.

Looking at NKF sources, there is a single call to rb_str_resize: https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L47

result is a global variable :fearful: https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L40 https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L164-L168

So that is very likely the issue. One solution is to use a native thread-local variable instead. Or pass it as an extra parameter but that would require touching more of the upstream nkf code.

Or, always use the C extensions lock/a lock for NKF.

eregon avatar Jul 22 '24 12:07 eregon

It seems to me all the global variables could lead to such issues: https://github.com/oracle/truffleruby/blob/938589f700c599221f0422fcea6c52f332293390/src/main/c/nkf/nkf.c#L32-L38

So all these variables probably should be passed everywhere (as a struct?). They are used only in this file so it doesn't seem a big effort to fix. Does it makes sense to do it?

andrykonchin avatar Jul 22 '24 14:07 andrykonchin

The problem is e.g. rb_nkf_putchar() is a callback called as putchar() from src/main/c/nkf/nkf-utf8/nkf.c. So I think it's too messy to pass a struct there. Let's try using thread_local (https://en.cppreference.com/w/c/thread/thread_local) or __thread on these variables, that should require minimal changes.

eregon avatar Jul 22 '24 15:07 eregon

Created a PR https://github.com/ruby/nkf/pull/19

andrykonchin avatar Jul 23 '24 13:07 andrykonchin