joni
regexp causes hang in jruby but terminates in MRI
In MRI 2.6:
% ruby -v
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-darwin18]
% ruby -e 'puts "foo========:bar baz================================================bingo".scan(/(?:=+=+)+:/)'
========:
With Latest JRuby snapshot:
/tmp/jruby-9.2.8.0-SNAPSHOT % java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
/tmp/jruby-9.2.8.0-SNAPSHOT % jruby -v
jruby 9.2.8.0-SNAPSHOT (2.5.3) 2019-07-19 b416404 OpenJDK 64-Bit Server VM 11.0.1+13 on 11.0.1+13 +jit [darwin-x86_64]
/tmp/jruby-9.2.8.0-SNAPSHOT % jruby -e 'puts "foo========:bar baz================================================bingo".scan(/(?:=+=+)+:/)'
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jruby.util.SecurityHelper to field java.lang.reflect.Field.modifiers
WARNING: Please consider reporting this to the maintainers of org.jruby.util.SecurityHelper
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[edit] Also hangs on jruby 1.7.27, 9.2.5.0, 9.2.7.0
using =~ matches properly, so the problem seems to be in how scan advances after the first match, or in the subsequent joni match
split also hangs, so it's a joni issue
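To make the "advancing" part concrete, here is a rough imitation of what scan does: search once, then search again starting at the end of the previous match. This is only a sketch; `scan_like` is a made-up helper, not the actual JRuby or MRI implementation, and the input tail is shortened on purpose so it finishes quickly on any engine.

```ruby
# Rough imitation of String#scan: each new search starts where the
# previous match ended. scan_like is a hypothetical helper for illustration.
def scan_like(str, re)
  results = []
  pos = 0
  while pos <= str.length && (m = str.match(re, pos))
    results << [m.begin(0), m.end(0), m[0]]
    # Step past empty matches so the loop always advances.
    pos = m.end(0) > m.begin(0) ? m.end(0) : m.end(0) + 1
  end
  results
end

# Shortened tail (4 '=' instead of 48) so this stays fast everywhere.
short = "foo========:bar baz====bingo"
matches = scan_like(short, /(?:=+=+)+:/)
p matches
```

The first match ends at offset 12, so the second search starts there. With the full string from the report, that second search is the one that does not come back on JRuby.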
I can reproduce this in onigmo using subsequent match via:
onig_search(reg, str, (str + SLEN(str)), str + 12, (str + SLEN(str)), &region, ONIG_OPTION_NONE);
The problem is that we always treat *str as offset zero and only begin matching at the *start argument, which causes the trouble. Unfortunately, we will have to work around it somehow or change the matching APIs.
so after a quick&dirty https://gist.github.com/lopex/be0d7fddf2eabb62ee371f9beb9ca47b and using:
onig_search(reg, str + 12, (str + SLEN(str)), str, (str + SLEN(str)), &region, ONIG_OPTION_NONE);
produces: onig_search (entry point): str: 4299173916 (0x10040301c), end: 60, start: 18446744073709551604, range: 60
and in joni:
matcher.search(12, str.length, 0, str.length, option);
which produces: onig_search (entry point): str: 12, end: 60, start: -12, range 60
we match onigmo, but there's something wrong in onigmo since there's either overflow or a bad signed/unsigned cast there.
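For what it's worth, the huge start value printed by onigmo is exactly what -12 looks like when read as an unsigned 64-bit integer, which fits the signed/unsigned suspicion. A quick check of the arithmetic (the 64-bit width is an assumption about how the debug print formats the value):

```ruby
# Reinterpret -12 as an unsigned 64-bit value, as an unsigned print
# of a negative pointer difference would (64-bit width is assumed).
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
start_signed = 0 - 12                     # start minus str, with str 12 past start
start_unsigned = start_signed & MASK64
puts start_unsigned

# And back again: values at or above 2**63 wrap to negative.
back = start_unsigned >= 2**63 ? start_unsigned - 2**64 : start_unsigned
puts back
```

So the onigmo and joni outputs agree on start; they only differ in how the number is displayed.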
for the former case, after shortening the input a bit:
"foo========:bar baz==========================bingo"
joni also completes, after a few seconds.