
regexp causes hang in jruby but terminates in MRI

Open jsvd opened this issue 5 years ago • 5 comments

In MRI 2.6:

% ruby -v 
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-darwin18]
% ruby -e 'puts "foo========:bar baz================================================bingo".scan(/(?:=+=+)+:/)'
========:

With Latest JRuby snapshot:

/tmp/jruby-9.2.8.0-SNAPSHOT % java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
/tmp/jruby-9.2.8.0-SNAPSHOT % jruby -v
jruby 9.2.8.0-SNAPSHOT (2.5.3) 2019-07-19 b416404 OpenJDK 64-Bit Server VM 11.0.1+13 on 11.0.1+13 +jit [darwin-x86_64]
/tmp/jruby-9.2.8.0-SNAPSHOT % jruby -e 'puts "foo========:bar baz================================================bingo".scan(/(?:=+=+)+:/)'
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jruby.util.SecurityHelper to field java.lang.reflect.Field.modifiers
WARNING: Please consider reporting this to the maintainers of org.jruby.util.SecurityHelper
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

[edit] Also hangs on jruby 1.7.27, 9.2.5.0, 9.2.7.0

jsvd avatar Jul 19 '19 10:07 jsvd

using =~ seems to match properly, so the problem seems to be related to advancing in either scan or subsequent joni match
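
To make the "advancing" step concrete, here is a minimal sketch of what a scan-style loop does, written in plain Ruby. `scan_by_hand` is a hypothetical helper for illustration only, not how JRuby or joni actually implement `scan`, and the trailing run of `=` is shortened on purpose so the second search finishes quickly on any engine.

```ruby
# Hypothetical helper: a hand-rolled String#scan for patterns without
# capture groups, so the repeated "search again from an offset" step is visible.
def scan_by_hand(str, re)
  results = []
  pos = 0
  while pos <= str.length && (m = re.match(str, pos))
    results << m[0]
    # Continue after the match; step one extra character after an empty match
    pos = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
  end
  results
end

# Trailing "=" run deliberately shortened compared with the report
input = "foo========:bar baz======bingo"
found = scan_by_hand(input, /(?:=+=+)+:/)
p found
```

The first iteration is the same search that `=~` performs. The second iteration starts at offset 12, right after the first match, and that follow-up search is the one that appears to misbehave here.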

lopex avatar Jul 19 '19 15:07 lopex

split also hangs, so it's a joni issue

lopex avatar Jul 19 '19 15:07 lopex

I can reproduce this in onigmo using subsequent match via:

onig_search(reg, str, (str + SLEN(str)), str + 12, (str + SLEN(str)), &region, ONIG_OPTION_NONE);

The problem is that we always treat *str as offset zero and only begin searching at the *start argument, which is what causes the trouble. We will have to work around it somehow or, unfortunately, change the matching APIs.
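
For context on why the base of the string and the start offset are separate arguments at all, here is a small Ruby sketch. It only illustrates the general semantics of searching from an offset versus searching a slice, using `Regexp#match(str, pos)`; the patterns are made up for the illustration and this is not a reproduction of the hang.

```ruby
input = "foo========:bar baz"
first_end = 12  # offset right after the first "========:" match

# Searching the whole string from an offset: look-behind can still see ":"
with_offset = /(?<=:)bar/.match(input, first_end)

# Searching a slice: the ":" is gone, so the look-behind fails
on_slice = /(?<=:)bar/.match(input[first_end..-1])

# \A means "start of the string", not "start of the search"
anchored_offset = /\Abar/.match(input, first_end)
anchored_slice  = /\Abar/.match(input[first_end..-1])

p with_offset
p on_slice
p anchored_offset
p anchored_slice
```

So simply passing a shifted string base is not equivalent to passing a start offset, which is presumably part of why a workaround is awkward.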

lopex avatar Jul 19 '19 16:07 lopex

so after a quick and dirty patch (https://gist.github.com/lopex/be0d7fddf2eabb62ee371f9beb9ca47b) and using:

onig_search(reg, str + 12, (str + SLEN(str)), str, (str + SLEN(str)), &region, ONIG_OPTION_NONE);

produces: onig_search (entry point): str: 4299173916 (0x10040301c), end: 60, start: 18446744073709551604, range: 60

and in joni:

matcher.search(12, str.length, 0, str.length, option);

which produces: onig_search (entry point): str: 12, end: 60, start: -12, range 60

we match onigmo, but there's something wrong in onigmo since there's either overflow or a bad signed/unsigned cast there.
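
A quick arithmetic check, sketched in Ruby: the large start value printed by onigmo is exactly what -12 looks like when read as an unsigned 64-bit integer. This only shows the two printed numbers are the same bit pattern. It does not say whether the cause is the debug print format or a real cast problem in the search itself.

```ruby
start_signed = -12

# Reinterpret as unsigned 64-bit via modular arithmetic
as_u64 = start_signed % (2**64)

# Same reinterpretation via pack/unpack (signed 64-bit in, unsigned 64-bit out)
as_u64_packed = [start_signed].pack("q").unpack1("Q")

p as_u64
p as_u64_packed
```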

lopex avatar Jul 19 '19 16:07 lopex

for the former case, after shortening the input a bit:

"foo========:bar baz==========================bingo"

joni also completes, after a few seconds.
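
Run time that depends this strongly on the length of the `=` run would be consistent with backtracking through the nested quantifiers, though that is a reading of the symptom and not something confirmed above. As a possible workaround sketch, assuming the pattern itself can be changed: `(?:=+=+)+` accepts exactly the runs of two or more `=`, so `={2,}:` should match the same text without nesting. The name `flat` is just for this example.

```ruby
nested = /(?:=+=+)+:/
flat   = /={2,}:/  # assumption: equivalent rewrite without nested quantifiers

# The input from the original report, scanned with the flat pattern only
full = "foo========:bar baz================================================bingo"
flat_result = full.scan(flat)
p flat_result

# Both patterns agree on small inputs where backtracking is negligible
samples = ["=:", "==:", "a===:b==:", "====", ""]
agree = samples.all? { |s| s.scan(nested) == s.scan(flat) }
p agree
```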

lopex avatar Jul 19 '19 16:07 lopex