hiredis-rb icon indicating copy to clipboard operation
hiredis-rb copied to clipboard

hiredis-rb is not GC-safe

Open findchris opened this issue 8 years ago • 26 comments

Hi there.

While investigating a GC-related issue (https://github.com/ohler55/oj/issues/265#issuecomment-147554809), @ohler55 mentioned that he looked at the hiredis-rb C code and saw some stuff that wasn't GC safe and might be the source of the bug I'm investigating (NotImplementedError: method ... called on terminated object). He pointed out that:

Ruby checks the stack for VALUE types which are Ruby object references. Compilers often optimize code by placing variable in registers which are invisible to Ruby as far as checking for wether an object is still being referenced. The work around for this is to make sure all VALUE local variable are volatile. That keeps them out of registers and on the stack.

See the linked-to github issue above, but I'm only seeing this issue a few times per day on an app that gets significant traffic, so it's tough to reproduce, but all signs point to it being GC-related.

I look forward to your reply here.

findchris avatar Oct 13 '15 21:10 findchris

Thanks for the report. I'm quite busy this and the coming week, but I will look into this.

badboy avatar Oct 14 '15 17:10 badboy

Thanks @badboy - At first read, does the addition of volatile to the VALUEs make sense?

findchris avatar Oct 14 '15 20:10 findchris

@badboy - Get a chance to look at this?

findchris avatar Oct 21 '15 21:10 findchris

@badboy Is there anyone else I can @tag here to get a set of eyes on this issue?

findchris avatar Oct 30 '15 18:10 findchris

Hey. Currently I'm the only maintainer, so there's no one else to take a look. But I did and it seems there is a macro to use (RB_GC_GUARD) on those VALUEs instead of making it explicitely volatile.

I'm trying to find the right solution today.

badboy avatar Oct 31 '15 13:10 badboy

@findchris Do you have a test case that triggers the bug reliably? Would be nice to know that any fix we apply actually solves the problem (the other option for me would be to read the assembly and I'm really not good at that)

badboy avatar Oct 31 '15 15:10 badboy

@badboy I don't have a test case, sadly. It seems to be GC-related, and I have been able to reproduce consistently.

@ohler55 Did this comment make sense to you? I just don't know enough to respond intelligently.

findchris avatar Nov 07 '15 06:11 findchris

@badboy and @ohler55 - Any chance to check this out, or is it dying on the vine here?

findchris avatar Nov 15 '15 02:11 findchris

Glad to get involved. Do you have a set of changes? Need a some help and a PR?

ohler55 avatar Nov 15 '15 05:11 ohler55

I don't have any code yet. I can't even reproduce it, which makes it near to impossible to know if any change would actually cover the bug. I appreciate all help, so if you have more insight or know your way around in Ruby extensions (I really don't :D) please tell so.

badboy avatar Nov 15 '15 08:11 badboy

ok, I'll put together some change and attempt to put together a test. It will be a step at a time.

ohler55 avatar Nov 15 '15 17:11 ohler55

Thanks in advance!

badboy avatar Nov 15 '15 17:11 badboy

Thanks guys; I really appreciate you stepping up on this @ohler55 :clap:

findchris avatar Nov 16 '15 19:11 findchris

I am not having much success reproducing the failure here. @findchris, can you run tests on changes we make to verify when the changes work? You can be our tester. Less than ideal but it should be enough.

ohler55 avatar Nov 18 '15 23:11 ohler55

I can lock production to a particular commit, which should work.

findchris avatar Nov 19 '15 00:11 findchris

It would help a lot. I put up a PR. Maybe batboy can help get everything squared away on branches.

ohler55 avatar Nov 19 '15 00:11 ohler55

Thanks @ohler55.
@badboy Can you look at https://github.com/redis/hiredis-rb/pull/43?

findchris avatar Nov 19 '15 01:11 findchris

@badboy Did you get a chance to review #43?

findchris avatar Nov 23 '15 22:11 findchris

Checking in @badboy #squeakywheel

findchris avatar Dec 03 '15 00:12 findchris

I'm sorry this is post-poned so long, but I have to shift it a bit again (upcoming holiday and I just need a break from this). I take a fresh look when I'm back in 2 weeks.

badboy avatar Dec 03 '15 10:12 badboy

Totally understandable. Enjoy your holiday!

findchris avatar Dec 03 '15 21:12 findchris

This is still an issue for us.

@badboy - Did you get a chance to look at this issue?

findchris avatar Apr 06 '17 21:04 findchris

I merged, but never pushed a new release. Will take care of that ASAP

badboy avatar Apr 07 '17 09:04 badboy

Hey there! Did anyone managed to reproduce the issue described by @findchris since the latest release? Is it still open?

mberlanda avatar Nov 03 '18 02:11 mberlanda

Bump - what is the status of this @badboy?

Should this be closes or is this still an issue?

tonydehnke avatar Oct 09 '19 01:10 tonydehnke

I'm not working on hiredis-rb anymore.

badboy avatar Oct 09 '19 07:10 badboy