Segmentation fault when adding Curl::Easy instances to a Curl::Multi

Open ghost opened this issue 15 years ago • 13 comments

I'm getting a segmentation fault when I add Curl::Easy instances to a running Curl::Multi instance. The Curl::Easy instances are added from on_success / on_failure handlers. The problem is sporadic, but it eventually shows up after a couple of minutes when running in a continuous loop.

I'm currently running Ruby 1.9.1 and Curb 0.6.0.0. I have tested with different versions of Curb but always get the same result.

Any idea?

This is the console dump:

[BUG] Segmentation fault
ruby 1.9.1p376 (2009-12-07 revision 26041) [x86_64-linux]

-- control frame ----------
c:0013 p:---- s:0044 b:0044 l:000043 d:000043 CFUNC :add
c:0012 p:0105 s:0040 b:0040 l:001628 d:001628 METHOD feedzilla.rb:86
c:0011 p:0035 s:0033 b:0033 l:002408 d:000032 BLOCK feedzilla.rb:69
c:0010 p:---- s:0030 b:0030 l:000029 d:000029 FINISH
c:0009 p:---- s:0028 b:0028 l:000027 d:000027 CFUNC :call
c:0008 p:---- s:0026 b:0026 l:000025 d:000025 CFUNC :perform
c:0007 p:0117 s:0023 b:0023 l:000022 d:000022 METHOD feedzilla.rb:53
c:0006 p:0024 s:0016 b:0016 l:0023f8 d:000015 BLOCK feedzilla.rb:96
c:0005 p:---- s:0013 b:0013 l:000012 d:000012 FINISH
c:0004 p:---- s:0011 b:0011 l:000010 d:000010 CFUNC :each
c:0003 p:0084 s:0008 b:0008 l:0023f8 d:001418 EVAL feedzilla.rb:95
c:0002 p:---- s:0004 b:0004 l:000003 d:000003 FINISH
c:0001 p:0000 s:0002 b:0002 l:0023f8 d:0023f8 TOP

-- Ruby level backtrace information-----------------------------------------

feedzilla.rb:86:in add
feedzilla.rb:86:in add_url_to_multi
feedzilla.rb:69:in block in add_url_to_multi
feedzilla.rb:53:in call
feedzilla.rb:53:in perform
feedzilla.rb:53:in fetch
feedzilla.rb:96:in block in
feedzilla.rb:95:in each
feedzilla.rb:95:in

-- C level backtrace information -------------------------------------------
0x4eb69b ruby(rb_vm_bugreport+0x3b) [0x4eb69b]
0x51a160 ruby [0x51a160]
0x51a2d1 ruby(rb_bug+0xb1) [0x51a2d1]
0x495cef ruby [0x495cef]
0x2aaaaacda0f0 /lib/libpthread.so.0 [0x2aaaaacda0f0]
0x49bb1e ruby(st_lookup+0xe) [0x49bb1e]
0x4d7f4f ruby [0x4d7f4f]
0x4d7fb3 ruby(rb_get_method_body+0x23) [0x4d7fb3]
0x4db653 ruby [0x4db653]
0x4dbed7 ruby(rb_funcall+0x147) [0x4dbed7]
0x4a0f43 ruby(rb_obj_as_string+0x83) [0x4a0f43]
0x2aaaac03952d /usr/local/ruby-1.9.1-p376/lib/ruby/gems/1.9.1/gems/curb-0.6.0.0/lib/curb_core.so(ruby_curl_easy_setup+0x80d) [0x2aaaac03952d]
0x2aaaac033e4f /usr/local/ruby-1.9.1-p376/lib/ruby/gems/1.9.1/gems/curb-0.6.0.0/lib/curb_core.so(ruby_curl_multi_add+0x7f) [0x2aaaac033e4f]
0x4d97a0 ruby [0x4d97a0]
0x4dea90 ruby [0x4dea90]
0x4dfaa4 ruby [0x4dfaa4]
0x4e4c5b ruby [0x4e4c5b]
0x4e740d ruby(rb_vm_invoke_proc+0x42d) [0x4e740d]
0x4db7d9 ruby [0x4db7d9]
0x4dbed7 ruby(rb_funcall+0x147) [0x4dbed7]
0x2aaaac033ab1 /usr/local/ruby-1.9.1-p376/lib/ruby/gems/1.9.1/gems/curb-0.6.0.0/lib/curb_core.so [0x2aaaac033ab1]
0x2aaaac033d1b /usr/local/ruby-1.9.1-p376/lib/ruby/gems/1.9.1/gems/curb-0.6.0.0/lib/curb_core.so(ruby_curl_multi_perform+0x21b) [0x2aaaac033d1b]
0x4d97a0 ruby [0x4d97a0]
0x4dea90 ruby [0x4dea90]
0x4dfaa4 ruby [0x4dfaa4]
0x4e4c5b ruby [0x4e4c5b]
0x4e5367 ruby [0x4e5367]
0x4e63cc ruby(rb_yield+0x6c) [0x4e63cc]
0x46b9f1 ruby [0x46b9f1]
0x4d97a0 ruby [0x4d97a0]
0x4dea90 ruby [0x4dea90]
0x4dfaa4 ruby [0x4dfaa4]
0x4e4c5b ruby [0x4e4c5b]
0x4e4e29 ruby(rb_iseq_eval_main+0xa9) [0x4e4e29]
0x4195dc ruby(ruby_exec_node+0xac) [0x4195dc]
0x41ad43 ruby(ruby_run_node+0x33) [0x41ad43]
0x41817d ruby(main+0x4d) [0x41817d]
0x2aaaab7d0466 /lib/libc.so.6(__libc_start_main+0xe6) [0x2aaaab7d0466]
0x418069 ruby [0x418069]

ghost avatar Dec 22 '09 16:12 ghost

Do you have a sample use case?

taf2 avatar Jan 15 '10 03:01 taf2

Not sure what you mean by use case? I have sent you the code and samples from valgrind.

ghost avatar Jan 15 '10 09:01 ghost

Andre, are you by any chance setting the headers anywhere in your code? We are seeing the exact same problem, and looking at our backtrace, it segfaults when we are allocating a new curl instance and are trying to initialize some headers (specifically, it's the first time we try to access the header hash)... hence the st_lookup segfault.

taf2, any ideas or suggestions?

igrigorik avatar Jan 29 '10 02:01 igrigorik

Ilya, we are setting the usual headers: accept, user_agent etc. The first time I saw this problem, it was not related to setting headers but occurred when one of the callback methods was called. We were using gem v0.6.0 at that time. I did some tests with valgrind with gem v0.6.4. With this version the segfault looked like what you described. We run on EC2. For some reason we see the problem only on 64 bit machines. Our (crappy) fix at the moment is to run the crawlers on small instances.

ghost avatar Jan 29 '10 07:01 ghost

Hmm, same setup. EC2, 64 bit (xlarge). CentOS 5.1.

Latest & greatest version of curl on the server, etc.

igrigorik avatar Jan 29 '10 18:01 igrigorik

Ilya, I'm going to use your suggestion about the first access on my 64-bit box and see if I can reproduce...

taf2 avatar Jan 29 '10 21:01 taf2

Todd, also I should mention that I'm seeing this problem in exact same setup as described above: within the success callback I'm creating a new curl easy instance and queuing it up into the multi-loop. The idea is to keep the multi loop always occupied, always running several hundred connections.
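
For readers following along, the refill pattern described here can be sketched roughly as below. This is a minimal illustration, not the code from either reporter: `add_url_to_multi` borrows its name from the backtrace above, while its signature, `pending`, and `easy_factory` are hypothetical. The multi handle and the easy-handle factory are passed in so the sketch runs without curb installed; it only assumes `multi` responds to `add` and each handle responds to `on_success` / `on_failure`, as Curl::Multi and Curl::Easy do.

```ruby
# Hypothetical helper: take the next URL off `pending`, build a handle for it,
# and register callbacks that queue another URL when this transfer finishes.
# Returns the handle that was added, or nil when no URLs are left.
def add_url_to_multi(multi, pending, easy_factory)
  url = pending.shift
  return nil if url.nil?

  easy = easy_factory.call(url)

  # The same refill step runs on success and on failure. In curb, on_success
  # yields the handle and on_failure yields the handle plus an error code,
  # so the proc accepts and ignores any arguments.
  refill = proc { |*_args| add_url_to_multi(multi, pending, easy_factory) }
  easy.on_success(&refill)
  easy.on_failure(&refill)

  # This add happens while the multi handle may already be inside perform,
  # which is the situation the crash reports in this thread describe.
  multi.add(easy)
  easy
end
```

With curb loaded, this would presumably be driven by seeding a Curl::Multi with a few calls to `add_url_to_multi(multi, urls, ->(u) { Curl::Easy.new(u) })` and then calling `multi.perform`.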

igrigorik avatar Jan 30 '10 03:01 igrigorik

Can you try this patch: http://gist.github.com/290650

taf2 avatar Jan 30 '10 17:01 taf2

Also, in HEAD I changed the order of easy_setup and multi_add. My thought is that maybe bad easy handles are getting into the multi handle: before, if an easy handle was added and then raised an exception, the multi handle could be left in a bad state. Now, if the easy handle is going to raise, it will raise before it gets into the multi handle. Maybe give this a try after the other patch above and please let me know the effects. Isolating this issue down to something repeatable would be ideal.

taf2 avatar Jan 30 '10 21:01 taf2

Grabbed HEAD and applied your patch -- running now. The challenge is and will be in reproducing the actual bug. We use curl-multi interface to drive our downloaders, and sometimes they go for hours before the process falls down. I haven't been able to reproduce this problem reliably before..

Interestingly enough though, I have run extended tests against local endpoints (nginx server) and same code did not fail me there. It's almost like it has something to do with a specific site / URL.

Last but not least: we're actually using 0.4.6.0 in production at the moment. I recently upgraded several of our production boxes to latest gem, but for some reason, our throughput dropped by more than 2x immediately following the upgrade. We couldn't spot any obvious problems after several hours of investigation, and ended up reverting to the older version -- perhaps something to look into. The same segfault problem showed up in 0.6.x release as well though.

ig

igrigorik avatar Jan 31 '10 03:01 igrigorik

Aha, I think we're on the right track. Got a different stack trace this time. Looks like it's SEGV'ing when it tries to invoke the on_failure callback:

/pr/core/app/keystone/downloader.rb:136: [BUG] Segmentation fault
ruby 1.9.2dev (2009-07-18 trunk 24186) [x86_64-linux]

-- control frame ----------
c:0030 p:---- s:0094 b:0094 l:000093 d:000093 CFUNC :on_failure
c:0029 p:0011 s:0091 b:0091 l:000228 d:000090 BLOCK /pr/core/app/keystone/downloader.rb:136
c:0028 p:---- s:0088 b:0088 l:000087 d:000087 FINISH
c:0027 p:---- s:0086 b:0086 l:000085 d:000085 CFUNC :call
c:0026 p:---- s:0084 b:0084 l:000083 d:000083 CFUNC :new

igrigorik avatar Jan 31 '10 05:01 igrigorik

I wonder if this is a bug in that version of Ruby 1.9.2?

taf2 avatar May 09 '10 23:05 taf2

I wonder if the recent refactoring for issue 24 has had additional impact on this issue? Also, I was looking at a few of the bug fixes in most recent versions of libcurl and was thinking they could also be suspect....

taf2 avatar Jun 22 '10 01:06 taf2