grocer icon indicating copy to clipboard operation
grocer copied to clipboard

Segmentation fault in test server

Open tilsammans opened this issue 11 years ago • 20 comments

I've started seeing segmentation faults during test runs of the Grocer server on our CI platform. These do not appear locally, and so far not on production either. The CI and production platform are Ubuntu 12.04 and the Ruby version is identical for all, 2.1.1.

Is this a problem in Grocer or do I have to look elsewhere?

/home/rof/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61: [BUG] Segmentation fault at 0x007f26d801dcb0
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0029 e:000028 CFUNC  :sysread
c:0006 p:0019 s:0025 e:000023 METHOD /home/rof/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61
c:0005 p:0085 s:0021 e:000020 METHOD /home/rof/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:102
c:0004 p:0010 s:0015 e:000014 METHOD /home/rof/cache/bundler/ruby/2.1.0/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21
c:0003 p:0020 s:0008 e:000007 METHOD /home/rof/cache/bundler/ruby/2.1.0/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13
c:0002 p:0026 s:0004 e:000003 BLOCK  /home/rof/cache/bundler/ruby/2.1.0/gems/grocer-0.6.0/lib/grocer/server.rb:25 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
/home/rof/cache/bundler/ruby/2.1.0/gems/grocer-0.6.0/lib/grocer/server.rb:25:in `block (3 levels) in accept'
/home/rof/cache/bundler/ruby/2.1.0/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13:in `each'
/home/rof/cache/bundler/ruby/2.1.0/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21:in `read_notification'
/home/rof/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:102:in `read'
/home/rof/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61:in `fill_rbuff'
/home/rof/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61:in `sysread'

-- C level backtrace information -------------------------------------------
/home/rof/.rvm/rubies/ruby-2.1.1/lib/libruby.so.2.1(+0x1c73dc) [0x7f26ec4a23dc] vm_dump.c:685
/home/rof/.rvm/rubies/ruby-2.1.1/lib/libruby.so.2.1(+0x74410) [0x7f26ec34f410] error.c:307
/home/rof/.rvm/rubies/ruby-2.1.1/lib/libruby.so.2.1(rb_bug+0xb3) [0x7f26ec3500c3] error.c:334
/home/rof/.rvm/rubies/ruby-2.1.1/lib/libruby.so.2.1(+0x148dde) [0x7f26ec423dde] signal.c:704
/lib/x86_64-linux-gnu/libc.so.6(+0x36c30) [0x7f26ebf4bc30] array.c:759
[0x7f26d801dcb0]

tilsammans avatar Oct 13 '14 12:10 tilsammans

I am having the same problem.

/usr/local/rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/openssl/buffering.rb:61: [BUG] Segmentation fault at 0x00000000000000
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0029 e:000028 CFUNC  :sysread
c:0006 p:0019 s:0025 e:000023 METHOD /usr/local/rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/openssl/buffering.rb:61
c:0005 p:0085 s:0021 e:000020 METHOD /usr/local/rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/openssl/buffering.rb:102
c:0004 p:0010 s:0015 e:000014 METHOD /usr/local/rvm/gems/ruby-2.1.2/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21
c:0003 p:0020 s:0008 e:000007 METHOD /usr/local/rvm/gems/ruby-2.1.2/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13
c:0002 p:0026 s:0004 e:000003 BLOCK  /usr/local/rvm/gems/ruby-2.1.2/gems/grocer-0.6.0/lib/grocer/server.rb:25 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
/usr/local/rvm/gems/ruby-2.1.2/gems/grocer-0.6.0/lib/grocer/server.rb:25:in `block (3 levels) in accept'
/usr/local/rvm/gems/ruby-2.1.2/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13:in `each'
/usr/local/rvm/gems/ruby-2.1.2/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21:in `read_notification'
/usr/local/rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/openssl/buffering.rb:102:in `read'
/usr/local/rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/openssl/buffering.rb:61:in `fill_rbuff'
/usr/local/rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/openssl/buffering.rb:61:in `sysread'

nonsense avatar Oct 13 '14 14:10 nonsense

Is this a problem in Grocer or do I have to look elsewhere?

These are the first reports of this; I'm not aware of any outstanding/known bugs. Please do investigate, and if you can create a minimal, reproducible example, I might be able to help track down the problem.

stevenharman avatar Oct 13 '14 15:10 stevenharman

I can't reproduce the problem on Nitrous. I'll see if I can find a VM image where it happens.

tilsammans avatar Oct 14 '14 10:10 tilsammans

Unfortunately I cant reproduce it consistently either... it fails "sometimes" - 1 out of 3-4 test runs.

nonsense avatar Oct 15 '14 08:10 nonsense

tl;dr Start and stop a single Grocer server for all tests.

I encountered this same issue when running specs on my local Ubuntu VM. The developer that pushed the code was able to run through the specs on his mac fine.

I noticed that a Grocer server would be initialized and closed for the specs that were failing:

server = Grocer.server(port:2195)
server.accept
  ...
server.close

When I ran each spec individually, I had no issues. But when I ran the whole suite, I was able to consistently get seg faults.

So I moved the grocer server code into a before and after all block in my spec config:

config.before(:all) do
  @grocer_server = Grocer.server(port:2195)
  @grocer_server.accept
end

config.after(:all) do
  @grocer_server.close
end

And now my specs all pass without seg faults.

bayan avatar Oct 22 '14 11:10 bayan

Thanks for the update! Apparently this solved it for you, since all your tests that use Grocer are inside one spec file. If they are spread through many spec files, config.before(:all) doesn't work. I had to wrap the Grocer server/client objects in a singleton, since I couldn't get an instance variable inside config.before(:suite)

nonsense avatar Oct 22 '14 13:10 nonsense

I had to go with the singleton too.

# spec/support/grocer.rb
class GrocerServer
  include Singleton

  def server
    Grocer.server(port: 2195)
  end
end
RSpec.describe "class" do
  let(:grocer_server) { GrocerServer.instance.server }

  before do
    grocer_server.accept
  end

  after do
    grocer_server.close
  end
end

now the segfaults don't appear any more.

tilsammans avatar Oct 27 '14 16:10 tilsammans

Thanks for all of the investigation, all. While not ideal, I'm glad you found a work around. If anyone would like, we'd certainly love to have someone investigate the root cause so we can fix the server code and not have to rely on these workarounds. Anyone have bandwidth to take a deeper dive?

stevenharman avatar Oct 27 '14 17:10 stevenharman

I am sorry, but I will be unable to do that at the moment. I don't have much experience with APNS and iOS, and unfortunately I don't have the time to dive into this now. In case things change on my end, I will create a PR to let you know.

nonsense avatar Oct 28 '14 09:10 nonsense

I am encountering the same error after adding rspec tests for push notifications.

$ rspec
....................................................................................../home/dan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61: [BUG] Segmentation fault at 0x00000000000001
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0029 e:000028 CFUNC  :sysread
c:0006 p:0019 s:0025 e:000023 METHOD /home/dan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61
c:0005 p:0085 s:0021 e:000020 METHOD /home/dan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:102
c:0004 p:0010 s:0015 e:000014 METHOD /home/dan/.rvm/gems/ruby-2.1.1/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21
c:0003 p:0020 s:0008 e:000007 METHOD /home/dan/.rvm/gems/ruby-2.1.1/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13
c:0002 p:0026 s:0004 e:000003 BLOCK  /home/dan/.rvm/gems/ruby-2.1.1/gems/grocer-0.6.0/lib/grocer/server.rb:25 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
/home/dan/.rvm/gems/ruby-2.1.1/gems/grocer-0.6.0/lib/grocer/server.rb:25:in `block (3 levels) in accept'
/home/dan/.rvm/gems/ruby-2.1.1/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13:in `each'
/home/dan/.rvm/gems/ruby-2.1.1/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21:in `read_notification'
/home/dan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:102:in `read'
/home/dan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61:in `fill_rbuff'
/home/dan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/openssl/buffering.rb:61:in `sysread'

-- C level backtrace information -------------------------------------------
[1]    29349 segmentation fault (core dumped)  rspec

I have tried the above suggestion:

pusher_notification_spec.rb

require 'timeout'
require 'rails_helper'

describe "apple push notifications" do
  let(:grocer_server) { GrocerServer.instance.server }

  before do
    grocer_server.accept
  end

  after do
    grocer_server.close
  end

  # my tests
end

...and added the spec/support/grocer.rb file

All my grocer tests are in one file. When I run the file individually, there is no issues. When I run the whole suite I get a segfault every time. Does anyone have any further suggestions? At the moment my only option is to comment out the tests, as we need to run the whole suite regularly.

danbartlett avatar Nov 28 '14 14:11 danbartlett

Segmentation failure is also happening on my machine and also on CI platforms.

/home/varun/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/openssl/buffering.rb:61: [BUG] Segmentation fault at 0x736f7288
ruby 2.1.5p273 (2014-11-13 revision 48405) [i686-linux]

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0029 e:000028 CFUNC  :sysread
c:0006 p:0019 s:0025 e:000023 METHOD /home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/openssl/buffering.rb:61
c:0005 p:0085 s:0021 e:000020 METHOD /home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/openssl/buffering.rb:102
c:0004 p:0010 s:0015 e:000014 METHOD /home/kiprosh007/.rvm/gems/ruby-2.1.5@gogovan/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21
c:0003 p:0020 s:0008 e:000007 METHOD /home/kiprosh007/.rvm/gems/ruby-2.1.5@gogovan/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13
c:0002 p:0026 s:0004 e:000003 BLOCK  /home/kiprosh007/.rvm/gems/ruby-2.1.5@gogovan/gems/grocer-0.6.0/lib/grocer/server.rb:25 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

/home/kiprosh007/.rvm/gems/ruby-2.1.5@gogovan/gems/grocer-0.6.0/lib/grocer/server.rb:25:in `block (3 levels) in accept'
/home/kiprosh007/.rvm/gems/ruby-2.1.5@gogovan/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:13:in `each'
/home/kiprosh007/.rvm/gems/ruby-2.1.5@gogovan/gems/grocer-0.6.0/lib/grocer/notification_reader.rb:21:in `read_notification'
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/openssl/buffering.rb:102:in `read'
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/openssl/buffering.rb:61:in `fill_rbuff'
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/ruby/2.1.0/openssl/buffering.rb:61:in `sysread'

-- C level backtrace information -------------------------------------------
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/libruby.so.2.1(+0x1ea390) [0xb760b390] vm_dump.c:690
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/libruby.so.2.1(+0x6c18d) [0xb748d18d] error.c:312
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/libruby.so.2.1(rb_bug+0x40) [0xb748dc40] error.c:339
/home/kiprosh007/.rvm/rubies/ruby-2.1.5/lib/libruby.so.2.1(+0x159883) [0xb757a883] signal.c:812

All the backtraces have one thing in common - Linux machine (I am sure you guys have also observed it). Not sure if that is the cause of this. Does anyone have any further suggestions?

varunlalan avatar Jan 29 '15 13:01 varunlalan

@varunlalan I was going to link to some Issues I've found in other Ruby gems that point to a problem with OpenSSL and or the certificiates, but no I see you've already opened an issue on OpenSSL. :smile: I'm at a loss for where to start as I've never seen this error myself, and am unsure how to reproduce it.

stevenharman avatar Jan 29 '15 16:01 stevenharman

I have one test case when I am using mock server and one test case when I am using Apple development server. The segfault happens only when these two tests are executed together. I removed @server.close and it resolved the issue. Let's see whether it does not have any further negative consequences.

adamniedzielski avatar Mar 23 '15 09:03 adamniedzielski

I was having the same problem, and the only solution I came across was mocking the Grocer::Pusher class as follows:

module Grocer
  class Pusher
    class_attribute :notifications

    def push(notification)
      self.class.notifications ||= []
      self.class.notifications << notification
    end
  end
end

I'm accessing stored notifications using Grocer::Pusher.notifications and it works for my case. I understand that this isn't the right way of testing the grocer push notifications, but I can't figure out any other solution that works reliably. I really appreciate if somebody could please tell a reliable way of handling this error.

kushkella avatar Jun 10 '15 06:06 kushkella

Hey, I'm experiencing a crash in a Sidekiq task that pushes to notifications using APN.

When this segfault happens our whole Sidekiq instance hangs, and I have to restart a worker. It seems like this is happening because Sidekiq depends on Ruby code being threadsafe. Is Grocer considered to be threadsafe?

It looks like the relevant stack trace is this:

/home/deploy/app/shared/bundle/ruby/2.2.0/gems/grocer-0.6.1/lib/grocer/ssl_connection.rb:43: [BUG] Segmentation fault at 0x007fb2886bbcf0

-- Control frame information -----------------------------------------------
c:0046 p:---- s:0210 e:000209 CFUNC  :connect
c:0045 p:0216 s:0207 e:000206 METHOD /home/deploy/app/shared/bundle/ruby/2.2.0/gems/grocer-0.6.1/lib/grocer/ssl_connection.rb:43
c:0044 p:0019 s:0202 e:000201 METHOD /home/deploy/app/shared/bundle/ruby/2.2.0/gems/grocer-0.6.1/lib/grocer/connection.rb:29
c:0043 p:0014 s:0199 e:000198 METHOD /home/deploy/app/shared/bundle/ruby/2.2.0/gems/grocer-0.6.1/lib/grocer/connection.rb:55
c:0042 p:0007 s:0194 e:000193 METHOD /home/deploy/app/shared/bundle/ruby/2.2.0/gems/grocer-0.6.1/lib/grocer/connection.rb:23
c:0041 p:0045 s:0190 e:000189 METHOD /home/deploy/.rvm/rubies/ruby-2.2.2/lib/ruby/2.2.0/delegate.rb:87
c:0040 p:0013 s:0182 e:000181 METHOD /home/deploy/app/shared/bundle/ruby/2.2.0/gems/grocer-0.6.1/lib/grocer/pusher.rb:8

-- Other runtime information -----------------------------------------------

* Loaded script: sidekiq 3.4.1 app [0 of 25 busy]

lukaslozovski avatar Sep 23 '15 15:09 lukaslozovski

I'm experiencing the same issue as @lukaslozovski running grocer from within sidekiq. Stacktrace is identical.

robacarp avatar Nov 03 '15 22:11 robacarp

I have a suspicion that that error is caused by this part https://github.com/grocer/grocer/blob/master/lib/grocer/ssl_connection.rb#L39-l43

It might be that in the context of sidekiq, that piece of code gets run multiple times, and once a @ssl.connect gets called once, another call causes a segfault.

My solution was to move to only sending one notification at a time, which defeats most of the power of sidekiq.

lukaslozovski avatar Nov 03 '15 22:11 lukaslozovski

So I was having the same issue where specs where running into a seg fault in CI. I just removed my acceptance tests to get passed the issue that was really slowing down our development cycle 😞 . It seems this isn't really an issue with grocer but with openssl. Can anyone confirm thats the case?

For what I see the bug is caused by this issue in ruby https://bugs.ruby-lang.org/issues/12292. Seems we have to wait until ruby 2.4 for the issue to be resolved. In the meantime I will try to move my project to ruby 2.4.0-preview1 see if that solves the issue.

Baril27 avatar Sep 09 '16 22:09 Baril27

@Baril27 Nice find. FWIW, the Backport field on that bug says it's been back ported to Ruby 2.3, and it's apparently awaiting a back port to 2.2. So perhaps we'll see it in the next Ruby 2.3 point release!

bug__12292__race_between_openssl__ssl__sslsocket_stop_and__connect_can_cause_a_segmentation_fault_-_ruby_trunk_-_ruby_issue_tracking_system

UPDATE: We will, in fact, see it in the next Ruby point release, v 2.3.2: https://github.com/ruby/ruby/blob/ruby_2_3/ChangeLog

stevenharman avatar Sep 10 '16 18:09 stevenharman

@stevenharman Thank you for pointing that out. Will wait until the 2.3.2 release.

Baril27 avatar Sep 11 '16 18:09 Baril27