Resque workers die with Redis reconnect errors after kill -HUP

Open wollkind opened this issue 11 years ago • 3 comments

After I issue kill -HUP to the master process, I get an endless loop of workers that fail to start with the following error:

Failed to start worker : #<Redis::InheritedError: Tried to use a connection from a child process without reconnecting. You need to reconnect to Redis after forking.>

I'm trying to dig deeper and find out what's going on, but I'm not getting very far. The initial batch of workers starts up and processes jobs just fine, but after a HUP nothing can start.

wollkind, Dec 26 '13 18:12

Browsing through old closed issues here, I found another one that references this. It suggests doing something like:

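task "resque:pool:setup" do
  # Close the master's database connection before any workers are forked.
  ActiveRecord::Base.connection.disconnect!
  Resque::Pool.after_prefork do
    # Runs in each worker after the fork: open fresh connections there.
    ActiveRecord::Base.establish_connection
    Resque.redis.client.reconnect
  end
end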

This seems to fix it for me, but I'm not sure why the initial workers would work fine and subsequent ones would not.
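For anyone trying to picture what the error message is guarding against, here is a minimal sketch of a connection that remembers which process opened it and refuses to be used from a forked child until it reconnects. `PidGuardedConnection`, its `InheritedError`, and `ping_in_child` are hypothetical names made up for illustration; this is not redis-rb's code, and it does not explain why the first batch of workers started cleanly.

```ruby
# Hypothetical stand-in for a Redis client; NOT the real redis-rb API.
class InheritedError < StandardError; end

class PidGuardedConnection
  def initialize
    connect
  end

  # Record which process owns the (pretend) socket.
  def connect
    @owner_pid = Process.pid
  end
  alias reconnect connect

  # Refuse to run if the current process is not the one that connected.
  def ping
    unless Process.pid == @owner_pid
      raise InheritedError, "connection inherited from pid #{@owner_pid}"
    end
    "PONG"
  end
end

# Fork a child, optionally reconnect in it, and report what ping did there.
def ping_in_child(conn, reconnect:)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    conn.reconnect if reconnect
    result = begin
      conn.ping
    rescue InheritedError => e
      e.class.name
    end
    writer.write(result)
    writer.close
    exit!(0)
  end
  writer.close
  output = reader.read
  reader.close
  Process.wait(pid)
  output
end

conn = PidGuardedConnection.new            # opened in the parent, before forking
puts ping_in_child(conn, reconnect: false) # child reuses the parent's connection
puts ping_in_child(conn, reconnect: true)  # child reconnects first
```

In this sketch the first child reports the error and the second one succeeds, which is the same shape as reconnecting inside `Resque::Pool.after_prefork` in the snippet above.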

wollkind, Dec 26 '13 18:12

I had exactly the same problem, and @wollkind's solution works for me too.

It would be good to at least add a short note about this issue to the README. Maybe a FAQ entry?

pbc, Mar 10 '14 14:03

I believe I'm running into the same issue. I got here while trying to configure logrotate properly: I have its "lastaction" script set to send the HUP signal to the resque-pool manager PID, which is what the resque-pool README tells us to do.

My workers restart, but then are immediately reaped and go away.

For example, when tailing resque-pool.stdout.log after logrotate has run, I just see this again and again:

resque-pool-worker[current][30047]: Starting worker ip-10-164-16-239:30047:*
resque-pool-worker[current][30050]: Starting worker ip-10-164-16-239:30050:compilers
resque-pool-worker[current][30055]: Starting worker ip-10-164-16-239:30055:image_maker_queue
resque-pool-worker[current][30058]: Starting worker ip-10-164-16-239:30058:image_maker_queue
resque-pool-worker[current][30063]: Starting worker ip-10-164-16-239:30063:mailers
resque-pool-worker[current][30066]: Starting worker ip-10-164-16-239:30066:mailers
resque-pool-worker[current][30071]: Starting worker ip-10-164-16-239:30071:*
resque-pool-worker[current][30075]: Starting worker ip-10-164-16-239:30075:*
resque-pool-manager[current][31154]: Reaped resque worker[30047] (status: 0) queues: compilers,image_maker_queue,mailers
resque-pool-manager[current][31154]: Reaped resque worker[30050] (status: 0) queues: compilers
resque-pool-manager[current][31154]: Reaped resque worker[30055] (status: 0) queues: image_maker_queue
resque-pool-manager[current][31154]: Reaped resque worker[30058] (status: 0) queues: image_maker_queue
resque-pool-manager[current][31154]: Reaped resque worker[30063] (status: 0) queues: mailers
resque-pool-manager[current][31154]: Reaped resque worker[30066] (status: 0) queues: mailers
resque-pool-manager[current][31154]: Reaped resque worker[30071] (status: 0) queues: compilers,image_maker_queue,mailers
resque-pool-manager[current][31154]: Reaped resque worker[30075] (status: 0) queues: compilers,image_maker_queue,mailers

justinperkins, Apr 03 '14 16:04