delayed_job
Delayed::Job Always Force Kills on Restart
Restarting delayed jobs always has to forcefully kill the existing process:
RAILS_ENV=production bin/delayed_job restart --pid-dir=/srv/app/shared/pids/
delayed_job: trying to stop process with pid 2929...
delayed_job: process with pid 2929 won't stop, we forcefully kill it...
delayed_job: process with pid 2929 successfully stopped.
I am running rails (4.0.0) and delayed_job_active_record (4.0.0).
I'm not sure if this is a bug or something I am doing. Any ideas on what the problem can be?
Thanks, Tom
I have exactly the same problem.
Same here, in my development environment. The delayed_jobs table is empty. Rails 4, Ruby 2.0.
Anyone have any solutions?
+1
That is the daemons gem being overly aggressive about killing the process. DJ will wait for the current job to finish before it exits. The bad news is that means daemons is force killing an active job at some random point in its execution.
If the jobs table is empty, you are in the best-case scenario: the worker was in the middle of the sleep delay between checks for new jobs when the daemons gem force killed it. However, even then the process isn't able to run any at_exit cleanup, like properly closing open database connections.
We will need to see if we can tell the daemons gem to lay off and let us finish.
@albus522 That sounds great, but hasn't been my experience. Even with absolutely no jobs in the table, it still has to kill the process.
@tomrossi7 Did you read the second paragraph of my response?
@albus522 Sorry, I'm not trying to be a jerk, I just don't understand it. I'm not sure why the daemons gem needs to lay off. Are you saying it needs to give the process even more time to wake up so it can stop?
Yes. As best I can tell, newer daemons builds give the process 20 seconds to exit before hard terminating it. If you have no jobs running and have the default DJ configuration, that is fine, as the sleep_delay is 5 seconds. So DJ will typically exit just fine within that 20 second window.
However if the user modifies the sleep_delay or a job is running, that window can be much longer than 20 seconds. The default max run time is 4 hours, and both the max run time and sleep_delay can be set to anything the user wants.
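To make the timing argument concrete, here is a minimal sketch of the arithmetic. The helper names are hypothetical and are not part of delayed_job or daemons; the 20 second window, the 5 second default sleep_delay, and the 4 hour default max run time are the figures quoted above. It assumes the worker only notices a stop request once its current sleep or job has finished.

```ruby
# Hypothetical helpers for illustration; not part of delayed_job or daemons.
DAEMONS_FORCE_KILL_WAITTIME = 20 # seconds, the daemons default cited in this thread

# Longest time (in seconds) a worker might need before it notices a stop
# request: the full run time of a job in progress, or one full sleep between polls.
def worst_case_exit_seconds(sleep_delay:, job_running: false, max_run_time: 4 * 60 * 60)
  job_running ? max_run_time : sleep_delay
end

# True when the worst case exceeds the window daemons allows before a hard kill.
def may_be_force_killed?(sleep_delay:, job_running: false, max_run_time: 4 * 60 * 60,
                         waittime: DAEMONS_FORCE_KILL_WAITTIME)
  worst_case_exit_seconds(sleep_delay: sleep_delay, job_running: job_running,
                          max_run_time: max_run_time) > waittime
end

puts may_be_force_killed?(sleep_delay: 5)                    # idle worker, default delay
puts may_be_force_killed?(sleep_delay: 60)                   # idle worker, 60 second delay
puts may_be_force_killed?(sleep_delay: 5, job_running: true) # job in progress
```

This is a worst-case check only: with a 60 second delay, a stop request that happens to arrive in the last 20 seconds of the sleep would still exit cleanly.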
So, in the case of DJ, the decision to hard terminate the worker should never be made by the daemons gem as it doesn't know what it should do. The daemons gem has been a continual source of headaches for us, but unfortunately we haven't found anything better yet.
Thank you for this explanation. Our sleep delay is 60 seconds, so it explains the problem completely.
Would love to see a configurable wait time on the stop script. In the meantime we'll just reduce the sleep delay.
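For anyone taking the same workaround, reducing the sleep delay might look roughly like the initializer below. The file name is an assumption, and the value of 5 is simply the default mentioned earlier in the thread; anything comfortably under the 20 second window should have the same effect for an idle worker.

```ruby
# config/initializers/delayed_job_config.rb (file name is an assumption)
#
# Poll for new jobs every 5 seconds, so an idle worker wakes up and notices
# a stop request well inside the 20 second window before daemons force kills it.
Delayed::Worker.sleep_delay = 5
```

This does not help when a job is actually running during the restart, since the worker still waits for that job to finish.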
Ah! I lowered my sleep delay and now it can restart without killing the process!
https://github.com/collectiveidea/delayed_job/pull/916
daemons gem seems to set it to 20 seconds by default:
https://github.com/thuehlinger/daemons/blob/0ea14143c375f0bec117eb2a7ae2f78623b83867/lib/daemons/application.rb#L37
Though it seems it can be specified with the force_kill_waittime parameter from:
https://github.com/collectiveidea/delayed_job/blob/73bd1b50e719b336b70fcbb8dc4a37ec9b2f6f35/lib/delayed/command.rb#L123
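As a rough sketch of what passing that parameter could look like when calling the daemons gem directly: the process name, the pid directory, and the 120 second value below are illustrative assumptions, and this does not show how (or whether) delayed_job's own command forwards the option.

```ruby
require 'daemons'

# Assumption: daemons reads :force_kill_waittime (in seconds) when stopping
# the process, in place of its 20 second default.
options = {
  dir_mode: :normal,
  dir: '/srv/app/shared/pids', # pid directory taken from the command in the first post
  force_kill_waittime: 120     # illustrative value, longer than a 60 second sleep_delay
}

# 'example_worker' is a hypothetical process name. Run the script with
# start, stop, or restart as its argument.
Daemons.run_proc('example_worker', options) do
  loop { sleep 5 } # stand-in for real work
end
```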
I think this issue can be closed since it's not an issue with the delayed_job library, but rather with whatever process manager you have that is waiting for delayed_job to exit cleanly after a SIGINT before forcefully killing it with a SIGTERM or SIGKILL.