god
incorrect status "start" when the process is "up"
Scenario
- My delayed_job workers have been killed.
- I run the following (normally without the -D; it is here to show the log output):
god -c /etc/god.conf -D
- I get the following:
I [2014-05-22 00:05:19] INFO: delayed_job.0 start: RAILS_ENV=staging4 bundle exec ./script/delayed_job -n 2 start
I [2014-05-22 00:05:45] INFO: delayed_job.0 moved 'init' to 'start'
I [2014-05-22 00:05:45] INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:05:45] INFO: delayed_job.0 [ok] tries within bounds [1/5] (Tries)
I [2014-05-22 00:05:53] INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:05:53] INFO: delayed_job.0 [ok] tries within bounds [2/5] (Tries)
I [2014-05-22 00:06:01] INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:01] INFO: delayed_job.0 [ok] tries within bounds [3/5] (Tries)
I [2014-05-22 00:06:09] INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:09] INFO: delayed_job.0 [ok] tries within bounds [4/5] (Tries)
I [2014-05-22 00:06:17] INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:17] INFO: delayed_job.0 [trigger] tries exceeded [5/5] (Tries)
I [2014-05-22 00:06:17] INFO: delayed_job.0 move 'start' to 'start'
I [2014-05-22 00:06:17] INFO: delayed_job.0 before_start: no pid file to delete (CleanPidFile)
I [2014-05-22 00:06:17] INFO: delayed_job.0 start: RAILS_ENV=staging4 bundle exec ./script/delayed_job -n 2 start
When I run god status, the watch appears to be stuck in the start state:
~/apps/main_app$ god status
delayed_job:
delayed_job.0: start
But the PIDs are there and the processes are running. Why doesn't god think this is 'up'?
Here is my god.conf:
RAILS_ROOT = "/home/xyz/apps/main_app/current"
RUBY_BIN = '/home/xyz/.rvm/rubies/ruby-2.0.0-p247/bin/ruby'
# /home/backops/apps/main_app/current/god/staging4/delayed_job.god
1.times do |num|
  God.watch do |w|
    w.name = "delayed_job.#{num}"
    w.group = 'delayed_job'
    w.interval = 300.seconds
    w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid")
    w.dir = RAILS_ROOT
    w.start = "RAILS_ENV=staging4 bundle exec ./script/delayed_job -n 2 start "
    ## NOTE: do not specify uid or gid when not a root user
    # https://github.com/mojombo/god/issues/43#issuecomment-1225470
    # w.uid = 'xyz'
    # w.gid = 'xyz'

    # clean pid files before start if necessary
    w.behavior(:clean_pid_file)

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 1500.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end
  end
end
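
For readers following the log, here is a toy model of the loop it shows: in the start state, the process_running check keeps reporting "not running" and the tries failsafe sends the watch back to start after five attempts. This is a simplified sketch and not god's actual code; the method name `simulate_start_state` and the `pid_resolved` flag are made up for illustration, and it assumes the check fails whenever god cannot resolve a pid from w.pid_file.

```ruby
# Toy model only (not god's implementation): replays the checks made while a
# watch sits in the :start state with the transitions from the config above.
# pid_resolved is a hypothetical flag: true if a live pid can be read from
# w.pid_file, false otherwise.
def simulate_start_state(pid_resolved:, max_tries: 5)
  log = []
  (1..max_tries).each do |attempt|
    if pid_resolved
      log << "process is running -> move 'start' to 'up'"
      return [:up, log]
    end
    log << "process is not running"
    if attempt == max_tries
      # failsafe: c.transition = :start runs the start command again
      log << "tries exceeded [#{attempt}/#{max_tries}] -> move 'start' to 'start'"
    else
      log << "tries within bounds [#{attempt}/#{max_tries}]"
    end
  end
  [:start, log]
end

if __FILE__ == $PROGRAM_NAME
  state, log = simulate_start_state(pid_resolved: false)
  puts log
  puts "final state: #{state}"
end
```

Under that assumption, the watch never leaves start even though the workers themselves are alive, which would match the god status output above.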
When this happens, could you check whether File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid") exists?
It sounds like it doesn't.
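
One way to run that check is a few lines of plain Ruby that compare the path from the config with whatever pid files are actually on disk. This is only a sketch: the helper name `pid_file_report` is made up here, and it assumes the pid files live under tmp/pids as in the config above.

```ruby
# Sketch: report whether the pid file named in the god config exists and list
# the pid files that are actually present. pid_file_report is a hypothetical
# helper, not part of god or delayed_job.
def pid_file_report(rails_root, num)
  pid_dir  = File.join(rails_root, "tmp/pids")
  expected = File.join(pid_dir, "delayed_job#{num}.pid") # same expression as w.pid_file
  {
    expected: expected,
    exists:   File.exist?(expected),
    present:  Dir.glob(File.join(pid_dir, "*.pid")).sort
  }
end

if __FILE__ == $PROGRAM_NAME
  report = pid_file_report("/home/xyz/apps/main_app/current", 0)
  puts "god is watching: #{report[:expected]} (exists: #{report[:exists]})"
  puts "pid files found: #{report[:present].inspect}"
end
```

If the expected path is missing while other .pid files are listed, the file name in w.pid_file probably does not match what the start command writes.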
After some debugging, I found that the PID files did exist, but I think god was looking for them in the wrong place. I removed w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid") and everything started working. I think there is still an underlying issue, but I found a way around it.
BTW: after putting some print statements in the gem, I found that right before the check for !active?, pid_file was nil (or an empty string).
Whether to specify w.pid_file depends on whether god should be responsible for daemonizing the process, or whether the w.start command is going to daemonize (and write the pid file) itself.
If you specify w.pid_file, god will expect the w.start command to create it.
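
To make the two cases concrete, here is a rough config sketch. It is not taken from the reporter's setup: the watch name in option A, the use of rake jobs:work as a foreground command, and the pid file name in option B are all assumptions to verify locally. Use one option or the other, not both.

```ruby
# Sketch only: pick ONE of the two watches below.
RAILS_ROOT = "/home/xyz/apps/main_app/current"

# Option A: god daemonizes the process. No w.pid_file is set, so the start
# command must stay in the foreground. `rake jobs:work` is assumed here as
# the foreground way to run a delayed_job worker.
God.watch do |w|
  w.name  = "delayed_job.foreground" # hypothetical watch name
  w.dir   = RAILS_ROOT
  w.start = "RAILS_ENV=staging4 bundle exec rake jobs:work"
  w.keepalive
end

# Option B: the start command daemonizes itself and writes its own pid file.
# w.pid_file must be the exact path that command writes. The file name below
# is an assumption; check it against the actual contents of tmp/pids.
God.watch do |w|
  w.name     = "delayed_job.0"
  w.dir      = RAILS_ROOT
  w.start    = "RAILS_ENV=staging4 bundle exec ./script/delayed_job start"
  w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job.pid")
  w.behavior(:clean_pid_file)
  w.keepalive
end
```

In option B, a mismatch between w.pid_file and the file the command really writes would leave god unable to see the running process.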
I could be wrong, but I think this was my use case:
- With w.pid_file and w.start specified and no PIDs running, I call:
god -c /etc/god.conf
- Then the PIDs are in the location specified in w.pid_file. w.start was the command that created the PIDs (is that true?).
- pid_file was nil (or an empty string), judging from the debugging I did.
If you think I messed up someplace else, feel free to close. Otherwise, I hope this helps.
Either way, everything is working now that I have removed w.pid_file. The pid files are someplace else, but I'm OK with that.