incorrect status "start" when the process is "up"

Open drhenner opened this issue 11 years ago • 5 comments

Scenario

  • All delayed_job processes have been killed.
  • I run god -c /etc/god.conf -D (normally without -D; it is only here to show the output).
  • I get the following output:
I [2014-05-22 00:05:19]  INFO: delayed_job.0 start: RAILS_ENV=staging4  bundle exec ./script/delayed_job -n 2 start 
I [2014-05-22 00:05:45]  INFO: delayed_job.0 moved 'init' to 'start'
I [2014-05-22 00:05:45]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:05:45]  INFO: delayed_job.0 [ok] tries within bounds [1/5] (Tries)
I [2014-05-22 00:05:53]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:05:53]  INFO: delayed_job.0 [ok] tries within bounds [2/5] (Tries)
I [2014-05-22 00:06:01]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:01]  INFO: delayed_job.0 [ok] tries within bounds [3/5] (Tries)
I [2014-05-22 00:06:09]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:09]  INFO: delayed_job.0 [ok] tries within bounds [4/5] (Tries)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 [ok] process is not running (ProcessRunning)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 [trigger] tries exceeded [5/5] (Tries)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 move 'start' to 'start'
I [2014-05-22 00:06:17]  INFO: delayed_job.0 before_start: no pid file to delete (CleanPidFile)
I [2014-05-22 00:06:17]  INFO: delayed_job.0 start: RAILS_ENV=staging4  bundle exec ./script/delayed_job -n 2 start

When I run god status, the process appears to be stuck in the start state:

~/apps/main_app$ god status
delayed_job:
  delayed_job.0: start

But the PIDs are there and the processes are running. Why doesn't god consider this 'up'?

Here is my god.conf

RAILS_ROOT = "/home/xyz/apps/main_app/current"
RUBY_BIN   = '/home/xyz/.rvm/rubies/ruby-2.0.0-p247/bin/ruby'
# /home/backops/apps/main_app/current/god/staging4/delayed_job.god

1.times do |num|
  God.watch do |w|
    w.name = "delayed_job.#{num}"
    w.group = 'delayed_job'
    w.interval = 300.seconds
    w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid")
    w.dir      = RAILS_ROOT
    w.start    = "RAILS_ENV=staging4  bundle exec ./script/delayed_job -n 2 start "

    ##  NOTE: do not specify uid or gid when not a root user
    #   https://github.com/mojombo/god/issues/43#issuecomment-1225470
    # w.uid = 'xyz'
    # w.gid = 'xyz'

    # clean pid files before start if necessary
    w.behavior(:clean_pid_file)

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 1500.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end
  end
end

drhenner avatar May 22 '14 00:05 drhenner

When this happens, could you check whether File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid") exists?

It sounds like it doesn't.
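One way to run that check is sketched below. It is a minimal, standalone helper, not part of god: the name pid_file_status is hypothetical, and it only reports whether the file exists and whether the PID inside it belongs to a live process.

```ruby
# Hypothetical helper (not part of god): report the state of a pid file.
# Returns :missing, :empty, :dead, or :alive.
def pid_file_status(path)
  return :missing unless File.exist?(path)

  pid = File.read(path).strip.to_i
  return :empty if pid <= 0 # blank or non-numeric contents

  Process.kill(0, pid) # signal 0 sends nothing; it only checks the PID exists
  :alive
rescue Errno::ESRCH
  :dead # no process with that PID
rescue Errno::EPERM
  :alive # the process exists but is owned by another user
end
```

Calling pid_file_status with the exact path given to w.pid_file, from the same user and working directory that god runs under, should show whether god and the start command agree on where the file lives.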

eric avatar May 22 '14 01:05 eric

After some debugging, I found that the PID files did exist, but I think god was looking for them in the wrong place. I removed w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job#{num}.pid") and everything started working. I think there is still an underlying issue, but I found a way around it.

drhenner avatar May 22 '14 18:05 drhenner

BTW: after putting some print statements in the gem, I found that right before the check for !active?, pid_file was nil (or an empty string).

drhenner avatar May 22 '14 18:05 drhenner

Whether or not to specify w.pid_file depends on whether god should be responsible for daemonizing the process, or whether the w.start command is going to daemonize (and write the pid file) itself.

If you specify w.pid_file, god will expect the w.start command to create it.
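As a rough sketch of the two modes (a config fragment, with the transition conditions omitted for brevity): the watch names below are hypothetical, and two details are assumptions rather than something confirmed in this thread. First, that delayed_job started with -n 2 names its pid files delayed_job.0.pid and delayed_job.1.pid. Second, that ./script/delayed_job run keeps the worker in the foreground.

```ruby
RAILS_ROOT = "/home/xyz/apps/main_app/current"

# Mode 1: the start command daemonizes itself and writes its own pid file.
# god only reads the file, so the path must match what the command writes.
God.watch do |w|
  w.name     = "delayed_job.self_daemonized" # hypothetical name
  w.dir      = RAILS_ROOT
  w.start    = "RAILS_ENV=staging4 bundle exec ./script/delayed_job -n 2 start"
  # Assumption: with -n 2 the first worker writes delayed_job.0.pid
  # (note the dot), not delayed_job0.pid.
  w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job.0.pid")
end

# Mode 2: no w.pid_file, so god daemonizes the command and tracks the PID
# itself. The start command must then stay in the foreground.
God.watch do |w|
  w.name  = "delayed_job.god_daemonized" # hypothetical name
  w.dir   = RAILS_ROOT
  # Assumption: "run" keeps delayed_job in the foreground.
  w.start = "RAILS_ENV=staging4 bundle exec ./script/delayed_job run"
end
```

Only one of the two watches would be used for a given worker. In mode 1 a path mismatch leaves god waiting in start even though the workers are up, which would match the symptom reported here.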

eric avatar May 22 '14 18:05 eric

I could be wrong, but I think this was my use case:

With w.pid_file and w.start specified and no processes running, I call:

god -c /etc/god.conf
  • Then the PID files are in the location specified in w.pid_file.
  • w.start was the command that created the PID files (is that true?).
  • pid_file was nil (or an empty string), judging from the debugging I did.

If you think I messed up somewhere else, feel free to close. Otherwise, I hope this helps.

Either way, everything is working now that I removed w.pid_file. The pid files end up somewhere else, but I'm OK with that.

drhenner avatar May 22 '14 21:05 drhenner