God gets stuck in "start" state and never restarts process

Open tetherit opened this issue 11 years ago • 2 comments

The output from god looks as follows:

I [2012-10-17 19:08:34]  INFO: living-recorder [ok] process is not running (ProcessRunning)
I [2012-10-17 19:08:39]  INFO: living-recorder [ok] process is not running (ProcessRunning)
I [2012-10-17 19:08:44]  INFO: living-recorder [ok] process is not running (ProcessRunning)
I [2012-10-17 19:08:49]  INFO: living-recorder [ok] process is not running (ProcessRunning)
I [2012-10-17 19:08:54]  INFO: living-recorder [ok] process is not running (ProcessRunning)
I [2012-10-17 19:08:59]  INFO: living-recorder [ok] process is not running (ProcessRunning)

When I look at god status, I see:

kitchen:
  kitchen-detector: up
  kitchen-proxy: up
  kitchen-recorder: up
living:
  living-detector: up
  living-proxy: up
  living-recorder: start
...

Notice the recorder is in "start" state, where it stays forever.

Looking at the log file, I can see that the process failed to start because the proxy (also started by god) was not yet running, but god makes no attempt to restart the failed process.

This is what's in my god file:

...

Cameras.each do |camera|

  ...

  God.watch do |w|
    w.name = "#{camera[:name]}-recorder"
    w.group = camera[:name]
    w.log = "#{APP_ROOT}/logs/#{camera[:name]}-recorder.log"
    w.start = "ffmpeg -i #{proxy_url} -loglevel warning" \
              " -analyzeduration 0 -map 0 -codec:v copy" \
              " -codec:a libfaac -ar 44100 -ab 64k" \
              " -f segment -segment_time 60 -segment_wrap 10" \
              " -segment_list '#{APP_ROOT}/output/#{camera[:name]}.csv'" \
              " -y '#{APP_ROOT}/output/#{camera[:name]}_%02d.mkv'"
    w.keepalive(
      :memory_max => 100.megabytes,
      :cpu_max => 50.percent)

  end

 ...

end

If I then run:

god monitor living-recorder

It starts the process up no problem and I see this in the logs:

I [2012-10-17 19:12:19]  INFO: living-recorder move 'start' to 'init'
I [2012-10-17 19:12:19]  INFO: living-recorder moved 'start' to 'init'
I [2012-10-17 19:12:19]  INFO: living-recorder [trigger] process is not running (ProcessRunning)
I [2012-10-17 19:12:19]  INFO: living-recorder move 'init' to 'start'
I [2012-10-17 19:12:19]  INFO: living-recorder start: ffmpeg -i rtsp://127.0.0.1:10103/proxyStream -loglevel warning -analyzeduration 0 -map 0 -codec:v copy -codec:a libfaac -ar 44100 -ab 64k -f segment -segment_time 60 -segment_wrap 10 -segment_list '/Users/hackeron/Development/output/living.csv' -y '/Users/hackeron/Development/output/living_%02d.mkv'
I [2012-10-17 19:12:19]  INFO: living-recorder moved 'init' to 'start'
I [2012-10-17 19:12:19]  INFO: living-recorder move 'start' to 'up'
I [2012-10-17 19:12:19]  INFO: living-recorder registered 'proc_exit' event for pid 11852
I [2012-10-17 19:12:19]  INFO: living-recorder moved 'start' to 'up'

What makes god get stuck in the "start" state until I manually run god monitor?

I'm running god 0.13.1 on ruby 1.9.3p194 on Mountain Lion.

tetherit avatar Oct 17 '12 18:10 tetherit

You'll need to use something other than keepalive (configure the events manually yourself).

God will not re-transition to start on its own; it is waiting for your process to indicate that it has started.

Your transition block should contain a tries condition that resets the transition to start again, to ensure your process gets started:

# determine when process has finished starting
w.transition([:start, :restart], :up) do |on|
  ...

  # failsafe
  on.condition(:tries) do |c|
    c.times = 5
    c.transition = :start
  end
end
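
For reference, here is a rough sketch of what configuring the events by hand, without keepalive, might look like. It follows the transition pattern shown in god's documentation. The helper name configure_lifecycle and its max_tries and interval parameters are made up for illustration and are not part of god. It also assumes the event system is available, since process_exits is an event condition.

```ruby
# Hypothetical helper (not part of god): wires up a watch's lifecycle
# transitions by hand instead of calling w.keepalive.
def configure_lifecycle(w, max_tries: 5, interval: 5)
  # On startup, decide whether the process is already up or needs starting.
  w.transition(:init, { true => :up, false => :start }) do |on|
    on.condition(:process_running) do |c|
      c.interval = interval # seconds between checks
      c.running = true
    end
  end

  # After a start or restart, wait until the process is seen running.
  w.transition([:start, :restart], :up) do |on|
    on.condition(:process_running) do |c|
      c.interval = interval
      c.running = true
    end

    # Failsafe: after max_tries checks without success, go back to :start.
    on.condition(:tries) do |c|
      c.interval = interval
      c.times = max_tries
      c.transition = :start
    end
  end

  # Once up, go back to :start when the process exits.
  w.transition(:up, :start) do |on|
    on.condition(:process_exits)
  end
end
```

Inside a God.watch block this would be called as configure_lifecycle(w) in place of w.keepalive. Note that the memory and CPU limits previously passed to keepalive would then need their own :up to :restart transition.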

danshultz avatar Jun 12 '13 18:06 danshultz

This full example helped in my case:

God.watch do |w|
  ...
  w.keepalive

  w.transition([:start, :restart], :up) do |on|
    on.condition(:process_running) do |c|
      c.interval = 5.seconds
      c.running = true
    end

    on.condition(:tries) do |c|
      c.interval = 5.seconds
      c.times = 5
      c.transition = :start
    end
  end
end
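
To illustrate why the tries condition unsticks the watch, here is a toy model in plain Ruby. It is not god's implementation. ToyWatch, tick and the "first launch fails, second succeeds" scenario are all assumptions made up for the example, loosely based on the recorder failing because the proxy was not ready yet.

```ruby
# Toy model of a watch waiting in :start. This is NOT god's real code;
# the class and method names are hypothetical.
class ToyWatch
  attr_reader :state, :launches

  # process_alive: callable taking the launch count, returning true/false.
  # max_tries: nil models a watch with no :tries condition.
  def initialize(process_alive:, max_tries: nil)
    @process_alive = process_alive
    @max_tries = max_tries
    @state = :start
    @launches = 1        # entering :start ran the start command once
    @failed_checks = 0
  end

  # One polling interval of the [:start, :restart] -> :up transition.
  def tick
    return unless @state == :start

    if @process_alive.call(@launches)
      @state = :up                      # process_running condition fired
    elsif @max_tries
      @failed_checks += 1
      if @failed_checks >= @max_tries   # tries condition fired
        @failed_checks = 0
        @launches += 1                  # back to :start, start command runs again
      end
    end
  end
end

# Assumed scenario: the first launch dies, later launches survive.
alive = ->(launches) { launches >= 2 }

stuck = ToyWatch.new(process_alive: alive)
fixed = ToyWatch.new(process_alive: alive, max_tries: 5)
10.times { stuck.tick; fixed.tick }

puts "without tries: #{stuck.state} after #{stuck.launches} launch(es)"
puts "with tries:    #{fixed.state} after #{fixed.launches} launch(es)"
```

In this model the watch without a tries condition keeps polling in start forever, while the one with it relaunches after five failed checks and then moves to up.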

astery avatar Jun 28 '19 09:06 astery