god does not receive ProcessExits event when running as a server.

Open ghedsouza opened this issue 9 years ago • 1 comment

I have a simple test with the following config file "simple.god":

God.watch do |w|
  w.name = "simple"
  w.start = "ping google.com"
  w.keepalive
  w.interval = 1.seconds
end

If I run god in non-daemonized mode:

god -c simple.god -D

and then kill the ping process:

ps aux | grep ping | awk '{print $2}' | xargs kill

the output shows that god detects the killed process and restarts it:

I [2016-05-25 19:41:42]  INFO: simple [trigger] process 3418 exited (ProcessExits)
I [2016-05-25 19:41:42]  INFO: simple move 'up' to 'start'
I [2016-05-25 19:41:42]  INFO: simple deregistered 'proc_exit' event for pid 3418
I [2016-05-25 19:41:42]  INFO: simple start: ping google.com
I [2016-05-25 19:41:42]  INFO: simple moved 'up' to 'start'
I [2016-05-25 19:41:42]  INFO: simple [trigger] process is running (ProcessRunning)

However, if I run god in daemonized mode and try the same command to kill all ping processes, god does not detect that the process has been killed:

I [2016-05-25 19:43:35]  INFO: simple move 'init' to 'start'
I [2016-05-25 19:43:35]  INFO: simple start: ping google.com
I [2016-05-25 19:43:35]  INFO: simple moved 'init' to 'start'
I [2016-05-25 19:43:35]  INFO: simple [trigger] process is running (ProcessRunning)
I [2016-05-25 19:43:35]  INFO: simple move 'start' to 'up'
I [2016-05-25 19:43:35]  INFO: simple registered 'proc_exit' event for pid 3473
I [2016-05-25 19:43:35]  INFO: simple moved 'start' to 'up'

<...Nothing more gets printed here...>

After killing all ping processes, god still thinks that "simple" is up.

I'm not sure what could be wrong since it works perfectly in non-daemonized mode.

Setup info:

god version: 0.13.7
OS: OS X 10.11.1

ghedsouza, May 25 '16 23:05

I got this workaround from a coworker. Put this block inside the main God.watch block:

  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.running = false
    end
  end

So the final working config is:

God.watch do |w|
  w.name = "simple"
  w.start = "ping google.com"
  w.keepalive
  w.interval = 1.seconds

  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.running = false
    end
  end
end

I have no idea why this is needed when w.keepalive is already set, but with it god restarts the killed process as expected.
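
For anyone comparing the two approaches: the daemonized log above shows god registering a 'proc_exit' event and then going quiet, while the workaround adds a :process_running condition, which presumably gets re-checked on each w.interval instead of waiting for an event. A poll-style liveness check of that kind can be sketched in plain Ruby as below. This is only an illustration under my own assumptions, not god's actual implementation; process_running? and wait_for_exit are hypothetical helper names.

```ruby
# Hypothetical helper (not part of god): true if a process with this pid exists.
def process_running?(pid)
  Process.kill(0, pid) # signal 0 only checks for existence; nothing is delivered
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but is owned by another user
end

# Hypothetical helper: poll until the process is gone or we run out of checks.
# Returns true if the exit was observed, false otherwise.
def wait_for_exit(pid, interval: 1, max_checks: 10)
  max_checks.times do
    return true unless process_running?(pid)
    sleep interval
  end
  false
end
```

Calling wait_for_exit(pid) after killing the ping process should return true within a few intervals. Note that a child that has exited but has not been reaped yet (a zombie) still counts as existing for this check, so a parent has to call Process.wait or Process.detach on its own children.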

ghedsouza, May 26 '16 17:05