god does not receive ProcessExits event when running as a server.
I have a simple test with the following config file "simple.god":
God.watch do |w|
  w.name = "simple"
  w.start = "ping google.com"
  w.keepalive
  w.interval = 1.seconds
end
If I run god in non-daemonized mode with god -c simple.god -D and then run ps aux | grep ping | awk '{print $2}' | xargs kill, I can see in the output that it detects the killed process and restarts it:
I [2016-05-25 19:41:42] INFO: simple [trigger] process 3418 exited (ProcessExits)
I [2016-05-25 19:41:42] INFO: simple move 'up' to 'start'
I [2016-05-25 19:41:42] INFO: simple deregistered 'proc_exit' event for pid 3418
I [2016-05-25 19:41:42] INFO: simple start: ping google.com
I [2016-05-25 19:41:42] INFO: simple moved 'up' to 'start'
I [2016-05-25 19:41:42] INFO: simple [trigger] process is running (ProcessRunning)
However, if I run god in daemonized mode and try the same command to kill all ping processes, god does not detect that the process has been killed:
I [2016-05-25 19:43:35] INFO: simple move 'init' to 'start'
I [2016-05-25 19:43:35] INFO: simple start: ping google.com
I [2016-05-25 19:43:35] INFO: simple moved 'init' to 'start'
I [2016-05-25 19:43:35] INFO: simple [trigger] process is running (ProcessRunning)
I [2016-05-25 19:43:35] INFO: simple move 'start' to 'up'
I [2016-05-25 19:43:35] INFO: simple registered 'proc_exit' event for pid 3473
I [2016-05-25 19:43:35] INFO: simple moved 'start' to 'up'
<...Nothing more gets printed here...>
After killing all ping processes, god still thinks that "simple" is up.
I'm not sure what could be wrong since it works perfectly in non-daemonized mode.
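To confirm from outside god that the watched pid is really gone (rather than trusting god's status), a signal-0 probe can be used. This is a minimal sketch, not god's own implementation: the helper name process_alive? is hypothetical, and the pid is whatever the log reported (3473 in the run above).

```ruby
# Hypothetical helper (not part of god): reports whether a pid currently exists.
# Signal 0 delivers nothing; the kernel only performs the existence and
# permission checks. Note that an unreaped zombie still counts as existing.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # the process exists but belongs to another user
end

# Probe the pid given on the command line, or this script's own pid if none is given.
pid = (ARGV[0] || Process.pid).to_i
puts "pid #{pid} alive? #{process_alive?(pid)}"
```

Running it with the pid from the "registered 'proc_exit' event" log line, after the kill command, shows whether the process is actually dead while god still reports the watch as up.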
Setup info:
god version: 0.13.7
OS: OS X 10.11.1
A colleague showed me a workaround. Put this block inside the main watch block:
w.start_if do |start|
  start.condition(:process_running) do |c|
    c.running = false
  end
end
So the final working config is:
God.watch do |w|
  w.name = "simple"
  w.start = "ping google.com"
  w.keepalive
  w.interval = 1.seconds
  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.running = false
    end
  end
end
I have no idea why this is needed when w.keepalive is already set, but with it god works as expected.
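Judging only from the logs above, the restart in non-daemonized mode is triggered by the proc_exit event (ProcessExits), whereas the start_if block adds a polled process_running check. Presumably the poll notices the dead process even when the event never arrives; this is an inference from the logs, not something confirmed against the god source. The sketch below is the same workaround with an explicit interval on the condition, following the form shown in god's documentation. The 5.seconds value is an arbitrary choice for illustration.

```ruby
God.watch do |w|
  w.name = "simple"
  w.start = "ping google.com"
  w.keepalive
  w.interval = 1.seconds

  # Polling fallback: start the process whenever the check finds it not running.
  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.interval = 5.seconds # assumed value; poll every 5 seconds
      c.running = false      # condition fires when the process is NOT running
    end
  end
end
```

Setting the interval on the condition keeps the polling rate independent of the watch-level w.interval.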