watchd
watchd copied to clipboard
watchd is a process watcher / monitor, so watchd is a process watch dog to keep the watched processes alive.
watchd is a process watch dog to keep the watched processes alive.
the sample config file please see src/sample.conf.
config fragment in src/sample.conf:
one section for a group sub processes
you can config more than one sections
[test1]
the type: daemon or cron (for crontab)
default: daemon
type = daemon
the mode: all or failover
all: run all subprocess_command
failover: run the first command,
run the second command when the first exit,
run the first command again when the last exit,
and so on.
default: all
mode = failover
if run by shell as: sh -c command
true or 1: run with sh -c command
false or 0: run command directly
auto: auto detect if need run by sh -c
default auto
run_by_sh = auto
force restart interval in seconds
this parameter only for daemon
default 0 for never force restart
force_restart_interval = 86400
set environment variable
can ocur more than once
set_env = OMP_NUM_THREADS=4
the command line of sub process
can be a simple command line or a shell command as following four formats:
1. (command)
2. command > output_filename or command >> output_filename
3. command1 | command2 ...
4. command &
the shell command line while be exec as: sh -c command_line
subprocess_command = /bin/echo OMP_NUM_THREADS=$OMP_NUM_THREADS $host_index >> /tmp/echo.log
check sub process alive interval in seconds
0 for never check
check_alive_interval = 10
retry threshold of check alive
kill the sub process when the check fail count exceeds this parameter
default: 3
check_alive_retry_threshold = 3
check_alive_command can be a command or a library whose filename ends with .so
the check command output OK for check passed, others for fail
the library must export c function:
int check_alive(int argc, char **argv);
argv[0] is the library filename, argv[1] is the first parameter and so on.
return 0 for success, != 0 for fail.
for example:
#@function REPLACE_VARS
check_alive_command = /usr/local/lib/libdfscheckalive.so %{encoder_port} 2 30
#@function REPLACE_VARS check_alive_command = echo OK
[test2] type = cron subprocess_command = ls -l / >> /tmp/ls.log
the time base to schedule
this parameter only for cron
time_base = 00:00
repeat interval in seconds
this parameter only for cron
repeat_interval = 60