
Question RE: Failure to remediate

Open phobos182 opened this issue 12 years ago • 1 comments

During a failure condition, what is Watchdog's stance on failure to remediate? Let's say I have an Incident attached to a service. It trips the threshold, and for some reason the code cannot fix the underlying condition (the service fails to start, etc.). Currently Watchdog just keeps trying forever. Is this the intended behavior, or are there any plans for a maximum number of attempts / give up / bail option on this monitor?
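To make the "maximum attempt / give up" idea concrete, here is a rough sketch of the behavior I have in mind. This is not Watchdog's actual API: `BoundedMonitor`, `check`, `remediate`, `max_attempts`, and `poll` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BoundedMonitor:
    """Hypothetical monitor that gives up after max_attempts failed fixes."""

    check: Callable[[], bool]      # returns True when the service is healthy
    remediate: Callable[[], None]  # tries to fix the underlying condition
    max_attempts: int = 3          # assumed default, not a Watchdog setting
    failed_attempts: int = 0
    monitored: bool = True

    def poll(self) -> str:
        """Run one monitoring cycle and report what happened."""
        if not self.monitored:
            return "unmonitored"
        if self.check():
            self.failed_attempts = 0  # healthy again, reset the counter
            return "healthy"
        self.remediate()
        if self.check():
            self.failed_attempts = 0  # the fix worked
            return "remediated"
        self.failed_attempts += 1
        if self.failed_attempts >= self.max_attempts:
            self.monitored = False  # bail out instead of retrying forever
            return "gave_up"
        return "retrying"
```

With a service that never recovers, three calls to `poll()` would each attempt remediation, and the third would flip `monitored` to `False` so later cycles skip the service entirely.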

phobos182, Oct 22 '12 22:10

I'm testing some DSL extensions to 'unmonitor' a service. Examples are here: https://github.com/phobos182/watchdog-examples/blob/master/watchdog.py

phobos182, Oct 23 '12 14:10