
Gracefully stop/restart/reload icinga2

Open manfredw opened this issue 3 years ago • 0 comments

Is your feature request related to a problem? Please describe.

Sometimes it is necessary to stop or restart/reload icinga2 (configuration changes, OS updates, crashes caused by communication problems between icinga2 nodes, ...). These actions are usually triggered by Director or on the CLI.

This kills all currently running processes with signal 15 (SIGTERM): checks, notifications, and icinga2 itself. It seems that killed checks and notifications are marked as timed out and that this state is stored persistently (internally and in IDO?).

After a restart, all these abruptly interrupted checks are shown as UNKNOWN, and you have to wait for the next regularly scheduled check to get a "real" status. This sometimes leads to confusing states and dashboards with hundreds of apparent problems.

Describe the solution you'd like

Implement a graceful shutdown of running check and notification scripts: wait a (configurable) time for them to complete, and prevent new scripts from starting by disabling the scheduler.
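
To illustrate the requested behaviour, here is a minimal Python sketch. It is not how icinga2 is implemented (icinga2 is written in C++), and `CheckRunner`, `grace_seconds`, `start` and `shutdown` are hypothetical names made up for this example. The idea is only: stop accepting new scripts, give running ones a shared grace period, and send SIGTERM only to those still running after it expires.

```python
import subprocess
import time


class CheckRunner:
    """Hypothetical runner for check scripts; not icinga2's real scheduler."""

    def __init__(self, grace_seconds=30.0):
        self.grace_seconds = grace_seconds  # assumed configurable wait time
        self.accepting = True               # False once shutdown has begun
        self.running = []                   # processes started by this runner

    def start(self, argv):
        """Start a script unless shutdown has begun; return the process or None."""
        if not self.accepting:
            return None
        proc = subprocess.Popen(argv)
        self.running.append(proc)
        return proc

    def shutdown(self):
        """Stop scheduling, wait up to grace_seconds, then terminate leftovers.

        Returns a (completed, terminated) pair of process lists.
        """
        self.accepting = False
        deadline = time.monotonic() + self.grace_seconds
        completed, terminated = [], []
        for proc in self.running:
            # All processes share one deadline, so the total wait is bounded.
            remaining = max(0.0, deadline - time.monotonic())
            try:
                proc.wait(timeout=remaining)
                completed.append(proc)
            except subprocess.TimeoutExpired:
                proc.terminate()  # SIGTERM on POSIX
                proc.wait()
                terminated.append(proc)
        self.running = []
        return completed, terminated
```

With something like this, only the scripts in `terminated` would have been interrupted; everything in `completed` finished normally and has a real result.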

Describe alternatives you've considered

Do not store state information for scripts that were forcibly killed during the shutdown process.
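
As a rough sketch of this alternative (again in Python, with hypothetical names; icinga2's internals will differ), the decision could be a simple filter applied before a result is persisted. It assumes the convention of Python's `subprocess` module on POSIX, where a return code of `-N` means the process was killed by signal `N`.

```python
import signal


def should_persist(returncode, shutting_down):
    """Return False for results of scripts killed by SIGTERM during shutdown.

    `returncode` follows Python's subprocess convention on POSIX:
    a negative value -N means the process was killed by signal N.
    """
    killed_by_sigterm = returncode == -signal.SIGTERM
    return not (shutting_down and killed_by_sigterm)
```

A check that was killed by SIGTERM outside of a shutdown would still be stored, so genuine timeouts and kills remain visible.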

Additional context

Medium to large deployment with (redundant) satellites, systems running latest releases on Linux OS.

manfredw, Mar 21 '22 12:03