icinga2
(scheduled) Downtimes - Notifications are not suppressed
Describe the bug
Notifications are not suppressed during (scheduled) Downtimes.
To Reproduce
- Create a scheduled Downtime
- Wait for a State change
- A notification is sent
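For anyone trying to reproduce this, one possible way to create the downtime is the Icinga 2 API action `/v1/actions/schedule-downtime`. The report does not say how the downtime was created here, so the following is only a sketch of a request body under that assumption; the function name and the example host and service names are hypothetical.

```python
import json
import time


def build_downtime_request(host, service, duration_s, author, comment, now=None):
    """Build a JSON body for a fixed service downtime (sketch, not the reporter's setup)."""
    start = int(now if now is not None else time.time())
    return {
        "type": "Service",
        # filter_vars keeps the object names out of the filter expression itself
        "filter": "host.name==host_name && service.name==service_name",
        "filter_vars": {"host_name": host, "service_name": service},
        "author": author,
        "comment": comment,
        "start_time": start,
        "end_time": start + duration_s,
        "fixed": True,
    }


if __name__ == "__main__":
    # Hypothetical names; the body would be POSTed to /v1/actions/schedule-downtime.
    body = build_downtime_request("example-host", "example-service", 3600,
                                  "icingaadmin", "reproduce downtime issue")
    print(json.dumps(body, indent=2))
```

With the downtime active, forcing a state change on that service (remember max_check_attempts is 1 here) should show whether the Problem notification is suppressed.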
Expected behavior
Notification will be suppressed.
Screenshots
Your Environment
Include as many relevant details about the environment you experienced the problem in
- Version used (icinga2 --version): r2.13.6-1
- Operating System and version: CentOS 7.9.2009
- Enabled features (icinga2 feature list): api checker ido-mysql influxdb mainlog notification
- Icinga Web 2 version and modules (System - About): 2.11.3
- Config validation (icinga2 daemon -C): completes without errors.
- If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes: HA Master + Agents (Details can not be shared publicly)
Additional context
Maybe helpful: max_check_attempts is set to 1 for this check.
If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes: HA Master + Agents (Details can not be shared publicly)
Can you please share at least some more details on the structure, in particular:
- Are the agents connected directly to the masters?
- Is the check executed using command_endpoint?
Yes, the agents are directly connected to the masters. We don't have any satellites in this environment.
Yes, the check is executed using command_endpoint.
Any news or ideas what's happening here?
This might be unrelated to this issue, but we are seeing issues with notifications being sent when a host should be in downtime after a config change is made. If a host or service problem is acknowledged, or put into scheduled downtime, and then a configuration change is made via Icinga Director, those acks and downtimes are purged - this occurs regardless of what zone the Director change is made in.
For example: I acknowledge a host problem for host-a in zone-a, create a scheduled downtime for host-b in zone-b, and then push a Director config change for host-c in zone-c. The acknowledgement for host-a and the downtime for host-b are removed, although both still appear in the history.
Additionally, when the Icinga master reloads config after a Director deployment, we are seeing a race condition that causes hosts to send down notifications, and a few minutes later, enter downtime:
Icinga Web 2 version: 2.11.4
Icinga 2 version: r2.13.7-1
Is there a possibility to get feedback on this topic?
My first guess would be that there could be some inconsistency between both masters. While inside the downtime, you can request https://localhost:5665/v1/objects/services/affected-host-name!affected-service-name from both masters and compare what you get. downtime_depth would be of particular interest, as this shows whether both masters agree that the service is in a downtime.
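The comparison suggested above can be scripted. This is a minimal sketch, assuming an ApiUser with permission to query service objects and two reachable masters on port 5665. The function names, master hostnames, and credentials are placeholders, not taken from this setup.

```python
import base64
import json
import ssl
import urllib.parse
import urllib.request


def service_url(master, host, service):
    """URL of the service object on one master, limited to downtime_depth."""
    name = urllib.parse.quote(f"{host}!{service}", safe="!")
    return f"https://{master}:5665/v1/objects/services/{name}?attrs=downtime_depth"


def downtime_depth(response):
    """Extract downtime_depth from a parsed /v1/objects/services response."""
    results = response.get("results", [])
    if len(results) != 1:
        raise ValueError(f"expected exactly one result, got {len(results)}")
    return int(results[0]["attrs"]["downtime_depth"])


def fetch(master, host, service, user, password, cafile=None):
    """Query one master; cafile should point at the Icinga CA certificate."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        service_url(master, host, service),
        headers={"Accept": "application/json", "Authorization": f"Basic {token}"},
    )
    context = ssl.create_default_context(cafile=cafile)
    with urllib.request.urlopen(request, context=context, timeout=10) as reply:
        return json.load(reply)


def masters_agree(depths):
    """True if every master reports the same downtime_depth."""
    return len(set(depths.values())) == 1


# Example use (placeholder names, run while the downtime is active):
# depths = {m: downtime_depth(fetch(m, "affected-host-name", "affected-service-name",
#                                   "apiuser", "secret", cafile="/path/to/ca.crt"))
#           for m in ("master1.example.org", "master2.example.org")}
# print(depths, "agree" if masters_agree(depths) else "DISAGREE")
```

A value of 0 on one master and 1 on the other during the downtime would point to the inconsistency described above.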
Thanks for your answer. Both masters are in sync (downtime_depth is 1 if a service is in downtime). I have also already cleaned /var/lib/icinga2/api/zones/ several times on the second master to get a fresh sync from the config master.
Something like what @0xliam describes happens in our setup from time to time as well.
The config deployment by the Director was triggered at 18:00.
- At 18:01:45, configs were synced between the masters (with the config master ignoring the updates from the secondary master) and into the zones. That was all finished at 18:02:10.
- Between 18:02:10 and 18:02:40, multiple downtimes were created, and those successfully suppressed notifications.
- At 18:02:41, a whole lot of "Syncing configuration files for xyz to " messages (re)appear in the log, only for the masters, without a config deployment being triggered.
- The downtime from the screenshot was entered at 18:02:43.
Log line for the host that was notified during the supposed downtime
### DOWNTIME ENTERED VIA API ###
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!memory-toplist!13960866-f66f-4804-b527-625b31b85818' for checkable 'xyz-p1-ts2004!memory-toplist'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!memory-toplist!13960866-f66f-4804-b527-625b31b85818' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!memory_free_VD' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!memory_free_VD!bfa5736a-5b2d-4061-8c2b-6ab3290d508c' for checkable 'xyz-p1-ts2004!memory_free_VD'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!memory_free_VD!bfa5736a-5b2d-4061-8c2b-6ab3290d508c' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!pending_updates' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!pending_updates!849fa58a-fbf2-43cf-97c7-cd6cd5c83a5a' for checkable 'xyz-p1-ts2004!pending_updates'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!pending_updates!849fa58a-fbf2-43cf-97c7-cd6cd5c83a5a' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!pending_updates_security-only' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!pending_updates_security-only!8f643388-a8e4-40dc-80f1-c57c97fdbce3' for checkable 'xyz-p1-ts2004!pending_updates_security-only'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!pending_updates_security-only!8f643388-a8e4-40dc-80f1-c57c97fdbce3' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!rdp-x224-status' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!rdp-x224-status!024afb09-1141-47a4-8398-de52c761102d' for checkable 'xyz-p1-ts2004!rdp-x224-status'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!rdp-x224-status!024afb09-1141-47a4-8398-de52c761102d' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!sentinelone-agent-status' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!sentinelone-agent-status!d1fd675e-8976-41cd-855e-14d557cf7cd6' for checkable 'xyz-p1-ts2004!sentinelone-agent-status'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!sentinelone-agent-status!d1fd675e-8976-41cd-855e-14d557cf7cd6' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!sentinelone_application_security!8ed4fb6c-989c-4040-8c72-d0e30f2e73a6' for checkable 'xyz-p1-ts2004!sentinelone_application_security'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!sentinelone_application_security!8ed4fb6c-989c-4040-8c72-d0e30f2e73a6' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!sentinelone_threats' has 2 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!sentinelone_threats!f3f5f9fe-2b39-41d1-8dbc-43601c96fba0' for checkable 'xyz-p1-ts2004!sentinelone_threats'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!sentinelone_threats!f3f5f9fe-2b39-41d1-8dbc-43601c96fba0' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-dcomlaunch' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-dcomlaunch!3d0ef4ab-a08d-4b44-961a-c6b512c13bd8' for checkable 'xyz-p1-ts2004!service-dcomlaunch'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-dcomlaunch!3d0ef4ab-a08d-4b44-961a-c6b512c13bd8' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-eventlog' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-eventlog!06e304df-196f-479f-8f1e-7a577bb46b01' for checkable 'xyz-p1-ts2004!service-eventlog'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-eventlog!06e304df-196f-479f-8f1e-7a577bb46b01' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-frxsvc' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-frxsvc!a321407e-ef41-43fb-a890-16e1d46d6a0f' for checkable 'xyz-p1-ts2004!service-frxsvc'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-frxsvc!a321407e-ef41-43fb-a890-16e1d46d6a0f' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-gpsvc' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-gpsvc!3dd5276b-a26c-43c0-96ea-fd3149409c6a' for checkable 'xyz-p1-ts2004!service-gpsvc'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-gpsvc!3dd5276b-a26c-43c0-96ea-fd3149409c6a' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-lanmanserver' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-lanmanserver!c0edc4d2-5989-4378-8e2f-d2c81d63ba84' for checkable 'xyz-p1-ts2004!service-lanmanserver'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-lanmanserver!c0edc4d2-5989-4378-8e2f-d2c81d63ba84' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-lanmanworkstation' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-lanmanworkstation!ba0dccab-c0a3-48ca-9bca-b4fe80d63b6d' for checkable 'xyz-p1-ts2004!service-lanmanworkstation'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-lanmanworkstation!ba0dccab-c0a3-48ca-9bca-b4fe80d63b6d' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-logprocessorservice' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-logprocessorservice!7f79485e-4410-4780-b444-67d4a1ed4200' for checkable 'xyz-p1-ts2004!service-logprocessorservice'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-logprocessorservice!7f79485e-4410-4780-b444-67d4a1ed4200' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-mpssvc' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-mpssvc!ffaf3b73-dda1-440c-93fb-8296341594a6' for checkable 'xyz-p1-ts2004!service-mpssvc'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-mpssvc!ffaf3b73-dda1-440c-93fb-8296341594a6' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-rdagentbootloader' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-rdagentbootloader!87c04bc6-c807-4bdd-ab85-e07d2878d1dc' for checkable 'xyz-p1-ts2004!service-rdagentbootloader'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-rdagentbootloader!87c04bc6-c807-4bdd-ab85-e07d2878d1dc' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-rpcss' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-rpcss!f14e490e-9a7e-4270-932b-d8fd7df81396' for checkable 'xyz-p1-ts2004!service-rpcss'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-rpcss!f14e490e-9a7e-4270-932b-d8fd7df81396' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-schedule' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-schedule!ca162fec-1260-43de-a524-cc17e2d2d869' for checkable 'xyz-p1-ts2004!service-schedule'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-schedule!ca162fec-1260-43de-a524-cc17e2d2d869' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-sentinelagent' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-sentinelagent!9fd29be4-2d24-41b1-83e0-3bf3063ecad4' for checkable 'xyz-p1-ts2004!service-sentinelagent'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-sentinelagent!9fd29be4-2d24-41b1-83e0-3bf3063ecad4' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-sentinelstaticengine' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-sentinelstaticengine!70b79481-44e0-4e8c-85f0-dd49b5c4de3c' for checkable 'xyz-p1-ts2004!service-sentinelstaticengine'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-sentinelstaticengine!70b79481-44e0-4e8c-85f0-dd49b5c4de3c' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-winmgmt' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-winmgmt!3b77eab9-b3f0-46f6-b5fd-462848b0c4ce' for checkable 'xyz-p1-ts2004!service-winmgmt'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-winmgmt!3b77eab9-b3f0-46f6-b5fd-462848b0c4ce' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!service-winrm' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!service-winrm!fc171c9b-2109-401c-a196-51844220b4f8' for checkable 'xyz-p1-ts2004!service-winrm'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!service-winrm!fc171c9b-2109-401c-a196-51844220b4f8' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!software_inventory!3334eae7-e6a7-4186-8345-890e5a069cc5' for checkable 'xyz-p1-ts2004!software_inventory'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!software_inventory!3334eae7-e6a7-4186-8345-890e5a069cc5' of type 'Downtime'.
[2024-08-07 18:02:48 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!userprofile-containers' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:02:48 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!userprofile-containers!422d1ce1-620f-45fd-b868-b996867d2610' for checkable 'xyz-p1-ts2004!userprofile-containers'.
[2024-08-07 18:02:48 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!userprofile-containers!422d1ce1-620f-45fd-b868-b996867d2610' of type 'Downtime'.
### NOTIFICATION WAS SENT ###
[2024-08-07 18:08:23 +0200] information/Checkable: Checkable 'xyz-p1-ts2004' has 1 notification(s). Checking filters for type 'Problem', sends will be logged.
[2024-08-07 18:08:25 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!90d985c3-0bc3-40f4-8cfc-00fbb741fdbc' of type 'Comment'.
[2024-08-07 18:08:25 +0200] information/Checkable: Acknowledgement set for checkable 'xyz-p1-ts2004'.
### DOWNTIME WAS STARTED WITH A DELAY of ~ 50 minutes ###
[2024-08-07 18:51:11 +0200] information/Checkable: Checkable 'xyz-p1-ts2004' has 1 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!ef6293dc-916f-4db8-8e4b-aff0c37744ef' for checkable 'xyz-p1-ts2004'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!ef6293dc-916f-4db8-8e4b-aff0c37744ef' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!cpu!b98242bc-a9ec-4561-ac64-3292c0779221' for checkable 'xyz-p1-ts2004!cpu'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!cpu!b98242bc-a9ec-4561-ac64-3292c0779221' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!cpu-toplist!9efb9f31-2a70-4b6f-a454-92ba04977230' for checkable 'xyz-p1-ts2004!cpu-toplist'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!cpu-toplist!9efb9f31-2a70-4b6f-a454-92ba04977230' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Checkable: Checkable 'xyz-p1-ts2004!disk' has 2 notification(s). Checking filters for type 'DowntimeStart', sends will be logged.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!disk!1619bb9b-96ed-432b-89f5-29774260bfd0' for checkable 'xyz-p1-ts2004!disk'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!disk!1619bb9b-96ed-432b-89f5-29774260bfd0' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!icinga-agent-parent-service!e646196c-02dd-4ca6-a76c-28c87e3aa1cc' for checkable 'xyz-p1-ts2004!icinga-agent-parent-service'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!icinga-agent-parent-service!e646196c-02dd-4ca6-a76c-28c87e3aa1cc' of type 'Downtime'.
[2024-08-07 18:51:11 +0200] information/Downtime: Triggering downtime 'xyz-p1-ts2004!icinga-agent-version!bd76f856-586e-47ab-9d9e-880893090628' for checkable 'xyz-p1-ts2004!icinga-agent-version'.
[2024-08-07 18:51:11 +0200] information/ConfigObjectUtility: Created and activated object 'xyz-p1-ts2004!icinga-agent-version!bd76f856-586e-47ab-9d9e-880893090628' of type 'Downtime'.
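To measure the gap in an excerpt like the one above, a small parser can pair the 'Problem' filter check for a checkable with the next "Triggering downtime" line for the same checkable. This is only a sketch: it assumes the log format shown in this thread, and it treats the "Checking filters for type 'Problem'" line as the notification marker, as the excerpt does. The function names are made up for illustration.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})\] "
    r"\w+/\w+: (?P<msg>.*)$"
)
NOTIFY_RE = re.compile(
    r"Checkable '(?P<name>[^']+)' has \d+ notification\(s\)\. "
    r"Checking filters for type '(?P<type>\w+)'"
)
TRIGGER_RE = re.compile(r"Triggering downtime '[^']+' for checkable '(?P<name>[^']+)'")


def parse_events(lines):
    """Return (timestamp, checkable, kind) tuples for notify and trigger lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip markers such as '### NOTIFICATION WAS SENT ###'
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S %z")
        notify = NOTIFY_RE.search(match["msg"])
        if notify:
            events.append((ts, notify["name"], "notify:" + notify["type"]))
            continue
        trigger = TRIGGER_RE.search(match["msg"])
        if trigger:
            events.append((ts, trigger["name"], "downtime_triggered"))
    return events


def problem_to_downtime_delay(events, checkable):
    """Time between the first Problem check and the next downtime trigger, or None."""
    problem_ts = None
    for ts, name, kind in sorted(events, key=lambda e: e[0]):
        if name != checkable:
            continue
        if kind == "notify:Problem" and problem_ts is None:
            problem_ts = ts
        elif kind == "downtime_triggered" and problem_ts is not None:
            return ts - problem_ts
    return None
```

Feeding it the icinga2.log lines of one host, for example `problem_to_downtime_delay(parse_events(open(path)), "xyz-p1-ts2004")`, returns a `timedelta`, or `None` if no downtime trigger follows a Problem check.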
More occurrences of this.
"Light mode" is a downtime set via API on host shutdown.
"Dark mode" is a scheduled downtime.
Debug logs can be provided if helpful!