Notifications sent after a huge delay
Describe the bug
I am having a hard time describing this, since I am not sure what exactly is happening. What we have gathered so far is that notifications are sometimes sent with a massive delay.
Take this example:
As you can see, the service recovered on 07.11 at 15:17. The service history shows no changes until 08.11 at 09:28, at which point the notification was sent:
We have no idea what causes this delay. When we trigger a forced notification or manually submit check results to make the service critical, the notification is sent immediately (at least when we tried this).
We have probably been seeing this for a while and suspect that upgrading to Icinga 2.14 might have caused it.
To Reproduce
We are sadly not able to reproduce this.
Expected behavior
Notifications should be sent immediately.
Screenshots
See above
Your Environment
- Version used (icinga2 --version): r2.14.0-1
- Operating System and version: CentOS 7
- Enabled features (icinga2 feature list): api checker graphite ido-pgsql mainlog notification
- Icinga Web 2 version and modules (System - About): 2.12.0
- Config validation (icinga2 daemon -C):
[2023-11-08 11:30:52 +0100] information/cli: Icinga application loader (version: r2.14.0-1)
[2023-11-08 11:30:52 +0100] information/cli: Loading configuration file(s).
[2023-11-08 11:30:52 +0100] information/ConfigItem: Committing config item(s).
[2023-11-08 11:30:52 +0100] information/ApiListener: My API identity: mgmt1
[2023-11-08 11:30:52 +0100] warning/ApplyRule: Apply rule '' (in /etc/icinga2/zones.d/master/services.conf: 26:1-26:65) for type 'Service' does not match anywhere!
[2023-11-08 11:30:52 +0100] warning/ApplyRule: Apply rule 'backup-downtime' (in /etc/icinga2/zones.d/global-templates/downtimes.conf: 5:1-5:52) for type 'ScheduledDowntime' does not match anywhere!
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 IdoPgsqlConnection.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 9 Users.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 UserGroup.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 7 ServiceGroups.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 49 Zones.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 939 Services.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 5 NotificationCommands.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 2005 Notifications.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 60 Hosts.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 5 HostGroups.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 47 Endpoints.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 92 Dependencies.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 2 ApiUsers.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 253 CheckCommands.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2023-11-08 11:30:52 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2023-11-08 11:30:52 +0100] information/cli: Finished validating the configuration file(s).
- Direct connection between Icinga agents and a single master.
Additional context
Hello Florian!
You have 3 TimePeriods and 92 Dependencies. Are any TimePeriods configured on notifications of the checkables in question? Also, Dependency#disable_notifications defaults to true. Does that checkable have any dependencies?
Best, A/K
Hi @Al2Klimov,
thank you for your reply. Nope, none of the time periods are configured on the service and none of the dependencies match here. The service in question runs with a check interval of 100m -- could that have any effect?
Thanks, Florian
Then, I'm afraid, the best you can do is enable the debug log and provide the parts covering, e.g., the moment the service recovers and the moment the delayed notification is sent.
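Since a debug log covering many hours can get large, here is a rough sketch of how the relevant part could be cut out before attaching it. It assumes the debug log has been enabled with `icinga2 feature enable debuglog` followed by a restart, and that its lines start with the same bracketed timestamp as the config validation output quoted above. The helper names `parse_timestamp` and `filter_log` are made up for this illustration and are not part of Icinga.

```python
from datetime import datetime

# Timestamp layout as seen in the log lines quoted in this issue,
# e.g. "[2023-11-08 11:30:52 +0100] information/cli: ..."
TS_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def parse_timestamp(line):
    """Return the timestamp of a log line, or None if it has none."""
    if not line.startswith("["):
        return None
    end = line.find("]")
    if end == -1:
        return None
    try:
        return datetime.strptime(line[1:end], TS_FORMAT)
    except ValueError:
        # Bracketed prefix that is not a timestamp in the assumed layout.
        return None


def filter_log(lines, start, end, needle):
    """Yield lines timestamped within [start, end] that contain needle.

    Lines without a parsable timestamp (e.g. continuation lines)
    are skipped in this simplified version.
    """
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None and start <= ts <= end and needle in line:
            yield line.rstrip("\n")
```

To use it, open the debug log (by default `/var/log/icinga2/debug.log`), build `start` and `end` with `datetime.strptime("2023-11-07 15:00:00 +0100", TS_FORMAT)` and similar, and pass the name of the affected service as `needle`. The window should include both the recovery and the delayed notification.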