icinga2

Notifications sent after a huge delay

Open apollo13 opened this issue 2 years ago • 3 comments

Describe the bug

I am having a hard time describing it, since I am not sure what exactly is happening. What we have gathered so far is that notifications are sometimes sent with a massive delay.

Take this example: [image] As you can see, the service recovered on 07.11 at 15:17. The service history shows no changes until 08.11 at 09:28, at which point the notification is sent: [image] We have no idea what causes this delay. When we trigger a forced notification or manually submit check results to make the service critical, the notification is sent immediately (at least when we tried this).
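
For reference, the size of the delay in this example can be worked out from the two timestamps. This is a minimal sketch, assuming GNU `date` (as shipped with CentOS 7), assuming the year 2023 (inferred from the date of this report), and assuming both timestamps are in the same time zone:

```shell
# Timestamps from the example above; the year is an assumption.
recovered="2023-11-07 15:17"
notified="2023-11-08 09:28"

# Interpret both in the same zone (UTC here) so only the difference matters.
start=$(date -u -d "$recovered" +%s)
end=$(date -u -d "$notified" +%s)

# Delay in whole minutes, also printed as hours and minutes.
delay_min=$(( (end - start) / 60 ))
printf 'delay: %d min (%dh %dm)\n' "$delay_min" $((delay_min / 60)) $((delay_min % 60))
```

Under these assumptions the notification arrived a little over 18 hours after the recovery.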

We have probably been seeing this for a while and suspect that upgrading to Icinga 2.14 might have caused it.

To Reproduce

We are sadly not able to reproduce this.

Expected behavior

Notifications should be sent immediately.

Screenshots

See above

Your Environment

Include as many relevant details about the environment you experienced the problem in

  • Version used (icinga2 --version): r2.14.0-1
  • Operating System and version: CentOS 7
  • Enabled features (icinga2 feature list): api checker graphite ido-pgsql mainlog notification
  • Icinga Web 2 version and modules (System - About): 2.12.0
  • Config validation (icinga2 daemon -C):
[2023-11-08 11:30:52 +0100] information/cli: Icinga application loader (version: r2.14.0-1)
[2023-11-08 11:30:52 +0100] information/cli: Loading configuration file(s).
[2023-11-08 11:30:52 +0100] information/ConfigItem: Committing config item(s).
[2023-11-08 11:30:52 +0100] information/ApiListener: My API identity: mgmt1
[2023-11-08 11:30:52 +0100] warning/ApplyRule: Apply rule '' (in /etc/icinga2/zones.d/master/services.conf: 26:1-26:65) for type 'Service' does not match anywhere!
[2023-11-08 11:30:52 +0100] warning/ApplyRule: Apply rule 'backup-downtime' (in /etc/icinga2/zones.d/global-templates/downtimes.conf: 5:1-5:52) for type 'ScheduledDowntime' does not match anywhere!
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 IdoPgsqlConnection.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 9 Users.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 UserGroup.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 7 ServiceGroups.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 49 Zones.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 939 Services.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 5 NotificationCommands.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 2005 Notifications.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 60 Hosts.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 5 HostGroups.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 47 Endpoints.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 92 Dependencies.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 2 ApiUsers.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 253 CheckCommands.
[2023-11-08 11:30:52 +0100] information/ConfigItem: Instantiated 1 GraphiteWriter.
[2023-11-08 11:30:52 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2023-11-08 11:30:52 +0100] information/cli: Finished validating the configuration file(s).
  • Direct connection between Icinga agents and a single master.

apollo13 Nov 08 '23 10:11

Hello Florian!

You have 3 TimePeriods and 92 Dependencies. Are any TimePeriods configured on notifications of the checkables in question? Also, Dependency#disable_notifications defaults to true. Does that checkable have any dependencies?
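
One way to check both points on the master is to dump the compiled objects with `icinga2 object list`. The following is a sketch only: the function name and the service-name argument are hypothetical, and the `*!<service>!*` pattern assumes the usual `host!service!notification` naming of Notification objects.

```shell
# Hypothetical helper: show Notification objects for one service and all Dependencies.
# Usage: show_notification_config "my-service"   (the service name is a placeholder)
show_notification_config() {
    service="$1"

    # Notification objects are named host!service!notification; look for a
    # "period" attribute in the output to see whether a TimePeriod applies.
    icinga2 object list --type Notification --name "*!${service}!*"

    # Dependencies: check parent/child names and the disable_notifications attribute.
    icinga2 object list --type Dependency
}
```

The function only prints configuration; it has to run on a node where the configuration has been validated or loaded, since `icinga2 object list` reads the object dump written at that point.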

Best, A/K

Al2Klimov Apr 09 '24 14:04

Hi @Al2Klimov,

thank you for your reply. Nope, none of the time periods are configured on the service and none of the dependencies match here. The service in question runs with a check interval of 100m -- could that have any effect?

Thanks, Florian

apollo13 Apr 09 '24 14:04

Then, I'm afraid, the best you can do is enable the debug log and provide it for the period covering, e.g., the moment the service recovers and the moment the delayed notification is sent.
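
Enabling the debug log could look roughly like this. It is a sketch, assuming a systemd-managed installation and the default log location `/var/log/icinga2/debug.log`; the function names and the service-name argument are hypothetical.

```shell
# Hypothetical helper: turn on the debuglog feature and restart Icinga 2 (run as root).
enable_debuglog() {
    icinga2 feature enable debuglog
    systemctl restart icinga2
}

# Hypothetical helper: pull the debug log lines that mention one service.
# Usage: debuglog_for_service "my-service"   (the service name is a placeholder)
debuglog_for_service() {
    grep -F -- "$1" /var/log/icinga2/debug.log
}
```

The debug log grows quickly, so it is worth disabling the feature again with `icinga2 feature disable debuglog` and a restart once a delayed notification has been captured.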

Al2Klimov May 14 '24 14:05