naemon-core

enable_flap_detection=0 seems to be ignored

Open lausser opened this issue 2 years ago • 6 comments

I have a Naemon 1.4.1 (OMD 5.21.20230729-labs-edition) where I disabled flap detection globally. But there is one service which is still shown as flapping, also in the logfile: [1690795047] SERVICE NOTIFICATION SUPPRESSED: snclient;os_windows_svc_check_Server;Notification blocked because the object is currently flapping.

In the retention data one can clearly see the two different settings:

info {
created=1690813918
version=1.4.1
}
program {
modified_host_attributes=0
modified_service_attributes=0
...
enable_flap_detection=0
...
}

service {
host_name=snclient
service_description=os_windows_svc_check_Server
modified_attributes=2
...
is_flapping=1
percent_state_change=0.00
...

Thruk also shows the flapping symbol. Shouldn't everything be 0 when enable_flap_detection is disabled in naemon.cfg?
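To spot this mismatch in a retention file without reading it by hand, a small script can help. The following is only a minimal sketch: it assumes the file consists of `name {` ... `}` blocks with `key=value` lines, as in the excerpt above, and the function names `parse_retention` and `stale_flapping` are made up for illustration. The sample text is an abbreviated copy of the excerpt, with the closing brace of the service block added so the sketch can parse it.

```python
def parse_retention(text):
    """Parse 'name { key=value ... }' blocks into (name, dict) pairs."""
    blocks = []
    name, current = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            # Outside a block: only a block header is interesting.
            if line.endswith("{"):
                name, current = line[:-1].strip(), {}
        elif line == "}":
            blocks.append((name, current))
            name, current = None, None
        elif "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def stale_flapping(blocks):
    """List host/service blocks that are still marked as flapping
    although flap detection is disabled in the program block."""
    program = next((b for n, b in blocks if n == "program"), {})
    if program.get("enable_flap_detection") != "0":
        return []
    return [
        (n, b.get("host_name"), b.get("service_description"))
        for n, b in blocks
        if n in ("host", "service") and b.get("is_flapping") == "1"
    ]


# Abbreviated sample based on the excerpt above.
sample = """
program {
enable_flap_detection=0
}
service {
host_name=snclient
service_description=os_windows_svc_check_Server
is_flapping=1
percent_state_change=0.00
}
"""

for entry in stale_flapping(parse_retention(sample)):
    print(entry)
```

For host blocks the third element of each tuple is `None`, since they carry no service_description.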

lausser avatar Jul 31 '23 14:07 lausser

Sounds like a bug to me

nook24 avatar Aug 14 '23 08:08 nook24

Disabling enable_flap_detection does not reset all existing flapping flags. It simply prevents new flapping flags from being set. But I agree, it should not prevent notifications from being sent out.

sni avatar Aug 14 '23 09:08 sni

Isn't the flapping flag removed on the next check execution? If not, that would probably be sensible imo.

jacobbaungard avatar Aug 17 '23 07:08 jacobbaungard

I don't think this is done already, but yes, that might be the way to go.

sni avatar Aug 18 '23 07:08 sni

Thinking about it, simply resetting the flag might not be a good idea. The global enable_flap_detection flag can be (temporarily) changed by an external command. So resetting the flags would result in lots of "starting to flap" notifications again once flap detection is re-enabled.

One way would be to check the global flag at least in the notifications logic. Better than nothing, but this would still show the host/service as flapping in the UI unless the UI checks the global flag as well.

Naemon simply cannot predict whether enable_flap_detection is disabled just temporarily or forever.
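Checking the global flag in the notification logic could look roughly like the sketch below. This is not Naemon's actual code (which is C); the `Service` class and the function `flapping_blocks_notification` are hypothetical names used only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Service:
    # Hypothetical stand-in for a service object.
    name: str
    is_flapping: bool = False


def flapping_blocks_notification(svc, global_flap_detection_enabled):
    """Return True if a notification should be suppressed due to flapping.

    A flapping flag left over from before detection was disabled is
    ignored, but it is not reset either.
    """
    if not global_flap_detection_enabled:
        return False
    return svc.is_flapping


stale = Service("os_windows_svc_check_Server", is_flapping=True)
print(flapping_blocks_notification(stale, global_flap_detection_enabled=False))
print(flapping_blocks_notification(stale, global_flap_detection_enabled=True))
```

The flag itself stays untouched here, so a UI would still show the object as flapping unless it checks the global flag too.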

sni avatar Feb 06 '24 10:02 sni

Besides the quick fix from #452, we could think about slowly letting the state history grow out when the host/service flapping flag is set, even if flap detection is disabled. For example, each time a check result arrives, update the state history used for flap detection with an OK until the list is cleared (but silently skip the flapping stop notification).
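A rough sketch of that idea, to make the behaviour concrete. Everything here is an assumption for illustration: the history length, the `Service` class and the function `on_check_result` are not taken from the Naemon source, and "cleared" is interpreted as "the history contains only OK entries".

```python
from collections import deque

HISTORY_LEN = 21  # assumed length of the state history in this sketch
OK = 0


class Service:
    # Hypothetical stand-in for a service that is flagged as flapping.
    def __init__(self, history):
        self.state_history = deque(history, maxlen=HISTORY_LEN)
        self.is_flapping = True
        self.notifications = []  # would collect flapping start/stop events


def on_check_result(svc, flap_detection_enabled):
    """Let the history of a flapping object grow out while detection is off."""
    if flap_detection_enabled or not svc.is_flapping:
        return  # the regular flap detection logic would run instead
    # Push an OK regardless of the real check result; the oldest entry drops out.
    svc.state_history.append(OK)
    if all(state == OK for state in svc.state_history):
        # Clear the flag silently: no flapping stop notification is recorded.
        svc.is_flapping = False


svc = Service([0, 2] * 10 + [0])  # alternating OK/CRITICAL history
rounds = 0
while svc.is_flapping:
    on_check_result(svc, flap_detection_enabled=False)
    rounds += 1
print(rounds, svc.is_flapping, svc.notifications)
```

If flap detection is switched back on before the history has grown out, the flag is still set, so no new "starting to flap" notification would be needed.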

sni avatar Feb 06 '24 11:02 sni