Check_MK-Treasures

get_cluster_status off-by-one error

Open archer31 opened this issue 1 year ago • 7 comments

The get_cluster_status function checks that the services are running and not dead. However, the check appears to require at least two PIDs for each service rather than at least one. Is there a reason for checking for two instead of one? The plugin is reporting failures when there are none.
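To illustrate the suspected off-by-one, here is a minimal sketch of a PID-count check. The plugin's actual code is not quoted in this thread, so the names `service_is_running`, `split_services`, and `min_pids` are hypothetical, as is the assumption that each service maps to a list of PIDs.

```python
def service_is_running(pids, min_pids=1):
    """Return True if a service has at least `min_pids` processes.

    Hypothetical helper: a threshold of 2 would flag every
    single-process service as failed, which matches the symptom.
    """
    return len(pids) >= min_pids


def split_services(pids_by_service, min_pids=1):
    """Sort service names into (ok, failed) lists by PID count."""
    ok, failed = [], []
    for name in sorted(pids_by_service):
        if service_is_running(pids_by_service[name], min_pids):
            ok.append(name)
        else:
            failed.append(name)
    return ok, failed
```

With `min_pids=1`, a service with a single PID counts as running; with `min_pids=2`, the same service lands in the failed list even though systemctl would show it as active.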

archer31 avatar Aug 29 '23 00:08 archer31

Hello, which check are you talking about?

Bastian-Kuhn avatar Aug 29 '23 08:08 Bastian-Kuhn

This is for the Node Status check. My systems report that these four services have failed: logwatcher, nexus, statscollector, tricorder. But when I go to the actual system, systemctl reports that the services are running.

archer31 avatar Aug 29 '23 20:08 archer31

Hello,

There is no check with "node" in its name in this repo. I would need the plugin name.

Bastian-Kuhn avatar Sep 01 '23 06:09 Bastian-Kuhn

I am talking about this line here

archer31 avatar Sep 07 '23 18:09 archer31

Thank you, this is fixed. Could you please test with the newest version?

Bastian-Kuhn avatar Sep 08 '23 07:09 Bastian-Kuhn

It looks like that does not fix the issue. I am now getting a parse error: "Parsing of section cohesity_node_status failed" (WARN). Here is the output of that section. The problem is possibly that the failed line does not list any services.

<<<cohesity_node_status>>>
host4 failed 
host4 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda
host3 failed 
host3 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda
host2 failed 
host2 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda
host1 failed 
host1 ok alerts,apollo,athena,atom,bifrost_broker,bridge,bridge_proxy,eagle_agent,gandalf,groot,iris,iris_proxy,keychain,librarian,logwatcher,magneto,newscribe,nexus,nexus_proxy,patch,rtclient,smb2_proxy,smb_proxy,stats,statscollector,storage_proxy,tricorder,vault_proxy,yoda

This could probably be fixed just by adding some conditions to these two lines here
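As a rough sketch of what such conditions might look like, the parser below tolerates a line that carries a host and a state but no service list. It assumes each agent line arrives already split on whitespace (so `host4 failed` becomes `["host4", "failed"]`); the function name and the shape of the returned dictionary are illustrative and not taken from the plugin.

```python
def parse_cohesity_node_status(string_table):
    """Parse lines of the form: <host> <state> [<svc1,svc2,...>].

    A line without a third field (for example "host4 failed")
    yields an empty service list instead of raising IndexError.
    """
    parsed = {}
    for line in string_table:
        if len(line) < 2:
            # Not enough fields to identify host and state; skip it.
            continue
        host, state = line[0], line[1]
        if len(line) > 2 and line[2]:
            services = line[2].split(",")
        else:
            services = []
        parsed.setdefault(host, {})[state] = services
    return parsed
```

For the output above, every host would end up with an empty `failed` list and a populated `ok` list.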

archer31 avatar Sep 26 '23 00:09 archer31

But how should the check handle this failed state? Wouldn't adding conditions just ignore the error?
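One possible answer, sketched under assumptions: an empty failed list means nothing has failed, so it can map to OK, while any named service still maps to CRIT. The function name `node_state` and the input dictionary shape are hypothetical; the return values use the conventional Nagios-style state numbers (0 = OK, 2 = CRIT).

```python
def node_state(services_by_state):
    """Map a host's parsed services to (state, summary).

    Expects a dict such as {"failed": [...], "ok": [...]}.
    An empty or missing "failed" list is treated as healthy.
    """
    failed = services_by_state.get("failed", [])
    if failed:
        # At least one service is really listed as failed.
        return 2, "Failed services: " + ", ".join(failed)
    ok = services_by_state.get("ok", [])
    return 0, f"{len(ok)} services running"
```

Handled this way, the condition does not hide real errors: a failed line that actually names services still produces a critical result.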

Bastian-Kuhn avatar Oct 19 '23 08:10 Bastian-Kuhn

Hello @archer31, that should now finally be fixed.

Bastian-Kuhn avatar Aug 30 '24 11:08 Bastian-Kuhn