
Inactive service reported as active

Open ajm83 opened this issue 6 years ago • 3 comments

When a service is down, the plugin still returns an exit status of 0 (OK).

root@vps3:~/icinga/nagios-plugins-master# /usr/lib/nagios/plugins/check_service.sh -s postfix -o linux
   Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago
root@vps3:~/icinga/nagios-plugins-master# echo $?
0

I think this is incorrect, and the bug is due to the grep on line 79:

SERVICETOOL="systemctl status $SERVICE | grep -i Active"

I can fix this by removing the grep on this line as follows:

SERVICETOOL="systemctl status $SERVICE"

This means that SERVICETOOL takes its exit status from systemctl itself rather than from grep, which matches the text 'Active' and 'inactive' in the status line case-insensitively and therefore returns zero.
Unfortunately, the status text returned without the grep is rather verbose, but at least the status code (critical, OK, etc.) is correct. Maybe the grep for '-i Active' could be moved to later in the script, after the actual status code check has been performed.
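
To illustrate the masking described above, here is a minimal sketch that does not touch the plugin itself. fake_status is a hypothetical stand-in for "systemctl status" on a stopped unit (it prints a status line and returns 3, the code systemctl normally uses for an inactive unit), so the example runs on machines without systemd:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "systemctl status <unit>" on a stopped unit:
# it prints a status line and returns a non-zero exit status.
fake_status() {
    echo "   Active: inactive (dead)"
    return 3
}

# Pipeline: $? is the exit status of grep, the last command.
fake_status | grep -i Active
masked_status=$?

# Same command without the pipe: $? is the command's own exit status.
fake_status > /dev/null
real_status=$?

echo "pipeline: $masked_status, command alone: $real_status"
```

The pipeline reports success because grep found a match, even though the command feeding it failed.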

OS is Ubuntu Xenial (16.04). The last update in the header of the script is 2018-04-25.

Thanks for the script. Andy.

ajm83 avatar May 24 '18 12:05 ajm83

Yes!

Two ways to work around this:

  1. Manually specify the tool to use with the -t option:
$ check_service.sh -t "systemctl is-active mpd-cd"; echo $?
inactive
2

This works, but you only get "active" or "inactive" as the status text.

  2. Tweak the script to return the exit status of the first command in the pipeline. This assumes you are using bash, but the first line of the script is #!/usr/bin/env bash, so that seems a reasonable assumption.

replace SERVICETOOL="systemctl status $SERVICE | grep -i Active"

with SERVICETOOL='systemctl status $SERVICE | grep "^ *Active:" ; ( exit ${PIPESTATUS[0]} )'

Note that this also fixes another potential problem with the grep for "Active"; see the comment at [https://github.com/jonschipp/nagios-plugins/commit/939de4b432aa4f1eaa90671f8905c1595639fbdb#comments]

$ check_service_tweaked -s mpd-cd; echo $?
   Active: inactive (dead) since Sun 2018-07-22 09:20:04 AEST; 3h 57min ago
2

This way you still get the nice one-line summary.
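
A minimal sketch of the PIPESTATUS approach, outside the plugin. fake_status is again a hypothetical stand-in for "systemctl status" on a stopped unit, and the mapping from the raw exit status to the plugin's CRITICAL (2) result is not reproduced here:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "systemctl status <unit>" on a stopped unit.
fake_status() {
    echo "   Active: inactive (dead)"
    return 3
}

# Keep grep's one-line summary, but leave the subshell with the exit
# status of the FIRST command in the pipeline (bash-only PIPESTATUS array).
summary=$(fake_status | grep "^ *Active:"; exit "${PIPESTATUS[0]}")
status=$?

echo "$summary"
echo "exit status: $status"
```

PIPESTATUS has to be read immediately after the pipeline, because the next command overwrites it.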

thebream avatar Jul 22 '18 03:07 thebream

Actually, you can simply add this to the top of the script: set -o pipefail

The issue is that the successful grep -i Active is masking the exit code of systemctl status $SERVICE on line 79
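
A small sketch of the difference pipefail makes, using a hypothetical fake_status function in place of "systemctl status" so it can be run anywhere with bash. Whether this works unchanged inside the plugin depends on how the script executes SERVICETOOL, which I have not checked:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "systemctl status <unit>" on a stopped unit.
fake_status() {
    echo "   Active: inactive (dead)"
    return 3
}

# Default behaviour: the pipeline's status is that of the last command (grep).
fake_status | grep -i Active > /dev/null
without_pipefail=$?

# With pipefail: the pipeline's status is that of the rightmost command
# that exited non-zero, or zero if every command succeeded.
set -o pipefail
fake_status | grep -i Active > /dev/null
with_pipefail=$?
set +o pipefail

echo "without pipefail: $without_pipefail, with pipefail: $with_pipefail"
```

Note that pipefail affects every pipeline in the script, so a grep that finds no match anywhere else would also start failing its pipeline.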

jlevy8 avatar Aug 05 '19 13:08 jlevy8

Should we add "set -o pipefail" to the top of the script to fix the Active - Failed status being reported as OK instead of Critical? Maybe a new PR for this?

gardart avatar Sep 11 '20 09:09 gardart