nagios-plugins
Inactive service reported as active
When a service is down, the plugin still returns an exit status of OK (zero).
root@vps3:~/icinga/nagios-plugins-master# /usr/lib/nagios/plugins/check_service.sh -s postfix -o linux
Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago
root@vps3:~/icinga/nagios-plugins-master# echo $?
0
I think this is incorrect, and the bug is due to the grep on line 79:
SERVICETOOL="systemctl status $SERVICE | grep -i Active"
I can fix this by removing the grep on this line as follows:
SERVICETOOL="systemctl status $SERVICE"
which means that SERVICETOOL takes its exit status from systemctl itself rather than from grep (which, being case-insensitive, matches both 'Active' and 'inactive' in the status line and therefore returns zero).
Unfortunately, the status text returned without the grep is rather verbose, but at least the status code (critical, OK, etc.) is correct. Maybe the grep -i Active could be moved later in the script, after the actual status code check has been performed.
OS is Ubuntu Xenial (16.04). The last update in the header of the script is 2018-04-25.
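To illustrate the "check the status first, grep later" idea, here is a minimal sketch. It is not the plugin's code: fake_status is a hypothetical stand-in for systemctl status on a stopped unit (it prints a status block and returns 3, the code systemctl status uses for an inactive unit), and the variable names are made up for the example.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `systemctl status postfix` on a stopped service.
# It prints a status block and returns 3 (inactive unit).
fake_status() {
    printf '%s\n' \
        '* postfix.service - Postfix Mail Transport Agent' \
        '   Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago'
    return 3
}

# Pattern from line 79: the pipeline's exit status is grep's, so it is 0.
fake_status | grep -i Active >/dev/null
masked_rc=$?

# Alternative: capture the output and exit status first, filter afterwards.
output=$(fake_status)
status_rc=$?
summary=$(printf '%s\n' "$output" | grep -i 'Active:')

echo "masked_rc=$masked_rc status_rc=$status_rc"
echo "$summary"
```

With this ordering the one-line summary is still available for display, while the OK/critical decision is based on status_rc rather than on grep's result.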
Thanks for the script. Andy.
Yes!
Two ways to work around this:
- manually specify the tool to use with the -t option, e.g. -t "systemctl is-active SERVICE"
$ check_service.sh -t "systemctl is-active mpd-cd"; echo $?
inactive
2
This works, but you only get a one-word state such as "active" or "inactive" as the status text
- tweak the script to return the exit status of the first command in the pipeline (this assumes you are using bash, but the first line of the script is #!/usr/bin/env bash, so that seems a reasonable assumption).
replace
SERVICETOOL="systemctl status $SERVICE | grep -i Active"
with
SERVICETOOL='systemctl status $SERVICE | grep "^ *Active:" ; ( exit ${PIPESTATUS[0]} )'
Note that this also fixes another potential problem with the grep for "Active"; see the comment at [https://github.com/jonschipp/nagios-plugins/commit/939de4b432aa4f1eaa90671f8905c1595639fbdb#comments]
$ check_service_tweaked -s mpd-cd; echo $?
Active: inactive (dead) since Sun 2018-07-22 09:20:04 AEST; 3h 57min ago
2
This way you still get the nice one-line summary.
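For anyone who wants to try the PIPESTATUS approach outside the plugin, here is a small self-contained sketch. fake_status is a hypothetical stand-in for systemctl status on a stopped unit, so nothing here touches real services. The command substitution plays the role of the ( exit ... ) subshell in the tweak above.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `systemctl status mpd-cd` on a stopped service.
# It prints a status line and returns 3 (inactive unit).
fake_status() {
    printf '%s\n' '   Active: inactive (dead) since Sun 2018-07-22 09:20:04 AEST; 3h 57min ago'
    return 3
}

# Run the pipeline in a subshell, keep grep's output, and exit with the
# status of the FIRST command in the pipeline (bash-only PIPESTATUS array).
summary=$(fake_status | grep '^ *Active:'; exit "${PIPESTATUS[0]}")
rc=$?

echo "$summary"
echo "rc=$rc"
```

PIPESTATUS has to be read immediately after the pipeline, because the next command that runs overwrites it.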
Actually, you can simply add this to the top of the script:
set -o pipefail
The issue is that the successful grep -i Active is masking the exit code of systemctl status $SERVICE on line 79.
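A quick way to see the effect of pipefail in isolation is the sketch below. fake_status is a hypothetical stand-in for systemctl status on a stopped unit (returns 3), so this only demonstrates the shell behaviour, not the plugin itself.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for `systemctl status $SERVICE` on a stopped service.
fake_status() {
    printf '%s\n' '   Active: inactive (dead) since Thu 2018-05-24 12:46:25 BST; 8s ago'
    return 3
}

# Default behaviour: the pipeline reports the status of the last command (grep).
fake_status | grep -i Active >/dev/null
without_pipefail=$?

# With pipefail: the pipeline reports the last non-zero status in the pipeline.
set -o pipefail
fake_status | grep -i Active >/dev/null
with_pipefail=$?

echo "without_pipefail=$without_pipefail with_pipefail=$with_pipefail"
```

One thing to keep in mind: with pipefail set, a grep that finds no match (exit 1) also makes the pipeline non-zero even when the first command succeeded, so any other pipelines in the script would need a quick review.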
Should we add "set -o pipefail" to the top of the script to fix the Active - Failed status being reported as OK instead of Critical? Maybe a new PR for this?