nagios-plugins
check_service.sh wrong $STATUS_MSG cases
Hi, first of all thanks for your easy-to-understand Nagios checks.
On Debian Jessie with systemctl, I had to remove two cases because they would respond with a wrong exit code.
These are the two I had to remove:
*SUCCESS*) *[eE]nable*)
The script responds with an exit code of 0 for both of these cases. But as the output below shows, stopping a service with systemctl stop $service can leave (code=exited, status=0/SUCCESS) in the status output, and "enabled" in systemctl only means that the service is set to start automatically, not that it is running.
active service
● abc.service - Key Management Service Emulator in C
Loaded: loaded (/etc/systemd/system/abc.service; enabled)
Active: active (running) since Tue 2016-11-22 12:54:38 CET; 2s ago
Docs: man:abc(8)
Main PID: 23983 (abc)
CGroup: /system.slice/abc.service
└─23983 /usr/local/bin/abc -l syslog -c1 -M1 -D
dead service
● abc.service - Key Management Service Emulator in C
Loaded: loaded (/etc/systemd/system/abc.service; enabled)
Active: inactive (dead) since Tue 2016-11-22 12:55:45 CET; 2s ago
Docs: man:abc(8)
Process: 23983 ExecStart=/usr/local/bin/abc $abc_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 23983 (code=exited, status=0/SUCCESS)
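To illustrate the problem, here is a minimal sketch of whole-output matching. It is a reduction for demonstration only: classify_whole_output is a hypothetical name, and the order and number of patterns in the real check_service.sh may differ.

```shell
#!/bin/sh
# Hypothetical reduction of matching against the full `systemctl status` text.
# classify_whole_output is an illustrative name, not a function from check_service.sh.
classify_whole_output() {
    case "$1" in
        *SUCCESS*)   echo "OK" ;;        # also fires on "status=0/SUCCESS" of an exited process
        *[eE]nable*) echo "OK" ;;        # also fires on "; enabled)" in the Loaded: line
        *)           echo "CRITICAL" ;;
    esac
}

# Shortened copy of the dead-service output shown above.
dead_status='Loaded: loaded (/etc/systemd/system/abc.service; enabled)
Active: inactive (dead) since Tue 2016-11-22 12:55:45 CET; 2s ago
Main PID: 23983 (code=exited, status=0/SUCCESS)'

classify_whole_output "$dead_status"   # prints OK although the service is dead
```

Both patterns match text that says nothing about whether the service is currently running, which is why removing them changes the result here.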
Best Regards skirschner
Hi, brilliant little script, thanks for writing it.
I'm seeing a similar issue when checking a service. It's reporting critical when the service is actually running. Funnily, this only happens if the service has been restarted, not when it starts during boot, but I'm pretty sure that's just a difference in the status output after a restart. The output for my running service is:
● monit.service - LSB: service and resource monitoring daemon
Loaded: loaded (/etc/init.d/monit; bad; vendor preset: enabled)
Active: active (running) since Fri 2016-12-30 02:14:27 GMT; 50min ago
Docs: man:systemd-sysv-generator(8)
Process: 23541 ExecStop=/etc/init.d/monit stop (code=exited, status=0/SUCCESS)
Process: 23547 ExecStart=/etc/init.d/monit start (code=exited, status=0/SUCCESS)
Tasks: 2
Memory: 2.4M
CPU: 15.443s
CGroup: /system.slice/monit.service
└─23557 /usr/bin/monit -c /etc/monit/monitrc
It's being returned as critical because the word 'stop' is detected in the status message.
However, I didn't think it was a good idea to adjust or remove the case checks, so instead I updated the command used to get the status. In my case it uses
SERVICETOOL="systemctl status $SERVICE"
I've changed it to just look at the Active line
SERVICETOOL="systemctl status $SERVICE | grep Active"
This works fine for me but would need more testing to ensure it doesn't affect anything else. I do have another service on a different server that always reports as OK even if the service is stopped. I'm fairly confident it's going to be fixed by this change too.
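A rough sketch of the same idea as a function, in case it helps with testing. It assumes the status text arrives on stdin and that the usual Nagios exit codes apply (0 OK, 2 CRITICAL, 3 UNKNOWN). classify_active_line is a hypothetical helper, not something that exists in check_service.sh.

```shell
#!/bin/sh
# Sketch: classify `systemctl status` text (read from stdin) by its "Active:" line only.
# classify_active_line is a hypothetical helper, not part of check_service.sh.
classify_active_line() {
    # Keep only the first line that mentions "Active:"; ExecStop/ExecStart lines are ignored.
    line=$(grep 'Active:' | head -n 1)
    case "$line" in
        *"Active: active "*) echo "OK"; return 0 ;;        # e.g. "Active: active (running) since ..."
        "")                  echo "UNKNOWN"; return 3 ;;   # no Active: line found at all
        *)                   echo "CRITICAL"; return 2 ;;  # inactive, failed, activating, ...
    esac
}
```

It could be tried with something like: systemctl status "$SERVICE" | classify_active_line. Note that the pattern requires "Active: active " with a trailing space, so "Active: inactive (dead)" does not match it.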
Hi, yes, sure, this would be better.
A few weeks ago I saw that systemctl offers a command called "is-active", but because I only work with Debian systems at the moment, I won't rewrite the code.
Hey guys, thanks for reporting but I probably won't have time to fix this. Feel free to send a PR and I'll get it merged in.
I also ran into this today: after a reboot, the status check returns CRITICAL. I think it's probably sufficient to check the exit code from systemctl when running under systemd? IMHO, it SHOULD always be sufficient to check the exit code, though I'm well aware that on some Ubuntu versions the init scripts would always exit with 0, regardless of whether the service was up or down.
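Something like the following is what I have in mind, as a sketch only. It relies on systemctl is-active --quiet exiting with 0 when the unit is active and non-zero otherwise. The SYSTEMCTL variable and the check_by_exit_code name are assumptions added here so the function can be tried without systemd; neither exists in check_service.sh.

```shell
#!/bin/sh
# Sketch: rely on the exit code of `systemctl is-active` instead of parsing text.
# SYSTEMCTL is an assumed override hook; it defaults to the real systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# check_by_exit_code is a hypothetical helper name. $1 is the unit to check.
check_by_exit_code() {
    if "$SYSTEMCTL" is-active --quiet "$1"; then
        echo "OK: $1 is active"
        return 0
    fi
    echo "CRITICAL: $1 is not active"
    return 2
}
```

Usage would be along the lines of check_by_exit_code abc.service, with the return value passed on as the plugin's exit code.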
I ran into this, too (also on a Debian system). I have currently changed the call to check_service to:
/usr/local/bin/check_service -o linux -t "systemctl is-active $servicename"
According to the systemctl man page, "systemctl status" is intended to generate human-readable output. If the output is to be consumed by a script, "systemctl show" should be used, which generates "key=value" pairs. The most important ones for this script are probably:
LoadState=loaded
ActiveState=active
SubState=running
UnitFileState=enabled
(If "is-active" is not enough ...)
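For the "systemctl show" route, a minimal sketch of reading one property and mapping it to a Nagios result could look like this. get_prop and state_to_nagios are hypothetical helper names, and treating every state other than "active" as CRITICAL is a simplifying assumption.

```shell
#!/bin/sh
# Sketch: consume the key=value output of `systemctl show`.
# get_prop and state_to_nagios are hypothetical helpers, not part of check_service.sh.

# Print the value of property $1 from key=value lines on stdin.
get_prop() {
    sed -n "s/^$1=//p" | head -n 1
}

# Map an ActiveState value to a Nagios message and exit code.
state_to_nagios() {
    case "$1" in
        active) echo "OK"; return 0 ;;
        "")     echo "UNKNOWN"; return 3 ;;    # property missing from the output
        *)      echo "CRITICAL"; return 2 ;;   # inactive, failed, activating, ...
    esac
}
```

A call might then read: state=$(systemctl show -p ActiveState "$servicename" | get_prop ActiveState); state_to_nagios "$state". SubState or LoadState could be checked the same way if ActiveState alone turns out to be too coarse.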
The problem with is-active (ref comments above) is that it is not available on old versions of systemd. But maybe it can be used now anyway? Those old versions are probably EOL-systems anyway?
I believe that fixes it (at least in my environment, a recent stock CentOS 7 installation).