Node exporter always returns 143 when receiving a SIGTERM

Open rkachach opened this issue 2 years ago • 11 comments

Host operating system: output of uname -a

Linux fedora 5.15.14-200.fc35.x86_64 #1 SMP Tue Jan 11 16:49:27 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

latest version (as run by Docker) quay.io/prometheus/node-exporter:latest

node_exporter command line flags

docker run -d --net="host" --pid="host" -v "/:/host:ro,rslave" 'quay.io/prometheus/node-exporter:latest' --path.rootfs=/host

Are you running node_exporter in Docker?

What did you do that produced an error?

docker stop <container_id>

What did you expect to see?

exit value code 0

What did you see instead?

The exit code is always 143. It seems node-exporter is not handling the SIGTERM signal (sent by docker stop) correctly. Exit code 143 is what is normally reported when a process is terminated by a SIGTERM from the underlying operating system. Since we are stopping the container properly (using docker stop), the exit code should be zero. The same happens when podman is used instead of docker.

rkachach avatar Apr 25 '22 10:04 rkachach

As far as I can tell it is quite common for programs to exit with 143. Even sleep will exit with 143 when TERM'd. Do you have an example of something that exits with 0 in this situation? Can you elaborate on what problem this is causing?

ventifus avatar Jun 08 '22 16:06 ventifus

@ventifus thanks for taking the time to look into this. Basically, node-exporter is the only process that exits with 143 instead of zero when its container is stopped. All the other Prometheus daemons finish and return exit code 0 when stopped with docker stop. In our environment, since we run the container under systemd, the service ends up in an error state even when stopped gracefully.

rkachach avatar Jun 14 '22 13:06 rkachach

Hrm, I'm not 100% sure what the right approach is, but if all other Prometheus components behave like that, we probably should too. @SuperQ wdyt?

discordianfish avatar Jul 05 '22 17:07 discordianfish

Any update on this? thanks.

rkachach avatar Sep 02 '22 09:09 rkachach

I wonder if this is something we should add to the exporter-toolkit.

SuperQ avatar Sep 02 '22 10:09 SuperQ

any update on this?

rkachach avatar Jul 03 '23 11:07 rkachach

@rkachach No, feel free to implement this.

discordianfish avatar Jul 03 '23 12:07 discordianfish

> @rkachach No, feel free to implement this.

I'll be more than happy to contribute but I'm not familiar with the node-exporter code base nor what would be the best solution, so in practice I can't fix it at this moment (otherwise I would have already done it).

rkachach avatar Jul 03 '23 12:07 rkachach

I am not very sure of the best way to handle this, but perhaps adding a signal handler that calls server.Shutdown is enough? FYI: https://gin-gonic.com/docs/examples/graceful-restart-or-stop/

Bogay avatar Jul 07 '23 18:07 Bogay

It's easy: trap the signal and exit with 0 in the handler. But as @SuperQ said, adding this to the exporter-toolkit would be best.

discordianfish avatar Jul 10 '23 10:07 discordianfish

@discordianfish Do you mean calling os.Exit(0) in the signal handler?

Bogay avatar Jul 10 '23 10:07 Bogay