consul-template
consul-template copied to clipboard
Non-error exit from exec'd child process is logged as an error
Background: https://discuss.hashicorp.com/t/why-0-exit-code-of-exec-command-is-an-error/2904
Currently, when a child process dies consul-template uses the Runner's ErrorCh to propagate that information: https://github.com/hashicorp/consul-template/blob/2734e5125874e6debb60765cf0e2ef1a65ee7737/manager/runner.go#L318
Because the message was passed into the ErrCh, when the CLI receives that message it gets logged as an error, even if the status code was 0:
[ERR] (cli) child process died with exit code 0
Consul-template's exit code will correctly reflect the exit status of the child, but the logging shows any child exit as an ERR. This can be confusing to users who do not expect to see a 0 status code as an error.
Here are two trivial examples to see the handling of the child exit code:
-
consul-template -exec true
: logged as error, consul-template exits with 0 -
consul-template -exec false
: logged as error, consul-template exits with 1
A couple ways this could be handled:
- If the child exit code is 0, send a message into the runner's
DoneCh
rather than theErrCh
- Change logError in the CLI so that the logging (ERR/INFO) is based on the status code that was provided to it.
The exec feature was added to allow for managing a long running process. Usually a service of some sort. In this way if they child process exits for any reason it is seen as an error as it is not supposed to exit.
The use case here is "backup jobs" that obviously exit.
The exit code of the consul-template process does match the exit code of the child process. The log entry matches this even if it seems wrong that it is on the error channel.
I believe I've talked myself into thinking that changing the log-level of the log entry is the correct fix. Changing it to INFO in the case of a successful exit of the child process.
Our services that run in docker containers use docker-entrypoint.sh like this:
exec consul-template \
-consul-addr "http://consul.service.consul:8500" \
-template "/templates/some.ctmpl:some.conf" \
-exec "service-executable"
we use exec consul-template
for it to receive SIGTERM (as it is generated by docker stop
) and pass it to our service for graceful shutdown. So stopping a long running service which exited with zero exit code is not an error.
@eikenb I have a compromise here :)
The use case we have is the same as @AOne-T but we use the -once
flag which obviously means that we kind of expect the script to exit after the templates are generated (because we want to reload another process, in our case).
Do you think this is a good compromise in removing the [ERR]
log in that case?
I skipped this before as it was so easy I thought it'd make a good first issue for someone who wanted to contribute, but it's been awhile and this is really an easy fix... so I'm just going to do it and take the return code into account and use INFO for 0/success and ERR for >0/error.
PR is already up... #1649
Thank you so much!