consul-template icon indicating copy to clipboard operation
consul-template copied to clipboard

Non-error exit from exec'd child process is logged as an error

Open freddygv opened this issue 5 years ago • 2 comments

Background: https://discuss.hashicorp.com/t/why-0-exit-code-of-exec-command-is-an-error/2904

Currently, when a child process dies consul-template uses the Runner's ErrorCh to propagate that information: https://github.com/hashicorp/consul-template/blob/2734e5125874e6debb60765cf0e2ef1a65ee7737/manager/runner.go#L318

Because the message was passed into the ErrCh, when the CLI receives that message it gets logged as an error, even if the status code was 0:

[ERR] (cli) child process died with exit code 0

Consul-template's exit code will correctly reflect the exit status of the child, but the logging shows any child exit as an ERR. This can be confusing to users who do not expect to see a 0 status code as an error.

Here are two trivial examples to see the handling of the child exit code:

  • consul-template -exec true: logged as error, consul-template exits with 0
  • consul-template -exec false: logged as error, consul-template exits with 1

A couple ways this could be handled:

  • If the child exit code is 0, send a message into the runner's DoneCh rather than the ErrCh
  • Change logError in the CLI so that the logging (ERR/INFO) is based on the status code that was provided to it.

freddygv avatar Sep 19 '19 14:09 freddygv

The exec feature was added to allow for managing a long running process. Usually a service of some sort. In this way if they child process exits for any reason it is seen as an error as it is not supposed to exit.

The use case here is "backup jobs" that obviously exit.

The exit code of the consul-template process does match the exit code of the child process. The log entry matches this even if it seems wrong that it is on the error channel.

I believe I've talked myself into thinking that changing the log-level of the log entry is the correct fix. Changing it to INFO in the case of a successful exit of the child process.

eikenb avatar Sep 19 '19 20:09 eikenb

Our services that run in docker containers use docker-entrypoint.sh like this:

exec consul-template \
  -consul-addr "http://consul.service.consul:8500" \
  -template "/templates/some.ctmpl:some.conf" \
  -exec "service-executable"

we use exec consul-template for it to receive SIGTERM (as it is generated by docker stop) and pass it to our service for graceful shutdown. So stopping a long running service which exited with zero exit code is not an error.

AOne-T avatar Oct 15 '19 14:10 AOne-T

@eikenb I have a compromise here :)

The use case we have is the same as @AOne-T but we use the -once flag which obviously means that we kind of expect the script to exit after the templates are generated (because we want to reload another process, in our case).

Do you think this is a good compromise in removing the [ERR] log in that case?

komapa avatar Sep 29 '22 19:09 komapa

I skipped this before as it was so easy I thought it'd make a good first issue for someone who wanted to contribute, but it's been awhile and this is really an easy fix... so I'm just going to do it and take the return code into account and use INFO for 0/success and ERR for >0/error.

PR is already up... #1649

eikenb avatar Sep 29 '22 23:09 eikenb

Thank you so much!

komapa avatar Oct 18 '22 17:10 komapa