consul-template
consul-template copied to clipboard
[windows]: Kill failed: `"not supported by windows"`
Consul Template version
2022-07-22T06:56:28.852Z [INFO] consul-template v0.29.1 (4525703)
Configuration
log_level = "trace"
# Setting this value to the empty string will cause consul-template
# to not listen for any reload signals but pass them down to nginx instead.
reload_signal = ""
consul {
retry {
enabled = true
attempts = 10
max_backoff = "1s"
}
}
vault {
retry {
enabled = true
attempts = 10
max_backoff = "1s"
}
}
template {
destination = "secrets/ssl/keys/connect-ca.pem"
contents = <<EOC
{{ range caRoots }}{{ .RootCertPEM }}{{ end }}
EOC
}
template {
destination = "secrets/ssl/keys/connect-client.pem"
contents = <<EOC
{{ with caLeaf "connect-ingress" }}{{ .CertPEM }}{{ end }}
EOC
}
template {
destination = "secrets/ssl/keys/connect-client.key"
contents = <<EOC
{{ with caLeaf "connect-ingress" }}{{ .PrivateKeyPEM }}{{ end }}
EOC
}
template {
source = "templates/certificates.hcl.ctmpl"
destination = "secrets/certificates.hcl"
exec {
command = ["consul-template", "-once", "-config", "secrets/certificates.hcl"]
timeout = "30s"
}
}
template {
source = "templates/ingress.conf.ctmpl"
destination = "local/nginx/stream.d/ingress.conf"
}
template {
source = "templates/ingress-upstreams.conf.ctmpl"
destination = "local/nginx/conf.d/ingress-upstreams.conf"
}
template {
source = "templates/http-status.conf.ctmpl"
destination = "local/nginx/conf.d/http-status.conf"
}
exec {
# We are setting the kill_timeout to the max 5m that Nomad is going to wait before killing consul-template itself
# This way we allow Nomad to be the one controlling the kill_timeout and not consul-template
kill_timeout = "5m"
# This is a random splay to wait before killing the command. The default
# value is 0 (no wait), but large clusters should consider setting a splay
# value to prevent all child processes from reloading at the same time when
# data changes occur. When this value is set to non-zero, Consul Template
# will wait a random period of time up to the splay value before reloading
# or killing the child process. This can be used to prevent the thundering
# herd problem on applications that do not gracefully reload.
splay = "30s"
# It is very important to use the "array" syntax when defining parameters to the command,
# otherwise consul-template does not pass signals correctly to the spawned process.
# See: https://github.com/hashicorp/consul-template#multiple-commands
command = ["nginx"]
}
Command
CMD ["consul-template", "-config", "C:/config.hcl"]
Debug output
https://gist.github.com/kirooshu/886104e80c8b6c185a803370777803b6
Expected behavior
In this case I would have expected consul-template to be able to terminate the nginx
process and restart.
Actual behavior
2022-07-22T17:11:11.166Z [ERR] (child) Kill failed: OpenProcess: The parameter is incorrect.
Hey @kirooshu, thanks for the report.
This has to come from one of the 2 calls in this block. The setpgid
bit is the only change in recent versions that might have triggered this issue.
https://github.com/hashicorp/consul-template/blob/ae9402466397618b825858e75a00ebb04587f228/child/child.go#L389-L399
This is probably fixed on HEAD as I've already changed this on the main
branch to only use setpgid
for a specific kind of sub-process that you won't have in a Windows environment.
TLDR; I think this is already fixed on the main branch. Can you build it yourself to test? If not I could probably get you a binary of it.
Thanks.
Thank you. For us, https://github.com/hashicorp/consul-template/issues/1428 is more important to actually make sure consul-template does not even kill the nginx process. If you get me a windows build of latest, I can definitely try it and confirm that it is fixed in practice. I can also test when you release the new version as well, unless you think that is too late to confirm with a release?
I'm working on a 0.29.2 bugfix release now (unrelated to this, but forcing a priority change) and will be releasing that in the next few weeks. I'm going to go with my guess that this is fixed already by the other changes and resolve this as part of that release. If for some reason it is still around we can always reopen this. Thanks.
Fixed by #1600
Hello @eikenb, unfortunately the error now switched to this:
2022-09-29T21:07:22.887Z [INFO] (child) reloading process
2022-09-29T21:07:37.327Z [ERR] (cli) 1 error occurred:
* not supported by windows
I even tried to force to a known signal, same problem:
CMD ["consul-template", "-config", "C:/config.hcl", "-exec-kill-signal", "SIGTERM", "-exec-reload-signal", "SIGTERM"]
Any ideas?
Hey @kirooshu... I'm not sure what's up with that. Windows supports SIGTERM and I can't find any issues eyeballing the code. Any chance of a conflicting setting in that config.hcl file?
This is the config on disk:
C:\>type config.hcl
log_level = "info"
# Setting this value to the empty string will cause consul-template
# to not listen for any reload signals but pass them down to nginx instead.
reload_signal = ""
consul {
retry {
enabled = true
attempts = 10
max_backoff = "1s"
}
}
vault {
retry {
enabled = true
attempts = 10
max_backoff = "1s"
}
}
template {
destination = "secrets/ssl/keys/connect-ca.pem"
contents = <<EOC
{{ range caRoots }}{{ .RootCertPEM }}{{ end }}
EOC
}
template {
destination = "secrets/ssl/keys/connect-client.pem"
contents = <<EOC
{{ with caLeaf "connect-ingress" }}{{ .CertPEM }}{{ end }}
EOC
}
template {
destination = "secrets/ssl/keys/connect-client.key"
contents = <<EOC
{{ with caLeaf "connect-ingress" }}{{ .PrivateKeyPEM }}{{ end }}
EOC
}
template {
source = "templates/certificates.hcl.ctmpl"
destination = "secrets/certificates.hcl"
exec {
command = ["consul-template", "-once", "-config", "secrets/certificates.hc
l"]
timeout = "30s"
}
}
template {
source = "templates/http-status.conf.ctmpl"
destination = "local/nginx/conf.d/http-status.conf"
}
template {
source = "templates/ingress.conf.ctmpl"
destination = "local/nginx/stream.d/ingress.conf"
}
template {
source = "templates/ingress-upstreams.conf.ctmpl"
destination = "local/nginx/conf.d/ingress-upstreams.conf"
}
exec {
# We are setting the kill_timeout to the max 5m that Nomad is going to wait
before killing consul-template itself
# This way we allow Nomad to be the one controlling the kill_timeout and not
consul-template
kill_timeout = "5m"
# data changes occur. When this value is set to non-zero, Consul Template
# will wait a random period of time up to the splay value before reloading
# or killing the child process. This can be used to prevent the thundering
# herd problem on applications that do not gracefully reload.
splay = "30s"
# It is very important to use the "array" syntax when defining parameters to the command,
# otherwise consul-template does not pass signals correctly to the spawned process.
# See: https://github.com/hashicorp/consul-template#multiple-commands
command = ["nginx"]
}
and confirming this is what is in the Dockerfile: CMD ["consul-template", "-config", "C:/config.hcl", "-exec-kill-signal", "SIGTERM", "-exec-reload-signal", "SIGTERM"]
We run OpenResty 1.21.x on a nanoserver-windows-1809
I tried to use windows:servercore-ltsc2019
as a base image and the problem was the same.
Forgot to reopen... doing that and updating the title.
We would still love to find a solution here if possible!