of-watchdog
JVM never receives SIGTERM (shutdownhook never called)
Expected Behaviour
I would expect the JVM to receive SIGTERM and then terminate. I am running an http4s Scala web server in this project: https://github.com/hejfelix/fp-exercises-and-grading/blob/master/http4s_faas/openfaas/Dockerfile
Current Behaviour
JVM never shuts down before docker container is killed
Possible Solution
Not sure, it seems like watchdog is not forwarding the shutdown hook?
Steps to Reproduce (for bugs)
Add a shutdown hook to any function running in http mode
Context
I want to be able to clean up resources, e.g. database connections, unfinished operations, etc.
Your Environment
Running on Docker Swarm on my MacBook Pro
Client: Docker Engine - Community
Version: 18.09.0-ce-beta1
API version: 1.39
Go version: go1.10.4
Git commit: 78a6bdb
Built: Thu Sep 6 22:41:53 2018
OS/Arch: darwin/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.0-ce-beta1
API version: 1.39 (minimum version 1.12)
Go version: go1.10.3
Git commit: 78a6bdb
Built: Thu Sep 6 22:49:35 2018
OS/Arch: linux/amd64
Experimental: true
Hi, thanks for the question. You get a graceful shutdown period built in. This gives any in-flight HTTP requests a grace period to complete (you should be able to read about this in the readme).
The period is tied to write_timeout, which I see you may not be specifying correctly at present. A Go duration string is needed, e.g. 20s.
Derek add label: question
What I'm thinking about is a hook whenever a container is taken out of commission (e.g. scale to 0, function removed, etc.). Having a warm JVM means that collecting/releasing resources on each invocation seems a bit silly, so it would be nice to know when the resources MUST be released.
Are you saying it does indeed forward SIGTERM to the fprocess?
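To make the request concrete, here is a minimal sketch of the pattern being asked for: acquire a resource once while the JVM is warm and release it only when the JVM shuts down. `WarmResource` and `WarmFunction` are hypothetical names used for illustration, not part of http4s or of-watchdog, and the sketch assumes the JVM actually receives SIGTERM.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for an expensive resource such as a connection pool.
final class WarmResource {
  private val closed = new AtomicBoolean(false)

  def handle(request: String): String = {
    require(!closed.get, "resource already released")
    s"handled: $request"
  }

  // Idempotent: returns true only for the call that actually released it.
  def close(): Boolean = closed.compareAndSet(false, true)
}

object WarmFunction {
  // Acquired once on first use, then reused across invocations.
  lazy val resource: WarmResource = {
    val r = new WarmResource
    // Runs when the JVM begins an orderly shutdown, e.g. after SIGTERM.
    Runtime.getRuntime.addShutdownHook(new Thread {
      override def run(): Unit = { r.close(); () }
    })
    r
  }

  def invoke(request: String): String = resource.handle(request)
}
```

The hook only helps if the signal reaches the JVM, which is the open question in this issue.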
Hi @hejfelix I would be curious as to what Lambda, Google Functions or Azure Functions do in this scenario. If they do pass on SIGTERM then we should investigate it.
Alex
Kubernetes has a rather complicated shutdown procedure regarding health-checks which does not make things easy. cc @LucasRoesler @stefanprodan
My experience with AWS Lambda was that there was no shutdown hook. This essentially meant that I wasted a lot of time acquiring and releasing resources on every invocation. This defeats the purpose of having a warm function.
There are a couple of things that happen during shutdown in a k8s pod
- the Pod is marked as terminated/deleted, this also removes it from various Endpoints and Services in k8s
- the preStopHook is executed, this allows the Pod to respond to and customize the stop behavior. But ... this is not something we support configuration of
- SIGTERM is sent to the Pod
- when the grace period expires, SIGKILL is sent. The grace period defaults to 30 seconds and can be customized on the Pod spec via terminationGracePeriodSeconds
By far the simplest thing we could do is to listen for SIGTERM and pass it on to the child process, which should already be happening (this is the relevant chunk of code that sends the TERM to the fprocess in http mode).
In short, a quick review of the code looks like it is trying to send the TERM signal to the fprocess, in this case the JVM.
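For illustration, the general idea of a supervisor passing termination on to its child can be sketched in Scala as below. This is not the watchdog's code (the watchdog is written in Go); `ForwardTerm`, `supervise` and `graceSeconds` are hypothetical names. It assumes a Unix-like JVM where `Process.destroy()` requests normal termination, which is typically delivered as SIGTERM.

```scala
import java.util.concurrent.TimeUnit

object ForwardTerm {
  // Starts `command` as a child and passes JVM shutdown (e.g. SIGTERM) on to it.
  def supervise(command: Seq[String], graceSeconds: Long): Process = {
    val child = new ProcessBuilder(command: _*).inheritIO().start()
    Runtime.getRuntime.addShutdownHook(new Thread {
      override def run(): Unit = {
        // Ask the child to terminate normally first.
        child.destroy()
        // If it is still alive after the grace period, kill it forcibly.
        if (!child.waitFor(graceSeconds, TimeUnit.SECONDS)) {
          child.destroyForcibly()
          ()
        }
      }
    })
    child
  }

  def main(args: Array[String]): Unit = {
    // "sleep 3600" is a placeholder for the real function process.
    val child = supervise(Seq("sleep", "3600"), graceSeconds = 10)
    sys.exit(child.waitFor())
  }
}
```

The two-step shape (polite request, wait, then force) mirrors the SIGTERM, grace period, SIGKILL sequence described in the list above.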
Interesting -- I am running on docker swarm in all my tests so far. Is that supposed to behave the same way? I guess I'm struggling to find a way to verify that my shutdown hook is run since nothing appears in docker service logs after node is taken down. Any ideas?
Probably the simplest way to verify it is to send a message somewhere, e.g. to RequestBin. If you are certain that your shutdown hook runs on SIGTERM and you don't get a message, then that means you didn't get a SIGTERM.
Right, so I'm not getting any shutdown hook. This is my scala code:
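import scala.io.Source
import scala.util.Random
// Ping RequestBin from a JVM shutdown hook so an external log shows whether it ran.
Runtime.getRuntime().addShutdownHook(new Thread() {
  override def run(): Unit = Source.fromURL(s"http://requestbin.fullcontact.com/wrh63owr/SHUTDOWN_HOOK_SUCK_IT_LAMBDA_${Random.nextInt}").toList
})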
If I run this line e.g. in the REPL, it works:
Source.fromURL(s"http://requestbin.fullcontact.com/wrh63owr/SHUTDOWN_HOOK_SUCK_IT_LAMBDA_${Random.nextInt}").toList
I'm starting to believe the problem occurs because my web framework is not reacting to SIGTERM. Will investigate now and return.
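One way to narrow this down is to take the web framework and the network out of the picture. The sketch below is a bare JVM program whose shutdown hook writes to stderr and appends a line to a local marker file. `HookProbe`, `record` and the marker path are hypothetical names chosen for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths, StandardOpenOption}

object HookProbe {
  // Appends one line to `marker`; kept small so it finishes quickly during shutdown.
  def record(marker: Path, message: String): Unit = {
    Files.write(
      marker,
      (message + "\n").getBytes(StandardCharsets.UTF_8),
      StandardOpenOption.CREATE,
      StandardOpenOption.APPEND
    )
    ()
  }

  def main(args: Array[String]): Unit = {
    val marker = Paths.get("/tmp/shutdown-hook.marker") // hypothetical path
    Runtime.getRuntime.addShutdownHook(new Thread {
      override def run(): Unit = {
        System.err.println("shutdown hook running")
        record(marker, "shutdown hook ran")
      }
    })
    // Stand-in for the running server: block until the process is signalled.
    Thread.sleep(Long.MaxValue)
  }
}
```

Running this and sending `kill -TERM <pid>` should leave the marker line behind, while `kill -KILL <pid>` should not, because shutdown hooks do not run on SIGKILL. If the bare program behaves that way inside the container but the http4s service does not, the framework or the way the process is launched is the more likely culprit.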