OvenMediaEngine
OvenMediaEngine copied to clipboard
OME docker container doesn't respond to SIGTERM
Describe the bug
SIGTERM is the standard way for a graceful program termination. The official OME container doesn't seem to respond to it.
It does seem to correctly shut down on SIGABRT (well, at least there is a log entry, so I guess it must be graceful shutdown and not a default handler) but docker and k8s both use SIGTERM as the default shutdown signal.
To Reproduce
$ docker run --name ome -d docker.io/airensoft/ovenmediaengine:0.16.4
2c39a0d3f324ca1f124c5ef4997b96fd21daa821d58179f8b43f773abb1aa733
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ docker kill --signal SIGTERM ome
ome
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS
NAMES
2c39a0d3f324 airensoft/ovenmediaengine:0.16.4 "/opt/ovenmediaengin…" 59 seconds ago Up 59 seconds 80/tcp, 1935/tcp, 3333-3334/tcp, 8080/tcp, 8090/tcp, 4000-4005/udp, 9000/tcp, 10000-10010/udp ome
$ time docker stop ome
ome
real 0m10.306s
user 0m0.017s
sys 0m0.028s
As you can see, docker kill --signal SIGTERM ome
did nothing, and docker stop ome
hanged until 10s timeout.
Expected behavior
Container performs graceful shutdown on SIGTERM.
Server:
- OvenMediaEngine Version:
docker.io/airensoft/ovenmediaengine:0.16.4
Additional context
I don't have a server with non-docker OME available, so I can't check if this is a docker-specific issue. If it is, then my guess is that it may be related to linux disabling default signal handlers for PID 1.
Also, as a sanity check, here is a similar test with nginx:
$ docker run --name nginx -d nginx
4e4856903f1ad2cb964a0c55c2f1c3d8b3b5f38381a084488ade5512befcd2c6
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ docker kill --signal SIGTERM nginx
nginx
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ docker ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9cef57b37d5f nginx "/docker-entrypoint.…" 12 seconds ago Exited (0) 6 seconds ago nginx
$ docker rm nginx
nginx
$ docker run --name nginx -d nginx
c11b540f96b1bcaeca21414b75bb4b30fc1af011e3d713ff76642ec37f940543
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ time docker stop nginx
nginx
real 0m0.481s
user 0m0.020s
sys 0m0.020s
For those who use docker-compose, put this into your docker-compose.yml as a workaround:
stop_signal: KILL
(for an example, see https://github.com/basisbit/KitchenCDN/blob/main/docker-compose.yml )
Is this why a docker-compose down takes so long with OME? I figured it was cleaning up, but maybe it's timing out and is forcibly killed?
@naanlizard yes, it is. By default Docker will wait 10 seconds and then kill the service.
This was something we overlooked. I've updated the Dockerfile
to send SIGKILL
instead of SIGTERM
.
The Docker image built next will stop immediately. 👍
https://github.com/AirenSoft/OvenMediaEngine/commit/f53d4bf9925dffc3c7a186736d09023770d10a2d
This was something we overlooked. I've updated the
Dockerfile
to sendSIGKILL
instead ofSIGTERM
. The Docker image built next will stop immediately. 👍
SIGKILL doesn't allow for any graceful shutdown, though. I imagine it can be very bad for stream recording, for example.
Does OME support graceful shutdown? If yes, then I think a better solution would be to properly support SIGTERM.
I am also rather concerned about recordings and would like to know more
@d-uzlov
Yes, you're correct. As you mentioned, it's preferable to support SIGTERM
over SIGKILL
. However, I'm currently unable to support SIGTERM
immediately, so I've opted to handle it with SIGKILL
for now.
To elaborate further, during the initial development of the OME, I had implemented graceful shutdown upon receiving the SIGINT
for testing purposes. However, due to some issues, I'm currently unable to support SIGTERM
. As I don't have the time to address the issues related to graceful shutdown and add support for SIGTERM
immediately, I've decided to handle it with SIGKILL
for the time being.
If I have more time to focus on OME, I'll support SIGTERM
:)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Sorry to hop in a random thread here, but will there be a release soon? We are waiting to deploy that crash fix and it sounds like there are a few other fixes (like this one) that would be nice