OvenMediaEngine icon indicating copy to clipboard operation
OvenMediaEngine copied to clipboard

OME docker container doesn't respond to SIGTERM

Open d-uzlov opened this issue 1 year ago • 18 comments

Describe the bug

SIGTERM is the standard way for a graceful program termination. The official OME container doesn't seem to respond to it.

It does seem to correctly shut down on SIGABRT (well, at least there is a log entry, so I guess it must be graceful shutdown and not a default handler) but docker and k8s both use SIGTERM as the default shutdown signal.

To Reproduce

$ docker run --name ome -d docker.io/airensoft/ovenmediaengine:0.16.4
2c39a0d3f324ca1f124c5ef4997b96fd21daa821d58179f8b43f773abb1aa733
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ docker kill --signal SIGTERM ome
ome
$ docker ps
CONTAINER ID   IMAGE                              COMMAND                  CREATED          STATUS          PORTS
                              NAMES
2c39a0d3f324   airensoft/ovenmediaengine:0.16.4   "/opt/ovenmediaengin…"   59 seconds ago   Up 59 seconds   80/tcp, 1935/tcp, 3333-3334/tcp, 8080/tcp, 8090/tcp, 4000-4005/udp, 9000/tcp, 10000-10010/udp   ome
$ time docker stop ome
ome

real    0m10.306s
user    0m0.017s
sys     0m0.028s

As you can see, docker kill --signal SIGTERM ome did nothing, and docker stop ome hanged until 10s timeout.

Expected behavior

Container performs graceful shutdown on SIGTERM.

Server:

  • OvenMediaEngine Version: docker.io/airensoft/ovenmediaengine:0.16.4

Additional context

I don't have a server with non-docker OME available, so I can't check if this is a docker-specific issue. If it is, then my guess is that it may be related to linux disabling default signal handlers for PID 1.

Also, as a sanity check, here is a similar test with nginx:

$ docker run --name nginx -d nginx
4e4856903f1ad2cb964a0c55c2f1c3d8b3b5f38381a084488ade5512befcd2c6
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ docker kill --signal SIGTERM nginx
nginx
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ docker ps --all
CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS                     PORTS     NAMES
9cef57b37d5f   nginx     "/docker-entrypoint.…"   12 seconds ago   Exited (0) 6 seconds ago             nginx
$ docker rm nginx
nginx
$ docker run --name nginx -d nginx
c11b540f96b1bcaeca21414b75bb4b30fc1af011e3d713ff76642ec37f940543
danil@danil-main:/mnt/c/Users/danil/Documents/k8s-public-copy
$ time docker stop nginx
nginx

real    0m0.481s
user    0m0.020s
sys     0m0.020s

d-uzlov avatar Feb 20 '24 05:02 d-uzlov

For those who use docker-compose, put this into your docker-compose.yml as a workaround: stop_signal: KILL (for an example, see https://github.com/basisbit/KitchenCDN/blob/main/docker-compose.yml )

basisbit avatar Feb 20 '24 16:02 basisbit

Is this why a docker-compose down takes so long with OME? I figured it was cleaning up, but maybe it's timing out and is forcibly killed?

naanlizard avatar Feb 20 '24 17:02 naanlizard

@naanlizard yes, it is. By default Docker will wait 10 seconds and then kill the service.

basisbit avatar Feb 20 '24 20:02 basisbit

This was something we overlooked. I've updated the Dockerfile to send SIGKILL instead of SIGTERM. The Docker image built next will stop immediately. 👍

https://github.com/AirenSoft/OvenMediaEngine/commit/f53d4bf9925dffc3c7a186736d09023770d10a2d

dimiden avatar Feb 21 '24 04:02 dimiden

This was something we overlooked. I've updated the Dockerfile to send SIGKILL instead of SIGTERM. The Docker image built next will stop immediately. 👍

SIGKILL doesn't allow for any graceful shutdown, though. I imagine it can be very bad for stream recording, for example.

Does OME support graceful shutdown? If yes, then I think a better solution would be to properly support SIGTERM.

d-uzlov avatar Feb 21 '24 10:02 d-uzlov

I am also rather concerned about recordings and would like to know more

naanlizard avatar Feb 21 '24 12:02 naanlizard

@d-uzlov Yes, you're correct. As you mentioned, it's preferable to support SIGTERM over SIGKILL. However, I'm currently unable to support SIGTERM immediately, so I've opted to handle it with SIGKILL for now.

To elaborate further, during the initial development of the OME, I had implemented graceful shutdown upon receiving the SIGINT for testing purposes. However, due to some issues, I'm currently unable to support SIGTERM. As I don't have the time to address the issues related to graceful shutdown and add support for SIGTERM immediately, I've decided to handle it with SIGKILL for the time being.

If I have more time to focus on OME, I'll support SIGTERM :)

dimiden avatar Feb 21 '24 13:02 dimiden

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 22 '24 01:04 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 21 '24 18:06 stale[bot]

Sorry to hop in a random thread here, but will there be a release soon? We are waiting to deploy that crash fix and it sounds like there are a few other fixes (like this one) that would be nice

naanlizard avatar Jun 28 '24 11:06 naanlizard