tilt-extensions `restart_process` restart-helper leaks zombie processes

root@api-5dd8c8769-nwdv7:/# ps aux wwwf
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          49  0.0  0.0   4188  3332 pts/0    Ss   14:30   0:00 bash
root          80  0.0  0.0   8088  3940 pts/0    R+   14:31   0:00  \_ ps aux wwwf
root           1  0.0  0.0 703588  1664 ?        Ssl  05:37   0:00 /tilt-restart-wrapper --watch_file=/tmp/.restart-proc sh -c /api-server
root          11  0.0  0.0   1220   104 ?        S    05:37   0:00 /entr -rz sh -c /api-server
root          70  0.0  0.0   2576   860 ?        S    14:30   0:00  \_ sh -c /api-server
root          71  0.1  0.4 1287044 72528 ?       Sl   14:30   0:00      \_ /api-server
root          13  0.0  0.0      0     0 ?        Z    05:37   0:02 [api-server] <defunct>
root          38  0.0  0.0      0     0 ?        Z    14:29   0:00 [api-server] <defunct>

Seems like the wrapper might need a minimal init (tini, dumb-init, etc.) in front of it to reap these.

I noticed this because entr also doesn't reliably kill the process before starting a new one... perhaps this is because of sh, I'm not sure. That's a different issue.

Apr 12 '24 14:04 glennpratt

hmmm...i was not able to reproduce this problem. here are the steps i tried:

1. cd'd into tilt-extensions/restart_process/test
2. Ran tilt up -f custom_deploy.Tiltfile
3. Clicked test_update a bunch of times
4. Exec'd into the pod and ran ps aux wwwf
5. Saw that there was exactly one start.sh, as i expected.

when i poked around in the entr repo, i found this issue - https://github.com/eradman/entr/pull/38 - which seems to indicate to me that it's at least expected for entr to kill the process

Apr 16 '24 19:04 nicks

I'm running into a sort of similar situation where a process can be left around hanging as a zombie process.

I tried to create a reproducible test case and came up with something that creates a zombie process. This is not supposed to be a useful or optimized use case for anything, just commands put together to create a zombie process.

diff --git a/restart_process/test/Dockerfile.test b/restart_process/test/Dockerfile.test
index cfbeb9f..89e11d3 100644
--- a/restart_process/test/Dockerfile.test
+++ b/restart_process/test/Dockerfile.test
@@ -1,5 +1,7 @@
 FROM alpine
 
+RUN apk add nginx
+
 RUN echo 0 > restart_count.txt
 
 ADD start.sh /
diff --git a/restart_process/test/start.sh b/restart_process/test/start.sh
index c9bc981..6277158 100755
--- a/restart_process/test/start.sh
+++ b/restart_process/test/start.sh
@@ -12,8 +12,13 @@ handle_sigterm() {
 
 trap handle_sigterm SIGTERM
 
-while true
-do
-  echo running
-  sleep 5
+while pgrep -f nginx >/dev/null ; do
+  echo "waiting for nginx to shut down"
+  set +e
+  pkill -f nginx
+  set -e
+  sleep 3
 done
+nginx
+
+tail -F /var/log/nginx/*

Run tilt up -f custom_deploy.Tiltfile
The restart should happen right away on the first start.
The new run of the entrypoint script is then left waiting for the previous nginx instance to shut down. That instance never goes away: it stays in the process table as a zombie because tilt-restart-wrapper, running as PID 1, does not call waitpid on it.
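
The mechanism can be reproduced without Tilt or nginx. The sketch below is a generic illustration and not code from restart_process: `demo_zombie` is a hypothetical helper, it assumes a Linux /proc filesystem, and it uses an ordinary parent that never waits rather than PID 1 itself. An exited child whose parent never calls wait/waitpid stays in the process table as a zombie, and orphans adopted by PID 1 behave the same way when PID 1 never reaps them.

```shell
#!/bin/sh
# Demonstration only (hypothetical helper): create a short-lived zombie and
# print its one-letter kernel state as read from /proc.
demo_zombie() {
  pidfile=$(mktemp)
  # The inner shell starts `sleep 1` in the background, records that child's
  # PID in the file passed as $0, then replaces itself with `sleep 4`.
  # `sleep 4` never calls wait(), so once `sleep 1` exits it remains a zombie
  # for as long as its parent is alive.
  sh -c 'sleep 1 & echo $! > "$0"; exec sleep 4' "$pidfile" >/dev/null 2>&1 &
  parent=$!
  sleep 2
  child=$(cat "$pidfile")
  # The second field of the State line is the state letter; Z means zombie.
  awk '/^State:/ { print $2 }' "/proc/$child/status"
  # Clean up: stop the non-reaping parent and remove the temp file.
  kill "$parent" 2>/dev/null
  wait "$parent" 2>/dev/null || true
  rm -f "$pidfile"
}
```

Calling `demo_zombie` takes about two seconds. A printed state of `Z` means the exited child was still waiting to be reaped by its parent.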

Going into the container and running ps faux:

ps faux
PID   USER     TIME  COMMAND
    1 root      0:00 /tilt-restart-wrapper --watch_file=/tmp/.restart-proc sh -c /start.sh
   17 root      0:00 /entr -rz sh -c /start.sh
   25 root      0:00 [nginx]
   62 root      0:00 {start.sh} /bin/sh /start.sh
   78 root      0:00 sleep 3
   79 root      0:00 sh
   85 root      0:00 ps faux

PID 25 (the bracketed [nginx] entry) is still there as a zombie.
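
The BusyBox-style ps output above has no STAT column, so the zombie is only recognizable by its bracketed name. As a rough sketch for confirming it, the state can be read from /proc directly. `list_zombies` is a hypothetical helper name, and the optional argument (an alternative proc root) exists only so the function can be tried against sample files; it assumes the Linux /proc/<pid>/status layout.

```shell
#!/bin/sh
# Hypothetical helper: print "PID PPID NAME" for every process whose kernel
# state is Z (zombie), based on /proc/<pid>/status.
list_zombies() {
  proc_root="${1:-/proc}"
  for status in "$proc_root"/[0-9]*/status; do
    # Skip unmatched globs and processes that exited in the meantime.
    [ -r "$status" ] || continue
    awk '
      /^Name:/  { name = $2 }
      /^State:/ { state = $2 }
      /^Pid:/   { pid = $2 }
      /^PPid:/  { ppid = $2 }
      END { if (state == "Z") print pid, ppid, name }
    ' "$status" 2>/dev/null
  done
}
```

Run inside the container as `list_zombies`. A zombie whose PPID is 1 is one that PID 1 has adopted but not yet reaped.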

I assume the issue is that tilt-restart-wrapper does not reap zombies, which is one of the jobs of whatever process runs as PID 1.

Jul 04 '24 10:07 ghost

Thanks for the reproduction @samuliy. Until this is fixed, there is a simple workaround if this doesn't cause issues with your workload or sidecars:

      {{- if .Values.tilt }}
      # Fix zombie processes not stopping under Tilt
      shareProcessNamespace: true
      {{- end }}
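
A likely reason this helps: the Kubernetes documentation describes that with shareProcessNamespace enabled, container processes no longer run as PID 1. The pod's pause process takes that role and reaps orphaned zombies. To check that the setting took effect, a small sketch like the following can be run inside the container. `pid1_cmdline` is a hypothetical helper name, and the optional proc-root argument is only there for trying it against sample files.

```shell
#!/bin/sh
# Hypothetical helper: print the command line of PID 1 as seen from inside
# the container. /proc/1/cmdline separates arguments with NUL bytes.
pid1_cmdline() {
  proc_root="${1:-/proc}"
  # Turn NUL separators into spaces and drop the trailing one.
  tr '\0' ' ' < "$proc_root/1/cmdline" | sed 's/ *$//'
}
```

If `pid1_cmdline` still prints the /tilt-restart-wrapper command line, the process namespace is not shared and the wrapper is still PID 1.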

Oct 01 '24 21:10 glennpratt

@nicks can you remove the "needs repro case" label?

Oct 01 '24 21:10 glennpratt
