
Zombie subprocesses

Open asantos00 opened this issue 5 years ago • 10 comments

Currently, when creating a subprocess using Deno.run, the process stays in "zombie mode" even after it is killed.

I've created a gist that reproduces this bug and talked with @lucacasonato, who confirmed it.

https://gist.github.com/asantos00/bdd4357cb91f18b368085eb44105ea97

asantos00 avatar Aug 17 '20 17:08 asantos00

Example from the gist

const command = ['deno', 'run', './script.ts'];

const p1 = Deno.run({ cmd: command })
const p2 = Deno.run({ cmd: command })
const p3 = Deno.run({ cmd: command })
const p4 = Deno.run({ cmd: command })

p1.kill(Deno.Signal.SIGKILL)
p2.kill(Deno.Signal.SIGKILL)
p3.kill(Deno.Signal.SIGKILL)
p4.kill(Deno.Signal.SIGKILL)

setInterval(() => {
  // just to keep it running so you can `ps aux | grep deno` to check the zombie processes
}, 1000);

caspervonb avatar Aug 17 '20 22:08 caspervonb

For background: UNIX expects you to reap zombies with waitpid(2) when you receive a SIGCHLD signal.

This would normally be an easy fix - set the SIGCHLD signal handler to SIG_IGN and let the kernel handle it - except....

tokio::process::Child installs its own SIGCHLD handler that records events into a queue. There is nothing polling the command objects and so events stay in the queue and zombies aren't reaped.

A crude hack that papers over the issue:

diff --git a/cli/ops/process.rs b/cli/ops/process.rs
index 60a6d5095..b259d9531 100644
--- a/cli/ops/process.rs
+++ b/cli/ops/process.rs
@@ -114,6 +114,13 @@ fn op_run(
 
   // Spawn the command.
   let mut child = c.spawn()?;
+
+  #[cfg(unix)]
+  unsafe { libc::signal(libc::SIGCHLD, libc::SIG_IGN); } // Reset signal handler.
+
   let pid = child.id();
 
   let stdin_rid = match child.stdin.take() {

It's not a good fix, however, because of the race window between the spawn and signal calls.

So far I haven't been able to come up with a better idea than polling the child objects in the resource table at a fixed interval.
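
Since the explanation above is that nothing waits on the child, an application-level stopgap is to wait on every child explicitly after killing it. The sketch below is a hedged illustration, not a runtime fix: it assumes that awaiting status() on the object returned by Deno.run makes the runtime wait on the child (and so collect its exit status). RunProcessLike and killAndReap are hypothetical names introduced here; the interface is a structural stand-in for the real Deno.Process so the helper stays self-contained.

```typescript
// Hypothetical structural stand-in for the object returned by Deno.run.
// Assumption: only these three members are needed for the workaround.
interface RunProcessLike {
  kill(signo: string | number): void;
  status(): Promise<{ success: boolean; code: number; signal?: number }>;
  close(): void;
}

// Kill a child, then await its status so something actually waits on it.
async function killAndReap(
  p: RunProcessLike,
  signo: string | number = "SIGKILL",
): Promise<{ success: boolean; code: number; signal?: number }> {
  try {
    p.kill(signo);
  } catch {
    // The child may already have exited; still wait on it below.
  }
  try {
    return await p.status();
  } finally {
    // Release the resource held for this process, whether or not status() succeeded.
    p.close();
  }
}
```

With the gist's example this would replace each bare kill call, for instance await killAndReap(p1, Deno.Signal.SIGKILL). Whether that is enough on a given Deno version should be checked with ps aux | grep deno as in the gist.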

bnoordhuis avatar Nov 30 '20 22:11 bnoordhuis

I can reproduce this in the latest canary. Linux reports all the child processes as zombies.

Soremwar avatar Dec 16 '20 18:12 Soremwar

This is definitely still an issue in the latest Deno release, 1.7.0, and it's a big problem for apps that constantly spawn a lot of child processes. We ended up with non-responsive servers after a huge number of zombie processes were leaked.

Is there any temporary workaround we can apply at the application layer until this is fixed in the Deno runtime, other than letting the Deno parent process die whenever a child process is killed (which is of course not ideal for long-running servers)?

nktpro avatar Jan 21 '21 19:01 nktpro

This issue still happens on v1.7.1

muzuiget avatar Feb 01 '21 11:02 muzuiget

This still seems to be an issue on 1.9.2.

satyarohith avatar Apr 28 '21 12:04 satyarohith

This happens when I run Firefox as a subprocess on Windows.

ebebbington avatar Jul 16 '21 11:07 ebebbington

Any hope on getting this addressed?

ebebbington avatar Dec 23 '21 00:12 ebebbington

As an alternative, this issue does not appear to happen when using the Deno.Command API.
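
A rough sketch of the same kill-then-wait pattern shaped for the Deno.Command API. It assumes the child returned by spawn() exposes kill() and a status promise; ChildLike and killAndWaitAll are hypothetical names, and the interface is only a structural stand-in for the real child process type.

```typescript
// Hypothetical structural stand-in for a child spawned via Deno.Command.
// Assumption: the real object exposes kill() and a `status` promise.
interface ChildLike {
  kill(signo?: string): void;
  readonly status: Promise<{ success: boolean; code: number; signal: string | null }>;
}

// Signal every child first, then wait for all of them to report a status.
async function killAndWaitAll(
  children: ChildLike[],
  signo: string = "SIGKILL",
): Promise<{ success: boolean; code: number; signal: string | null }[]> {
  for (const child of children) {
    try {
      child.kill(signo);
    } catch {
      // The child may already have exited; its status is still awaited below.
    }
  }
  return Promise.all(children.map((child) => child.status));
}
```

For the gist's scenario, the children would come from something like new Deno.Command("deno", { args: ["run", "./script.ts"] }).spawn(), called four times and passed to killAndWaitAll as an array.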

GJZwiers avatar Mar 24 '23 19:03 GJZwiers

We should migrate to the Deno.Command API because Deno.run will be soft-removed in Deno 2.0. https://deno.land/[email protected]?s=Deno.run

Hajime-san avatar May 27 '24 02:05 Hajime-san

I'll close this issue because Deno.run is deprecated and will be removed in 2.0.

Upgrade to the Deno.Command API.

littledivy avatar Jun 13 '24 11:06 littledivy