otp erts: kill spawned child processes on VM exit (unix only)

This unix-only change ensures that all external child processes spawned with open_port are terminated along with their controlling process.

Began with discussion in https://erlangforums.com/t/open-port-and-zombie-processes .

In-depth notes are posted as a blog post.

Feb 17 '25 23:02 adamwight

All committers have signed the CLA.

Feb 17 '25 23:02 CLAassistant

CT Test Results

3 files 141 suites 50m 22s ⏱️ 1 613 tests 1 563 ✅ 50 💤 0 ❌ 2 331 runs 2 261 ✅ 70 💤 0 ❌

Results for commit 6f9f611d.

:recycle: This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

Feb 17 '25 23:02 github-actions[bot]

Hello!

I think that we can move forward with this. There is no need to have an option to disable it for now (unless our existing tests shows that it is needed...), but there needs to be testcases to test that it works as expected on both Unix and Windows.

I do wonder however if we should send some other signal than KILL? Should we allow the child to be able to catch it and deal with it if they want to?

Feb 20 '25 08:02 garazdawi

I do wonder however if we should send some other signal than KILL? Should we allow the child to be able to catch it and deal with it if they want to?

Good point, TIL that sigkill is untrappable. Looking at erlexec for precedents, its default behavior is to send a sigterm to the direct child process, wait a configurable 5 seconds, and then send sigkill.

I experimented a bit locally to see if sh would react to a sigterm by stopping its own children, and it does not—so in order for spawn commands to benefit as well as spawn_executable, I would stick with the choice to send a signal to the entire child process group, but send TERM to be more polite. I like that this also offers the descendant processes a second and more straightforward workaround to prevent the termination if needed.

Feb 20 '25 08:02 adamwight

Now includes a test for normal erl shutdown (halt() self), and I'll try to also write one for abnormal shutdown (receiving SIGKILL). There's a small amount of race condition remaining in the test which I'd like to clean up—perhaps by direct communication from the grandchild process (new test utility file_when_alive) back to the test executor, I'm open to suggestions about how to do that.

There are a few other flapping tests, I don't think this is related to my patch but can't say for sure...

Running the tests on Windows is not going well for me, and it seems there's no CI for that yet? In theory my patch and the test will also run on win32 but I'd like to see that happen.

Feb 23 '25 13:02 adamwight

perhaps by direct communication from the grandchild process (new test utility file_when_alive) back to the test executor, I'm open to suggestions about how to do that.

If the grandchild is an Erlang node, it could communicate via Erlang distribution? Otherwise a file seem reasonable.

Running the tests on Windows is not going well for me, and it seems there's no CI for that yet? In theory my patch and the test will also run on win32 but I'd like to see that happen.

No, there is no github CI for that yet. I have a branch that I work on from time to time to try to bring it in, but the tests are not stable enough yet. Maybe you can temporarily use it as a base for your changes and you should atleast be able to see if your tests pass or fail?

Feb 24 '25 08:02 garazdawi

A couple of other things that popped into my mind:

What do we do when someone does port_close/1? To me it seems reasonable that the behaviour should be the same as if the Erlang VM terminated?

like that this also offers the descendant processes a second and more straightforward workaround to prevent the termination if needed.

I know that there are users that rely on being able to spawn daemon processes through os:cmd("command &"). This is tested in os_SUITE:background_command/1, but the test will not catch what happens when the emulator dies. I'm unsure what the user want to happen there, I can see them both wanting it to die and survive... though since the current behaviour is that it survives I think we need to keep that.

Feb 24 '25 08:02 garazdawi

Maybe you can temporarily use it as a base for your changes and you should atleast be able to see if your tests pass or fail?

Great! I'm working on that now, and learned that my proposed feature needs to be reimplemented separately for the win32 spawn driver.

What do we do when someone does port_close/1? To me it seems reasonable that the behaviour should be the same as if the Erlang VM terminated?

That makes sense, the same principle applies IMHO and it feels consistent to attempt a direct termination any time Erlang will lose its connection to the child. I'll add this.

os:cmd("command &").

I see... Interestingly, the "&" in that test is only relevant for allowing os:cmd to return immediately, the shell job control seems to be unimportant. In other words, the test is equivalent to calling open_port and not waiting for the process to finish, so this syntax is more a convenience than a special use case. But it would definitely indicate an intention to start a daemon with no direct link to Erlang, +1 that we should respect this usage!

As a tourist to the BEAM, all I can do is describe the options but I don't have instincts for which is the best way to go. We could preserve this "&" usage by only killing the immediate child process, which would be the shell. This still offers some benefits, since the application developer may be able to call open_port with spawn_executable, making their process the immediate child and causing it to be cleaned up without needing a wrapper script. It also simplifies reasoning about the "process group", killing exactly one child process is much more predictble.

Feb 26 '25 16:02 adamwight

To some extent, I think there's a bigger problem here where we could have a whole new API for managing external processes - the current port API is neither powerful nor ergonomic. This change to kill things proactively is definitely a good one, but if we're looking more into that, I think things could be improved dramatically.

As @garazdawi mentioned, today it's not even possible to easily kill the process once you spawn it, and port_close will just close the stdin.

Feb 26 '25 20:02 michalmuskala

I found that shell "&" assigns the background job to a new process group, which IMHO means that killing children by process group is back on the table. For now however, my patch is rewritten to kill only the direct child process.

The latest branch also kills a port's child during port_close.

Splitting this responsibility between the main VM process and the forker is causing a memory leak (and a leaky abstraction), and I'm imagining this can be resolved by sending another protocol message to the forker to allow it to perform cleanup such as killing the process, then freeing memory used to track the child. Introducing this new message has some small overhead but I don't see any obvious, existing means for the forker to detect that the port was closed by beam.

we could have a whole new API for managing external processes

+1 that direct OS process management could be a nice addition to the core libraries, but the current iteration can be done without larger changes to the opaque port concept.

Feb 26 '25 23:02 adamwight

Splitting this responsibility between the main VM process and the forker is causing a memory leak (and a leaky abstraction), and I'm imagining this can be resolved by sending another protocol message to the forker to allow it to perform cleanup such as killing the process, then freeing memory used to track the child. Introducing this new message has some small overhead but I don't see any obvious, existing means for the forker to detect that the port was closed by beam.

Yes, this seems like a good approach.

Feb 27 '25 11:02 garazdawi

What do we do when someone does port_close/1? To me it seems reasonable that the behaviour should be the same as if the Erlang VM terminated?

Although I think this is a good idea, when I made the change it turned out to be too aggressive and causes a lot of existing tests to fail (example job output). Looking at a simplified outline for one test, -suite port_SUITE -case output_only:

Port = open_port({spawn, "port_test -h0 -o outfile"}, [out]),
Port ! {self(), {command, "echodata123"}},
% race here?
Port ! {self(), close},
receive {Port, closed} -> ok end.

test_server:sleep(500)
{ok, Written} = file:read_file("outfile"),

The test finds that outfile was never written and the last match fails with {badmatch,{error,enoent}}. My theory is that the port close command immediately kills the spawned port_test OS process before it can begin its work. Adding another 500ms delay before closing the port at the "race here" comment indeed fixes the test, adding to my suspicion that my patch has created a race condition.

I'm open to suggestions about how to proceed! One could argue that the tests have always been risky, and relied on an undocumented behavior of the port to detect stdin closure, complete its process politely before closing and return closed. To fix the tests, port_test could be modified to eg. send a "done" message back to the test runner which would be used like so:

Port ! {self(), {command, "echodata123"}},
receive {Port, {data, "done."}} -> ok end.
Port ! {self(), close},
receive {Port, closed} -> ok end.

But it feels like application developers have probably been making the same assumption, and they have felt safe sending the close command without expecting dramatic and immediate side-effects? Are we now talking about a breaking change?

Mar 06 '25 07:03 adamwight

Good catch, I did not think of that. If we want that behaviour we can later add a port_kill (or something similar, I too have been thinking of doing a new API for external processes) that would not only close, but also kill the child process. So let's leave it at only killing processes when Erlang itself exits for now.

Mar 06 '25 08:03 garazdawi

Ready to review. This branch is supposed to:

Kill all child processes when the VM exits.
Independent of whether the exit was clean or a crash, and of whether flushing was disabled.
Only works on unix for now.

In follow-up work I might try to implement for win32, and will play with kill and kill_group options to open_port.

Mar 06 '25 23:03 adamwight

I haven't read this PR in detail, but wanted to suggest sending SIGHUP instead of SIGTERM. Both terminate the process unless caught, but SIGHUP is a better match IMO for this scenario.

Mar 07 '25 13:03 mikpe

suggest sending SIGHUP instead of SIGTERM. Both terminate the process unless caught, but SIGHUP is a better match IMO for this scenario.

I could imagine this to be true, and HUP has the historical advantage of having nearly unchanged semantics as in the previous assumption: that polite apps quit when their stdin is closed. Hanging up is already nearly identical to closing pipes, see nohup.

On the other hand, many existing daemons trap HUP to gracefully reload. And apps robust against stdin closure might also ignore HUP.

If I had to personify the two signals, my feeling is that HUP would be a vague "goodbye!" while TERM is an explicit but amiable "please terminate yourself now." How to make such a decision! If you were on a desert island with only one unix signal…

Mar 07 '25 15:03 adamwight

Thanks for making this, I think the code looks fine except one comment that I added.

As this PR is a potentially breaking change, it will be part of the next major release, that is Erlang/OTP 29.

Only works on unix for now.

Before merging this we need to atleast investigate whether it is possible to do something similar on Windows or not. We want the behaviour of the different OSs to be as similar as possible, however this particular area is already full with platform specific stuff, so it is not super important.

Mar 10 '25 19:03 garazdawi

Before merging this we need to atleast investigate whether it is possible to do something similar on Windows or not.

For sure. I've installed the win32 development environment and can compile a release, but still a bit stuck on running tests. It would be possible to implement the feature using manual smoke testing on the command line in a pinch. I've also been waiting for the unix behavior to solidify so that I'm not working in windows any more than necessary—I feel comfortable going ahead with that work now.

Mar 10 '25 21:03 adamwight

My last MS programming was a DOS graphics library, so please take my findings with a grain of salt. Here's what I learned from RTFM:

TerminateProcess should be avoided, this is like _exit in unix, it cannot be blocked, no polite cleanup can be done, and global state of shared DLLs may be corrupted.

The nice way to shut down a process is to use ExitProcess, but this can only be called by the process itself. The typical, recommended pattern is that a custom process and its controlling process will both call RegisterWindowMessageA to reserve an opaque message ID based on a shared string constant or by passing the ID directly; the controlling process will send the message using BroadcastSystemMessage and then the controlled process will respond by calling ExitProcess on itself. I don't see how to apply this to our case however, because the child is a black box and there's no way to teach an arbitrary application to listen for this custom message.

Then we get into murkier territory that I don't quite understand: I think that a process can open one or more windows. I don't know whether the main process itself counts as a window even if it's "hidden" but I think it does. A process is made of up one or more threads. There are a variety of library methods to send a message to a window or a thread, and in start_erl we have a precedent for calling PostThreadMessage on the thread created by CreateProcessW, but this relies on our own custom process listening for WM_USER. For a generic process, I think we want to send a WM_CLOSE message to the main thread.

We can start by tracking spawned processes and sending WM_CLOSE from the VM during halt, and then as follow-up work the whole arrangement could be improved by perhaps creating a forker process like erl_child_setup on unix, and letting it do the cleanup if the VM crashes. Maybe the forker can be pulled up a level so that most of its logic is shared between win32 and unix.

Mar 14 '25 08:03 adamwight

Before merging this we need to atleast investigate whether it is possible to do something similar on Windows or not. We want the behaviour of the different OSs to be as similar as possible, however this particular area is already full with platform specific stuff, so it is not super important.

@garazdawi Just to be clear, I gave it a try and I found that I don't have enough relevant experience to safely implement the win32 port of this feature. My current thinking is that it's a heavier lift than expected and the most correct approach would be to rewrite erl_child_setup so that it can be used across platforms. Should we go ahead with the unix-only patch for now?

Oct 06 '25 07:10 adamwight

I’m on parental leave, so will assign this over to John to handle.

Oct 06 '25 09:10 garazdawi

My immediate reaction is to not try to shove a square peg into a round hole, so I concur with @garazdawi's point that this area is already full of platform-specific things. We'll discuss it a bit internally but I'm definitely leaning towards including just the Unix-only stuff.

Oct 06 '25 13:10 jhogberg