tcpdump
tcpdump copied to clipboard
child_cleanup: reap as many child processes as possible
Hello, one of our customer reported zombie processes when running tcpdump under load with -C/-z and I've been able to reproduce. Here's a patch suggestion, but feel free to use something else -- waitpid is also mandated by posix so there should be no portability issue (same requirement as wait) but I've only tested this on linux.
Commit message below: Under load it's possible multiple children have been killed before we start processing the SIGCHILD signal, leaving zombie processes behind everytime we miss a process. Reap as many processes as possible instead of assuming one handler call = one process like we currently did.
Can be reproduced by running the following commands in parallel:
- tcpdump -i lo -w /tmp/test -C 1 -z /usr/bin/true
- iperf3 -s
- iperf3 -c localhost
Hello,
sorry for being annoying, is there anything I could do to help getting this merged? I'm in no particular hurry, but I tend to forget the PRs I've opened if they don't get any reply in a while so prefer striking when the iron is hot :)
Thanks!
Hello, It's been a bit over a month so it's time for my next being-the-annoying-guy here.
Once again please tell me if there's anything I can do to help get this merged: this is a real bug that affected one of our customers.
It might not be a huge bug, but leaving zombies around consumes memory and there's no harm in reaping these properly.
Thank you.
@fxlb @infrastation sorry for the direct mention, would either of you be able to take a look eventually?
It's not a huge deal but I don't like keeping PRs in limbos, if there's anything I can do I'll be happy to update anything.
Thanks!
I don't think one test on Linux is enough. There should be tests at least on macOS, some *BSD, Windows.
Is using /usr/bin/true
relevant is a test?
Thanks!
I can setup some BSD to test in the next few days if that helps, but I don't have any macOS or windows around.
The condition to produce zombies is to have multiple children die fast enough so that multiple children deaths happen by the time tcpdump processes the sigchld (new signals aren't sent until we're done processing one); so it doesn't have to be true
at all, what matters is producing lots of children quickly. Our user reproduced it with a normal log rotation script.
As far as test goes though, just confirming the signal handler reaps children as normal (doesn't hang, doesn't leave zombies behind for normal usage with single child death per signal) is probably enough -- there's no reason the loop wouldn't work if that does. It looks like to me that some of the CIs here aren't just builds but also run tests -- I assume there's some rotation test in there so it probably at least doesn't hang on tested platforms... Hard to tell for zombies though.
"Child process" is the proper term here.
Sorry for the poor naming - I didn't realize. I've renamed the subject, and amended commit message.
I had a freeBSD VM laying around, and was able to easily confirm both the problem and solution:
from compiled master branch:
# ./tcpdump -i lo0 -w /tmp/trace -C 1 -z /usr/bin/true
# iperf3 -s
# iperf3 -c localhost
# ps -edfww
3322 u0 S+ 0:06.77 `-- USER=root LOGNAME=root HOME=/root SHELL=/bin/csh BLOCKSIZE=K MAIL=/var/mail/root MM_CHARSET=UTF-8 LANG=C.UTF-8 PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/root/bin TERM=vt100 HOSTTYPE=FreeBSD VENDOR=amd OSTYPE=FreeBSD MACHTYPE=x86_64 SHLVL=1 PWD=/root/tcpdump GROUP=wheel HOST=freebsd EDITOR=vi PAGER=less ./tcpdump -i lo0 -w /tmp/trace -C 1 -z /usr/bin/true
3567 u0 Z<+ 0:00.01 `-- <defunct>
After updating to refs/pull/972/merge
, I couldn't reproduce anymore. I've confirmed pid growth so tcpdump forked and reaped its child processes successfully.
Hello,
it's been another couple of weeks, so I'm going to be annoying again.
I still don't have a windows, but the comment above the child_cleanup
function explicitly states that On windows, we do not use a fork, so we do not care less about waiting a child processes to die
with ifdefs that I read as the function isn't compiled on windows, so I don't think this particularly requires testing there.
It works on linux and freeBSD, both of which exhibit the problem without the patch and no longer do after patching.
Is there anything else you're waiting for?
Does anyone see a problem with this change?
Hello,
I had forgotten about this, there doesn't look like anyone is opposing so would it be possible to merge?
Thank you.
. . . ping?
I'm glad someone mentioned this issue. When I used - z to call the shell script I wrote when I generated more than 250MB per second (50MB without files), I encountered this problem. Even if I used kill - 9, I could not kill the zombie process. 300+zombie process have been accumulated in 4 hours. I used to think there was a problem with my script.
zombie processes are already "dead", you cannot kill them again; tcpdump needs to tell the kernel they can be cleaned up, as this patch does by calling waitpid as many times as necessary. There really isn't any workaround to this, playing games to make sure only one child process terminates at a time before the signal handler has been called is just going to slow everything down... It's much simpler to rebuild with this patch.
That should work - it won't hang, due to the WNOHANG
, and the loop will exit if there are no zombies hanging around to have their status picked up, as waitpid()
would return -1 with ECHLD
in that case.
This looks correct to me: multiple child processes exiting can result in only a single SIGCHLD, so looping with waitpid() to reap them all is correct.
After the message on the mailing list on 13/04/2023, there is no argument against this PR. Let's wait a few more days. I propose to merge it next week.
Thank you!
Thanks!