ct
gcov cannot write coverage data from child processes
At the end of a successful test, the test process itself ends normally, but ct kills any remaining child of the test process with SIGKILL. This is a problem if that child is executing code under test, because it means gcov has no chance to write coverage data for that process.
In #12 there's some discussion of how gcov writes coverage data.
@kr, we can close it down since it was fixed in beanstalkd.
As a slightly related improvement, can I suggest the following patch? https://github.com/kr/ct/pull/19
This still seems like a gotcha for other projects using ct, so I'd like to keep it open.
I think the ideal behavior here from ct would be:
- if the test process finishes the test successfully, it kills all descendants with SIGTERM
- it waits for them to complete
- if they don't all complete within some time limit, ct then kills the whole process group with SIGKILL (as it already does)
Some things that make this difficult:
- We can't do most of this in the test driver process, since we're concerned with children of the test process. The test process itself must be the one to wait.
- While ct does have the chance to execute logic in the test process after the test itself has returned successfully, it doesn't know what the children are or even how many there are.
- After sending SIGTERM, the test process isn't doing anything but waiting for its children to exit. Arguably ct could begin another test during this time, but it would take some doing to make that happen, since the driver process currently uses termination of the test process to indicate that the test is done and therefore it's okay to start another.
My rough plan for how to do this happens mostly in start, at the end of the child branch. Here's what it does currently:
... // setup test process state
t->f(); // run the test
if (fail) {
    ctfailnow(); // send SIGABRT to self, indicating failure to driver
}
exit(0); // indicate success to driver process
I'd like to add some logic immediately before that last line.
- Set a timer to call exit(0) after a short timeout. (Maybe 500ms?) This will ensure that the process exits successfully eventually, no matter how long its children take.
- Ignore SIGTERM in the current process (the test process). This is okay because this process has nothing left to do but signal its children and wait for them, and we've already ensured it'll exit soon.
- Signal the whole current process group with SIGTERM. This will signal all descendants without having to enumerate them. It'll also signal the current process, but we just ignored SIGTERM so that's okay.
- Call wait in a loop until there are no children.
- Finally, exit(0) as before.
If the children exit promptly, these steps will complete in order. If any child takes too long, the timer will fire and the test process will just exit. Either way, the driver process will kill the process group with SIGKILL as it already does, cleaning up any stragglers. (Note that we need this final SIGKILL even if all the immediate children exited promptly, because there could be deeper descendants still running in the process group.)