ginkgo icon indicating copy to clipboard operation
ginkgo copied to clipboard

Propagate signals to child processes

Open max-melentyev opened this issue 6 months ago • 4 comments

Closes #720

Hey, this fixes the behavior when main ginkgo process receives SIGINT/SIGTERM. Without this change the signal is only handled by the main process and handler remains in InterruptLevelCleanupAndReport state. This can be tested by interrupting a process with kill rather than with ctrl+c (bash and other shells send signal to a foreground process group and this is why it doesn't require signal propagation).

max-melentyev avatar Jun 23 '25 21:06 max-melentyev

hey sorry for the delay. this seems to break the behavior for ctrl+c - I now see a bunch of

Failed to signal child processes: no such process

when I ctrl+c Ginkgo's own integration suite.

onsi avatar Aug 22 '25 03:08 onsi

Hm. It appears to be trickier than it seemed to be. This probably happens when you use go run ./ginkgo -p ./..., in this case PID of the parent go process is used as PGID. Though it's not a big deal to find the correct PGID. The main issue is that shell's ctrl+c sends signal to all processes in pgroup. But we'd want to avoid sending duplicate signals in this case, because they will abort tests without cleanup.

I might have time to look at this in future. Meanwhile I use this workaround:

package main

// This is a workaround for ginkgo runner until this fix is merged:
// https://github.com/onsi/ginkgo/pull/1565
//
// Without this if k8s decides to evict the e2e runner pod,
// signal is not propagated to the child process, they don't cleanup resources
// and then killed forcefully.
//
// Usage:
//     signal-propagator ginkgo [ginkgo args...]

import (
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

func main() {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, os.Interrupt, syscall.SIGTERM)
	go func() {
		for osSignal := range ch {
			sysSignal := syscall.SIGTERM
			if osSignal == os.Interrupt {
				sysSignal = syscall.SIGINT
			}
			// Minus sign is for PGID rather than PID.
			err := syscall.Kill(-syscall.Getpid(), sysSignal)
			if err != nil {
				os.Stderr.WriteString("Failed to signal child processes: " + err.Error() + "\n")
			}
		}
	}()

	cmd := exec.Command(os.Args[1], os.Args[2:]...) // #nosec G204
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Start(); err != nil {
		os.Stderr.WriteString("Failed to start: " + err.Error() + "\n")
		os.Exit(1)
	}
	if err := cmd.Wait(); err != nil {
		if cmd.ProcessState != nil && cmd.ProcessState.ExitCode() != 0 {
			os.Exit(cmd.ProcessState.ExitCode())
		}
		os.Stderr.WriteString("Failed to wait: " + err.Error() + "\n")
		os.Exit(1)
	}
	os.Exit(cmd.ProcessState.ExitCode())
}

max-melentyev avatar Aug 26 '25 16:08 max-melentyev

Maybe it would be easier to add a flag to enable signal propagation, so that it can be set when ginkgo is not running in interactive shells. I can add this if you think it's ok.

max-melentyev avatar Aug 26 '25 16:08 max-melentyev

ooooh, yeah a flag makes sense. yes please do give it a go.

onsi avatar Aug 26 '25 16:08 onsi