proposal: cmd/go: add a flag to use stdin as a cancellation side-channel
The go command executes a lot of subprocesses as it runs, including:
- the Go compiler and linker, and in cgo-enabled builds the C compiler and linker,
- version-control tools to download modules that aren't available from a proxy,
- the
vettool to analyze programs, - test binaries built for packages under
go testandgo test -fuzz=., and - binaries executed by
go run(#54868 notwithstanding).
Some clients of the go command end up starting a command, letting it run a while, and then wanting to cancel its execution. For example, gopls may start running go list to evaluate a program being edited, but then cancel that command due to another edit to the files in the user's workspace or a failure in some other concurrent processing that renders the go list result irrelevant.
If the go process itself is forcibly terminated using os.Kill (via either (*os.Process).Kill or exec.CommandContext), its subprocesses may continue to execute for an arbitrarily long time, leaking resources and (on Windows, at least) potentially preventing cleanup of temporary files and directories.
On Unix platforms, we can instead request a clean shutdown using os.Interrupt. (The go command may or may not respond appropriately to that signal today, but in principle it can and that's what matters in terms of the public-facing API.) However, os.Interrupt does not work on Windows, and per discussion on #6720 and #46345 we don't know of a general way to make it work. (At the very least, it would require os/exec to make invasive and likely-incompatible assumptions about consoles and process groups.)
Since os.Kill leaks resources and os.Interrupt isn't always available, we need some other way to request cancellation of a running go command.
Fortunately, we have a way out: as far as I am aware, every platform that supports os/exec in Go also supports inter-process pipes, and cmd/go to my knowledge does not use stdin for anything today.
I propose that we add a new flag to the go command, -stdin=keepalive, that will cause it to read from (and discard bytes from) stdin until EOF, then make a best effort to cancel all subprocesses, clean up, and exit as soon as possible.
The choice of -stdin=keepalive leaves open the possibility of using stdin in other ways in the future, while being clear about both its role (“keepalive”) and mechanism (“stdin”).
(CC @golang/tools-team)
Operating systems do allow for creating a child process in a way that it doesn't leave any orphan behind when killed.
On Linux, the standard way is to call prctl(PR_SET_PDEATHSIG, SIGKILL); after fork, before exec. On Windows, you create a job with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE and assign the project to that job.
The fact that we can't do either of the above with exec.Command and the fact that we have a clear use case that needs it, seems to point to the direction of adding this capability to exec.Command, rather than providing a ad-hoc workaround that requires writing, debugging, testing and maintaining code that does cleanups which kernels are better at doing.
That functionality is already available via syscall.SysProcAttr for systems that support it. Unfortunately, not all systems support it, so I don't see a feasible way to add it to the os/exec package. In particular, it's not supported on MacOS, which is a widely used Go development platform.
How would this work with 'go run', which delegates its stdin/out/err to the target process?
@rasky
rather than providing a ad-hoc workaround that requires writing, debugging, testing and maintaining code that does cleanups which kernels are better at doing.
I see several problems with that approach:
- The steps to perform a clean shutdown may involve more than just killing processes. (For example, they may include deleting temporary files or directories.) So even if we chain the processes that way, we still don't get a completely clean shutdown.
- It isn't obvious that that approach is even possible on all platforms that support
cmd/go. (Notably, I don't see a way to do it on macOS, Solaris, or Plan 9.) - If we did try to use platform-specific cleanup mechanisms, we would need to test and debug each different mechanism. That can be much more involved than testing and debugging one platform-independent mechanism even if that one mechanism is somewhat more complex than any individual platform-specific mechanism.
We did use the platform-specific approach for file locking in the module cache, but there we didn't have a clean cross-platform alternative. Here we do.
@adonovan
How would this work with 'go run', which delegates its stdin/out/err to the target process?
Ideally go run should exec the requested program instead of running it as a subprocess (that's #54868), and then it would be up to that program to watch the keepalive from there. (You wouldn't be able to pipe actual data to such a program because go run --stdin=keepalive would swallow some arbitrary number of leading bytes, but you could at least use the same “open pipe” mechanism for both cmd/go and the underlying program.)
In the interim, on platforms that support SetDeadline for pipes I think we could arrange for cmd/go to stop reading from the pipe just before starting the subprocess, so that any data sent on stdin after the subprocess writes to stdout or stderr would correctly go to the subprocess. But that also seems like something of an edge-case...
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
It's pretty common to call go command through os/exec + context. Often, that happens behind the scene through libraries like go/packages. If the flag is added, should the libraries be individually updated to utilize the new flag/pipe?
Will there be any way to wire this through os/exec, or should there be a special package only for go command execution?
Can we use this with go test too? It's rare to write a test utilizing stdin but nothing prevents using it.
should the libraries be individually updated to utilize the new flag/pipe?
I suspect this thread was at least in part motivated by some of the cancellation problems we've seen in gopls, which uses go/packages, so I expect we would update that go/packages (actually its internal go-command helper package) to use the new mechanism.
Will there be any way to wire this through os/exec
The parent process would set exec.Cmd.Stdin to an open file descriptor (os.File) returned by, say, os.Pipe, and then close the other end of it in response to context cancellation. It might be useful for the flag documentation to link to an example of the correct code (in the absence of a standard package that would host it).
Can we use this with go test too? It's rare to write a test utilizing stdin but nothing prevents using it.
A test that defines its own main can do whatever it wants with stdin---a similar situation to 'go run'.
If the flag is added, should the libraries be individually updated to utilize the new flag/pipe?
Yes!
Will there be any way to wire this through
os/exec, or should there be a special package only forgocommand execution?
That's #50436, for which I am hoping to merge an implementation next week (assuming that the API revisions are approved).
The configuration would ideally look like:
cmd := exec.Command(…, "-stdin=keepalive", …)
stdin, err := cmd.StdinPipe()
if err != nil {
…
}
cmd.Cancel = func() error { return stdin.Close() }
It seems like we should disallow go run -stdin=keepalive. The keepalive only makes sense when stdin is going to send zero bytes and then close. There's no good answer for what to do if stdin has real data on it.
Disallowing go run -stdin=keepalive seems fine for now.
(I can think of reasonable ways one might use it, but use-cases are pretty niche and the implementation is simpler without it.)
Does anyone object to making this change?
I think this is a good idea. Perhaps a bit surprising to use this stdin mechanism directly, but I also don't imagine many people will need to do that themselves, instead relying on go/packages or not needing proper cancellation in practice.
Out of curiosity, doesn't the same problem bubble up to programs like gopls? I understand that the LSP protocol has a "stop" message which is a stand-in for os.Interrupt, so that's likely enough. For other programs which need to gracefully cancel cmd/go, they might need to also support the same stdin cancellation mechanism for their users.
Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group
No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group
Change https://go.dev/cl/456116 mentions this issue: cmd/go: use Cancel and WaitDelay to terminate test subprocesses