libct: setns process recreate cmd object before calling the first start
Currently there is an golang issue https://github.com/golang/go/issues/76746 . If the cmd.Start is called twice (the first call was unsuccessful) then Cmd.Wait can return an error (because goroutines' pipes will be closed in the first try). However, the second cmd.Start can return nil. After https://go-review.googlesource.com/c/go/+/728642 is merged, then the second call will return an error. So we need to recreate the Cmd object. We can not simply copy it https://go-review.googlesource.com/c/go/+/728642/comments/a837fe92_a5fc5f06 .
Fixes #5060
Hmm, the code to create a copy of the cmd is kinda fragile, and I guess what Alan suggested is a better way, i.e.
use some other structure to hold all the information needed to populate a Cmd, and create the Cmd just before you call Start.
One more alternative is to check whether CLONE_INTO_CGROUP is available, but I'm afraid the check is going to be too expensive, and caching its result is not very reliable either.
I guess what Alan suggested is a better way
Hello! Something like this https://github.com/opencontainers/runc/commit/4cdc0d64123bf2afb652fdf08842895261d0924b ? I have made this as a Container method.
@everzakov Have you verified whether this issue actually affects runc? If so, why aren’t we seeing any errors on almalinux-8?
In fact, runc doesn’t call cmd.Wait() directly—it uses cmd.Process.Wait() instead to wait for the child process.
Regarding the Go PR (728642), I think the change might be overly strict for users and could potentially be a breaking change.
In fact, I changed the CgroupFd to a random num in runc code, and I can't see any errors for runc, for example:
p.cmd.SysProcAttr.CgroupFD = 100 //int(fd.Fd())
lifubang@acmcoder /opt/ubuntu $ sudo ./runc exec test echo hello
hello
lifubang@acmcoder /opt/ubuntu $ sudo ./runc --debug exec test echo hello
...
DEBU[0000]libcontainer/process_linux.go:359 libcontainer.(*setnsProcess).prepareCgroupFD() using CLONE_INTO_CGROUP "/sys/fs/cgroup/user.slice/user-1000.slice/test"
... libcontainer.(*setnsProcess).startWithCgroupFD() exec with CLONE_INTO_CGROUP failed: fork/exec /proc/self/fd/6: bad file descriptor; retrying without**
...
hello
In fact, runc doesn’t call
cmd.Wait()directly—it usescmd.Process.Wait()instead to wait for the child process.
Actually there are some calls cmd.Wait() - https://github.com/opencontainers/runc/blob/v1.4.0/libcontainer/process_linux.go#L152 and https://github.com/opencontainers/runc/blob/v1.4.0/libcontainer/process_linux.go#L524 .
I can't see any errors for runc
Yeah, the main problem to find this fault in runtime is that all cmd.Wait calls' errors are ignored. Also, it is often preceded by process KILL. So, this error can be found only in tests. You can try to run TestEnter test with your change. The error should be the same as from issue.
Regarding the Go PR (728642), I think the change might be overly strict for users and could potentially be a breaking change.
I have proposed another go pr 728660 . In my PR it will be possible to call the same os/exec Cmd object if the first try was unsuccessful. Also, Cmd.Wait does not return the error. However, i think 728642 will be merged.
You're right — we shouldn’t reuse the exec.Command object to restart the process.
Just a discuss, a simple fix would be to move the retry logic into container_linux.go? Maybe in here:
if err := parent.start(); err != nil {
return fmt.Errorf("unable to start container process: %w", err)
}
Additionally, adding a test case to ensure the patch works correctly would be appreciated.