Samuel Karp

341 comments by Samuel Karp

Errors like `"OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown"` are a key indicator of https://github.com/containerd/containerd/issues/10589
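As a rough illustration of spotting that indicator in collected error messages, here is a minimal Go sketch. The helper name `looksLikeIssue10589` and the second sample message are hypothetical and not part of containerd. A substring match is only a heuristic for triage and does not confirm the underlying bug.

```go
package main

import (
	"fmt"
	"strings"
)

// stoppedContainerMarker is the distinctive substring from the error quoted above.
const stoppedContainerMarker = "cannot exec in a stopped container"

// looksLikeIssue10589 is a hypothetical helper (not a containerd API).
// It reports whether an error message contains the marker substring.
func looksLikeIssue10589(errMsg string) bool {
	return strings.Contains(errMsg, stoppedContainerMarker)
}

func main() {
	// The first message is the one quoted in the comment; the second is made up.
	msgs := []string{
		"OCI runtime exec failed: exec failed: cannot exec in a stopped container: unknown",
		"some unrelated error",
	}
	for _, m := range msgs {
		fmt.Printf("%t: %s\n", looksLikeIssue10589(m), m)
	}
}
```

A match only suggests the issue is worth checking. The task state reported by the shim is still needed to tell whether the container was actually expected to be stopped.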

https://github.com/containerd/containerd/pull/10560 should help get the state of the task within the shim. This is only on `main` right now, but if you're querying a 1.7 shim you'll want to include...

> I'm wondering if this only reproduces when the container is sharing a pid namespace with another container. As far as I can tell, the pid namespace isn't shared in...

I ran @bobbypage's repro and reproduced the problem fairly easily on a build of 1.7.15. I've modified the shim with extra logging per https://github.com/containerd/containerd/commit/511ed30d386fa8fe600959f307b7eb1dccdf3fc1 (https://github.com/containerd/containerd/commit/82c426fa07e80ea5988379f29e278247bde1265c).

```
samuelkarp@gke-rapid-default-pool-91865e43-9wka ~ $ sudo...
```

How to run and output

```
samuelkarp@containerd:~/go/src/github.com/containerd/containerd$ sudo PATH=$PATH ./script/setup/install-failpoint-binaries
+ bin/cni-bridge-fp
+ bin/containerd-shim-runc-fp-v1
+ bin/runc-fp
samuelkarp@containerd:~/go/src/github.com/containerd/containerd$ sudo -E PATH=$PATH make integration EXTRA_TESTFLAGS='-run TestIssue10589'
+ integration
INFO[0000] Using the following...
```

> which is expected – means the shim prevented the new exec from being started since the init has exited. Yay! ~We can update the test to pass in this...

@laurazard I've updated the logic here to not block when exec2 fails to start and run against the latest in #10651, but I'm still seeing the task have an incorrect...

This PR will remain WIP until https://github.com/containerd/containerd/pull/10651 is merged, at which point I will rebase it and mark it ready. Please feel free to review it now, however.