podman exec into a "-it" container: container create failed (no logs from conmon): EOF
Common thread seems to be:
Running: podman [options] run -dti --name test1 quay.io/libpod/fedora-minimal:latest sleep +Inf
time="2021-06-16T19:33:53-05:00" level=warning msg="The input device is not a TTY. The --tty and --interactive flags might not work properly"
99e3b419a97aa408a4d0d3072bbd00579d5edd7c97790aa06d61f233cfdc1b4c
Running: podman [options] exec -ti test1 true
Running: podman [options] exec -ti test1 true   <--- sometimes it fails on the first, sometimes on the third
Error: container create failed (no logs from conmon): EOF
Podman exec [It] podman exec terminal doesn't hang
- fedora-34 : int podman fedora-34 root container
- PR #10214
- fedora-34 : int podman fedora-34 root host
- PR #10860
- PR #10688
- gce_instance:fedora : int podman fedora-33 root host
- PR #9820
- PR #9477
- instance:GCEInstance : int podman fedora-33 root host
- PR #9969
- ubuntu-2104 : int podman ubuntu-2104 rootless host
- PR #10631
And also just now in a still-live PR (my flake-xref does not handle live PRs): int podman ubuntu-2104 root host
Note: the March and April logs above have been garbage-collected, so I can't confirm that the error is the same one. I'm leaving them in the report deliberately, in case it helps to have a timestamp for the start of this flake (i.e., it might not be new in June).
Edit: this is podman, not podman-remote, so it's unlikely to be the same as #7360
Podman exec [It] podman exec terminal doesn't hang
- fedora-34 : int podman fedora-34 root host
- PR #11101
- ubuntu-2104 : int podman ubuntu-2104 root host
- PR #10992
Podman exec [It] podman exec terminal doesn't hang
- fedora-34 : int podman fedora-34 root host
- PR #11219
Hmmm, I wonder if this is the same problem, in a different test? Looks suspiciously close.
podman network connect
Running: podman [options] exec -it test ip addr show eth1
Error: container create failed (no logs from conmon): EOF
Podman network connect and disconnect [It] podman network connect
- fedora-34 : int podman fedora-34 root host
- PR #11215
Another one, in yet another test. Looks like this is happening more often than I thought, because it happens in multiple tests:
Podman exec [It] podman exec --detach
- fedora-34 : int podman fedora-34 root host
- PR #11215
A friendly reminder that this issue had no activity for 30 days.
Podman exec [It] podman exec terminal doesn't hang
- fedora-34 : int podman fedora-34 root host
- PR #11556
- fedora-34 : int remote fedora-34 root host [remote]
- PR #11606
- ubuntu-2104 : int podman ubuntu-2104 root host
- PR #11655
- PR #11402
Podman network connect and disconnect [It] podman network connect when not running
- ubuntu-2104 : int podman ubuntu-2104 root host
- PR #11655
Podman network connect and disconnect [It] podman network disconnect and run with network ID
- fedora-34 : int podman fedora-34 root host
- PR #11834
Podman exec [It] podman exec terminal doesn't hang
- fedora-34 : int podman fedora-34 root host
- PR #11609
- fedora-34 : int remote fedora-34 root host [remote]
- PR #11957
- ubuntu-2104 : int podman ubuntu-2104 root host
- PR #11655
Still seeing this. int remote fedora-35 root
I'll take a stab at it. Thanks for assembling the data, @edsantiago!
while true; do
    ./bin/podman run --name=test --replace -dti quay.io/libpod/fedora-minimal:latest sleep +Inf
    ./bin/podman exec test true
    ./bin/podman rm -f -t0 test
done
Ran over 30 minutes but no failure. I'll have a look at the code; maybe I can come up with a theory but a reproducer would be great.
I can't reproduce on my laptop either, but on a 1minutetip f34 VM it fails instantly, on the very first try:
# podman run -dti --name=test quay.io/libpod/fedora-minimal:latest sleep 20;podman exec -it test true
8ed6f60c9a8e38d2081ece7a5471cc1a931f402170a9b0ff8f149bffb434994b
Error: container create failed (no logs from conmon): EOF
After that first time it still fails, but only once in 4-5 times. Note that it fails even without < /dev/null on either podman command.
podman-3.4.1-1.fc34.x86_64 conmon-2.0.30-2.fc34.x86_64
One more note: I think the -it is needed on exec. Without it, I can't reproduce the failure.
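A minimal sketch of a loop that counts how often the exec fails, based on the notes above and assuming the failure only shows up with -it on exec. The function name count_exec_failures and the PODMAN override variable are hypothetical, introduced here only for illustration.

```shell
# Hypothetical helper, not part of podman or its test suite.
# PODMAN can be overridden, e.g. PODMAN=./bin/podman for a local build.
PODMAN=${PODMAN:-podman}

# count_exec_failures NAME ATTEMPTS
# Runs "exec -it NAME true" ATTEMPTS times against an already-running
# container and prints how many of those execs exited non-zero.
# The body runs in a subshell so its variables stay local.
count_exec_failures() (
    name=$1
    attempts=$2
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -it is kept on purpose: without it the failure did not reproduce.
        if ! "$PODMAN" exec -it "$name" true >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
)
```

To use it, start the container first (podman run -dti --name=test quay.io/libpod/fedora-minimal:latest sleep +Inf), then run count_exec_failures test 20. The sketch discards the exec output, so drop the redirection if the exact error text matters.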
@mheon PTAL
One would think this is a race: podman run is still creating the container and launching conmon when podman exec gets to talk to conmon, before conmon knows there is a container, and that causes the failure.
Well, except that it's not always the first exec. This log shows the first three execs working, then it fails on the fourth.
Very difficult to track this down without a repro - we need to know what's going on with Conmon such that it's blowing up (personally I think Conmon is probably either segfaulting or just printing the error to the journal and exiting without reporting the real error to Podman). Might be logs in the journal that will help us?
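A rough sketch of how one might look for those traces right after a failed exec. It assumes conmon logs to the journal under the syslog identifier "conmon" and that systemd-coredump is active on the host; neither is confirmed in this thread. The function names and the JOURNALCTL/COREDUMPCTL override variables are hypothetical.

```shell
# Hypothetical helpers; the two variables exist only so the commands
# can be swapped out (e.g. for a dry run).
JOURNALCTL=${JOURNALCTL:-journalctl}
COREDUMPCTL=${COREDUMPCTL:-coredumpctl}

# conmon_journal SINCE
# Shows journal entries tagged with the identifier "conmon" since the
# given time (assumption: conmon logs under that identifier here).
conmon_journal() {
    "$JOURNALCTL" --no-pager -t conmon --since "$1"
}

# conmon_coredumps
# Lists recorded core dumps for processes named conmon; any hit would
# support the segfault theory.
conmon_coredumps() {
    "$COREDUMPCTL" --no-pager list conmon
}
```

For example, conmon_journal "10 minutes ago" followed by conmon_coredumps, run as root on the VM where the failure reproduces.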
@rhatdan It's not actually container create that's failing, that's a bad error message. We're trying to make a Conmon for the exec session but Conmon is failing with no logs as to why.
@mheon see my 1minutetip f34 VM comment above. It reproduces reliably.
Here's one in the brand-new ubuntu-2110
Podman network connect and disconnect [It] podman network disconnect when not running
- ubuntu-2110 : int podman ubuntu-2110 root host
- PR #12305
- PR #11795
- ubuntu-2110 : int remote ubuntu-2110 root host [remote]
- PR #12281
Podman network connect and disconnect [It] podman network disconnect
- fedora-34 : int remote fedora-34 root host [remote]
- PR #12348
- ubuntu-2110 : int podman ubuntu-2110 root host
- PR #12305
- ubuntu-2110 : int podman ubuntu-2110 rootless host
- PR #12256
Podman exec [It] podman exec terminal doesn't hang
- ubuntu-2110 : int podman ubuntu-2110 root host
- PR #12449
- ubuntu-2110 : int remote ubuntu-2110 root host [remote]
- PR #12301
Podman network connect and disconnect [It] podman network disconnect
- fedora-34 : int remote fedora-34 root host [remote]
- PR #12380
- ubuntu-2110 : int podman ubuntu-2110 root host
- PR #12305
Fresh one in ubuntu 2110 root. Curious thing: once it happens one time, it seems to happen on a bunch more tests afterward.
Here's one where it fails with bad exit code, but the conmon error isn't present:
# podman [options] run -dti --name test1 registry.fedoraproject.org/fedora-minimal:34 sleep +Inf
time="2021-12-08T15:27:00Z" level=warning msg="The input device is not a TTY. The --tty and --interactive flags might not work properly"
ce72bce58b4ef3d0215bc5d805594b94f8ae18e1eee558471358f6a682846df3
# podman [options] exec -ti test1 true
# podman [options] exec -ti test1 true <--- this is the one that seems to fail
...
• Failure [4.220 seconds]
Podman exec
/var/tmp/go/src/github.com/containers/podman/test/e2e/exec_test.go:16
podman exec terminal doesn't hang [It]
/var/tmp/go/src/github.com/containers/podman/test/e2e/exec_test.go:334
Expected
<int>: 129
to match exit code:
<int>: 0
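A side note on the 129: by the usual shell convention an exit status above 128 means 128 plus a signal number, which would make this signal 1 (SIGHUP). Whether that is what actually happened to the exec session here is an assumption, not something the log confirms. A small sketch for decoding such a status (the function name is hypothetical):

```shell
# signal_for_exit_code CODE
# Prints the signal name for an exit status above 128, following the
# convention status = 128 + signal number; prints nothing otherwise.
signal_for_exit_code() {
    if [ "$1" -gt 128 ]; then
        kill -l "$1"
    fi
}
```

With this, signal_for_exit_code 129 prints HUP.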
Podman exec [It] podman exec terminal doesn't hang
- fedora-34 : int podman fedora-34 root container
- PR #12541
A friendly reminder that this issue had no activity for 30 days.
@edsantiago is this still an issue?
Last seen 12-21:
Podman init containers [It] podman ensure always init containers always run
- fedora-35 : int podman fedora-35 root host
- PR #12662
Podman network connect and disconnect [It] podman network connect and run with network ID
- fedora-35 : int podman fedora-35 root host
- PR #12659
- ubuntu-2104 : int remote ubuntu-2104 root host [remote]
- PR #12602
- PR #12592
Maybe Santa's elves fixed it over break. Or maybe our CI use has been low due to so many of us on PTO. (Since you removed the stale-issue tag, I'm pretty sure your guess is the same as mine).
A friendly reminder that this issue had no activity for 30 days.
Podman exec [It] podman exec terminal doesn't hang
- ubuntu-2110 : int podman ubuntu-2110 root host
- PR #12966
A friendly reminder that this issue had no activity for 30 days.