bollard
bollard copied to clipboard
container exits but output stream remains open
I am trying to capture output from a running container using Docker::attach_container
but it appears that the output stream does not close even though the container exited (and which was reported via the wait_container
stream). The code at issue is in this PR https://github.com/pantsbuild/pants/pull/16766.
That code monitors the output stream returned by attach_container
and the stream from wait_container
for the exit code with a tokio::select!
. When the container exits, the code breaks out of the monitoring loop and then tries to see if any output remains on the output stream, but that stream never closes so the code hangs there.
Am I somehow not using the APIs correctly? (The machine is macOS 12.5.1 with Docker Desktop 4.11.1.)
So, with the 0.13
version of Bollard, wait_container
will only emit an Err
if the Docker server responds with a non-200 status code - for example, if the container doesn't exist or some other network related error. In that version of Bollard, you could capture the status code emitted by the wait_container
operation and handle it how you need to.
However, I've just merged a PR that should change that behaviour https://github.com/fussybeaver/bollard/pull/250 - in the latest master, streams that return an error in the JSON will also return an error in Bollard.
Let me know in your testing if that works for you.
So, with the 0.13 version of Bollard, wait_container will only emit an Err if the Docker server responds with a non-200 status code - for example, if the container doesn't exist or some other network related error. In that version of Bollard, you could capture the status code emitted by the wait_container operation and handle it how you need to.
To clarify, the issue that I am seeing is not with the wait_container
stream. That stream actually does report the container exit correctly. The issue is that the attach_container
output stream does not close once the container has gone away. This means that, after the container exit is reported on the wait_container
stream, the code that checks the attach_container
output stream hangs because the attach_container
output stream never closes.
The context is that the project in question is a build tool that needs to capture all output from running a build action in a Docker container. I observed a race condition in testing where my code would fail to capture some output. My thought was to check the attach_container
output stream to make sure there was no "left-over" output remaining on that stream after the container exit was reported.
Regarding https://github.com/fussybeaver/bollard/pull/250, I will try it out with the hope that maybe that it will change the behavior of the attach_container
output stream.
I tried the latest master
branch (at cfebd306a4
). No change in behavior; the attach_container
still remains open even after the container has exited (and as reported on the wait_container
stream).
I'm having similar issues - or so I think, at least. I may have a repro case too, one that isn't particularly big, either. I have to clean it up a little first, it's in a private repo at the moment, have to lift it out of there.
Ok, thanks for explaining a bit. That does sound quite different. I would appreciate a minimal test case -- @antoinert do you have any idea why this might be happening?
Ok, thanks for explaining a bit. That does sound quite different. I would appreciate a minimal test case -- @antoinert do you have any idea why this might be happening?
https://github.com/fussybeaver/bollard/pull/254 has a minimal reproduction.
Thanks for the reduced example.
I took a quick look, and can confirm that the streams are hanging. Some playing around though confirmed there is a timing issue with the running container and the client that attaches the output. If you change the cmd
to vec["sleep", "3"]
or execute attach_container
prior to start_container
, the streams are closed properly. So, there's a workaround there.
I'll be taking a closer look at the hyper logs to see if there's something else I can see.
For my own use case, I ended up using executions (which return the output stream), and so this issue is not an issue for me any more. I can close unless someone else has a reason to keep it open.
In any event, it seems like the proper way to attach to a container would be to call attach_container
first before calling start_container
?
Let's keep the ticket open.
I tried comparing the hyper / tokio logs between the two, but didn't see anything of notice. The server should have the responsibility of closing the stream, so it should be reproducible on the moby codebase - that's my next step regarding this ticket.
I have done some more due diligence to this issue to try and understand what is going on, and replicated the example PR using the golang moby client SDK: https://github.com/moby/moby/compare/master...fussybeaver:moby:ND-attach-ordering-issue?expand=1
One thing struck me initially - removing the stream=true
parameter means nothing is emitted. The documentation does suggest this though https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerAttach
As demonstrated in this ticket (and in this closed moby ticket), the intended useage of the API is containerCreate
, containerAttach
, containerStart
, containerWait
. In moby, when containerAttach
is triggered for a stopped container, the behaviour of the client SDK is to hang on the attach call. According to a similar bug ticket on the moby project, this is by design, in case you call attach before a container has started. I verified the hung client would continue by restarting a previously stopped container.
Similarly in Bollard, I assume that a stream will only close when a container is stopped. We should probably change the documentation to make this point clear when using this API. I'm open to any suggestions though on making this more ergonomic.