Connection hangs over tramp
Hi, thanks for this excellent package, the recent async refactor is exciting.
After updating, I'm trying to bring up the container list in a remote tramp buffer (to get running containers on a remote server).
This used to work fine; however, since upgrading I get stuck with the following:
Tramp: Opening connection docker volume ls --format="[{{ json .Name }},{{ json .Driver }},{{ json .Name }}]" for <REMOTE> using ssh...done
Running: docker network ls --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]"
Which hangs until I quit with C-g.
Seems like the correct ssh command is not running, and we are trying to connect with docker ls rather than running docker ls on the remote host (I could be misinterpreting, not super familiar with tramp).
Not sure if this is a limitation of the new aio usage.
I think the issue here is specifically with docker-network-update-status-async when running (run-hooks 'docker-open-hook) as part of opening the docker transient. I can call docker-network-update-status-async directly and it will update, but when run as a hook, it hangs on getting the network status.
When running docker-network-update-status-async in a remote buffer:
Running: docker network ls --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]"
#s(aio-promise nil nil)
Finished: docker network ls --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]"
Running: docker network ls --filter dangling=true --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]"
Finished: docker network ls --filter dangling=true --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]"
When running (run-hooks 'docker-open-hook):
Running: docker volume ls --format="[{{ json .Name }},{{ json .Driver }},{{ json .Name }}]"
Tramp: Opening connection docker volume ls --format="[{{ json .Name }},{{ json .Driver }},{{ json .Name }}]" for <REMOTE> using ssh...done
Running: docker network ls --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]"
Tramp: Opening connection docker network ls --format="[{{ json .ID }},{{ json .ID }},{{ json .Name }},{{ json .Driver }},{{ json .Scope }}]" for <REMOTE> using ssh...failed
Running: docker image ls --format="[[{{ json .Repository }},{{ json .Tag }},{{ json .ID }}],{{ json .Repository }},{{ json .Tag }},{{ json .ID }},{{ json .CreatedAt }},{{ json .Size }}]"
Tramp: Opening connection docker image ls --format="[[{{ json .Repository }},{{ json .Tag }},{{ json .ID }}],{{ json .Repository }},{{ json .Tag }},{{ json .ID }},{{ json .CreatedAt }},{{ json .Size }}]" for bender using ssh...done
Running: docker container ls --all --no-trunc --format="[{{ json .Names }},{{json .Names}},{{json .Status}},{{json .Image}},{{json .Command}},{{json .ID}},{{json .CreatedAt}},{{json .Ports}}]"
Tramp: Opening connection docker container ls --all --no-trunc --format="[{{ json .Names }},{{json .Names}},{{json .Status}},{{json .Image}},{{json .Command}},{{json .ID}},{{json .CreatedAt}},{{json .Ports}}]" for <REMOTE> using ssh...done
nil
Finished: docker container ls --all --no-trunc --format="[{{ json .Names }},{{json .Names}},{{json .Status}},{{json .Image}},{{json .Command}},{{json .ID}},{{json .CreatedAt}},{{json .Ports}}]"
Finished: docker image ls --format="[[{{ json .Repository }},{{ json .Tag }},{{ json .ID }}],{{ json .Repository }},{{ json .Tag }},{{ json .ID }},{{ json .CreatedAt }},{{ json .Size }}]"
Running: docker image ls --filter dangling=true --format="[[{{ json .Repository }},{{ json .Tag }},{{ json .ID }}],{{ json .Repository }},{{ json .Tag }},{{ json .ID }},{{ json .CreatedAt }},{{ json .Size }}]"
Tramp: Opening connection docker image ls --filter dangling=true --format="[[{{ json .Repository }},{{ json .Tag }},{{ json .ID }}],{{ json .Repository }},{{ json .Tag }},{{ json .ID }},{{ json .CreatedAt }},{{ json .Size }}]" for <REMOTE> using ssh...done
It hangs on opening the connection for docker network ls ... until I quit with C-g. Then it reports "failed" and runs the rest of the hooks successfully.
For now, I'm hacking this with:
(setq docker-open-hook '())
effectively removing any updating for the docker transient description (which I personally don't look at anyways). This fixes the issue for me.
Ah interesting. My wild guess is that TRAMP is not reentrant or whatever it is called and does not support multiple tramp connections at the same time.
Can you confirm it works all the time when doing:
(setq docker-open-hook '(docker-container-update-status-async)) ?
If yes then:
(setq docker-open-hook '(docker-network-update-status-async)) ?
If yes then:
(setq docker-open-hook '(docker-container-update-status-async docker-image-update-status-async)) ?
My wild guess is that it'll work for 1, maybe 2 items but then fail at 3/4.
I think there's the same behavior if you select multiple running containers and open a shell inside them.
One easy fix would be to make things sequential when TRAMP is used but I'm not sure how.
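One way the sequential idea could look, as an untested sketch: since the status functions return aio promises (see the `#s(aio-promise nil nil)` output above), a wrapper can await each one before calling the next. The name `my/docker-run-status-updates-sequentially` is made up for illustration, and this assumes each promise resolves only once its docker command has finished. Whether that actually avoids the TRAMP hang is not confirmed here.

```elisp
;;; -*- lexical-binding: t; -*-
;; aio is built on generator.el, which requires lexical binding.
(require 'aio)

;; Hypothetical helper, not part of docker.el.
(aio-defun my/docker-run-status-updates-sequentially (functions)
  "Call each function in FUNCTIONS in order, one at a time.
Each function is assumed to take no arguments and return an aio
promise; that promise is awaited before the next function is called."
  (dolist (fn functions)
    (aio-await (funcall fn))))

;; Possible usage from a remote buffer:
;; (my/docker-run-status-updates-sequentially
;;  '(docker-container-update-status-async
;;    docker-image-update-status-async
;;    docker-network-update-status-async))
```

The trade-off is that the updates no longer overlap, so the total wait is the sum of the individual commands, but only one TRAMP connection attempt should be in flight at a time.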
> My wild guess is that it'll work for 1, maybe 2 items but then fail at 3/4.
Looks like this is in fact the case. All three examples worked fine, but adding docker-network-update-status-async to the hook (so 3/4 of the usual functions) triggered the hang-up.
> My wild guess is that TRAMP is not reentrant or whatever it is called and does not support multiple tramp connections at the same time.
This makes sense to me.
One thing I plan on doing is submitting a PR for a configuration variable for whether to update and display status in the docker transient (if this would be a welcome addition). For me, I never look at this because I usually go straight for one of the suffix commands. I expect this to speed up usage, especially on slower tramp connections where we might be creating connections for each of the docker * ls commands (which I don't even end up seeing!).
I don't see a simple way to force tramp to become sequential so I'll just close this :shrug:
Would it make sense to disable transient statuses when over tramp maybe? With your newly introduced variable this looks easy to do.
I think this issue is still worth keeping open (or at least opening a similar one with a more specific description of the problem). I still run into this issue when I'm impatient and repeatedly hit g in the container list while waiting for a container to start, for example. This is my own fault of course, and quitting stops the hanging, but it could be confusing as to why this happens if you're not aware of the aio/tramp/re-entrancy issues.
Of course, I'll leave it up to you whether to close or open a new issue, as the main symptom (not being able to open the docker transient at all) I had is already addressed, but I think the problem still persists.
> Would it make sense to disable transient statuses when over tramp maybe?
Sure, adding a (file-remote-p default-directory) or similar should work as a sensible check along with the user variable.
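A minimal sketch of that check, assuming a user option along the lines discussed above. `my/docker-show-status` and both function names are placeholders for illustration, not the names used in docker.el.

```elisp
;; Placeholder for the user option discussed above; the real name may differ.
(defvar my/docker-show-status t
  "When non-nil, refresh status information when opening the docker transient.")

(defun my/docker-status-enabled-p ()
  "Return non-nil if status updates should run in the current buffer.
Updates are skipped when `default-directory' is remote (e.g. over TRAMP)."
  (and my/docker-show-status
       (not (file-remote-p default-directory))))

(defun my/docker-maybe-run-open-hook ()
  "Run `docker-open-hook' only when status updates are enabled."
  (when (my/docker-status-enabled-p)
    (run-hooks 'docker-open-hook)))
```

With something like this, local buffers keep the status display while remote buffers skip the extra docker * ls calls entirely.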