devpod
Refresh-related error when using devpod ssh
What happened?
I tried using GitLab CI with DevPod. I have a CI script that runs
devpod ssh . --command "my command"
If the command takes more than 30 seconds, I see the following error:
error Error tunneling to container: wait: remote command exited without exit status or exit signal
I noticed that it happens right after a Start refresh entry appears in the debug logs. The return code of the command is correct and the command executes successfully, but I still see this error in my logs. Note that I am unable to reproduce this outside the CI.
It does not happen if the command takes less than 30 seconds to execute.
What did you expect to happen instead?
No error displayed.
How can we reproduce the bug? (as minimally and precisely as possible)
Run devpod ssh . --command "sleep 60" on a GitLab CI runner.
Local Environment:
- DevPod Version: 0.5.20
- Operating System: linux
- ARCH of the OS: AMD64
DevPod Provider:
- Cloud Provider: docker
- Local/remote provider: docker
Anything else we need to know?
Hey @aacebedo, thanks for reporting this issue. Could you share an excerpt of your pipeline? Maybe there's something in the setup of Podman that causes this.
Sorry, I am not using Podman but Docker on the CI. Locally I use Podman, which works. Do you want the output or the pipeline itself? I can share a redacted log but not the full pipeline.
The pipeline definition itself would be great to reproduce the issue; is there a way to remove the sensitive steps?
Here is a representative GitLab job exhibiting the issue:
devpod_test:
  image: devpod
  services:
    - docker:24.0.5-dind
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login $CI_REGISTRY --username $CI_REGISTRY_USER --password-stdin
  script:
    - devpod provider set-options docker --option "DOCKER_HOST=${DOCKER_HOST}"
    - devpod ide use none
    - devpod up .
    - devpod ssh . --command "sleep 60"
  rules:
    - changes:
        - .devcontainer/**
@aacebedo I'm trying to reproduce this error locally but cannot. Does this happen for you when not running in CI?
This issue is stale because it has been open for 60 days with no activity.
This issue was closed because it has been inactive for 30 days since being marked as stale.
I was able to fix it with:
1- kill $(pgrep ssh-agent)
2- rm -rf /tmp/ssh-*
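For CI use, the two steps above could be combined into a small cleanup script, run in the job's before_script ahead of the devpod commands. This is only a sketch: it assumes the error is caused by stale ssh-agent processes and leftover agent sockets under /tmp, and that no other job on the runner still needs them.

```shell
#!/bin/sh
# Workaround sketch: kill leftover ssh-agent processes and remove their
# stale socket directories before invoking devpod ssh.
# Assumption: nothing else on this runner depends on these agents.

# Kill all running ssh-agent processes, if any (xargs -r: do nothing if none).
pgrep ssh-agent | xargs -r kill

# Remove stale agent socket directories left under /tmp.
rm -rf /tmp/ssh-*

echo "ssh-agent cleanup done"
```

On an ephemeral GitLab runner this is usually safe, since each job gets a fresh environment; on a shared shell runner, removing /tmp/ssh-* could break other users' agents.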