devpod icon indicating copy to clipboard operation
devpod copied to clipboard

Waiting for devpod agent to come up error + port error

Open ngocyenhmbd opened this issue 10 months ago • 1 comments

What happened? I am using DevPod version 0.6.13, I am using Azure as provider. I am using Linux Ubuntu 24.04. I could authenticate myself to use Azure, and successfully spawn an instance on Azure, and perform pulling the image in devcontainer, as well as other process. However, when it tries to connect to OpenVSCode or VSCode, initially, I could connect and could successfully download VSCode server, but then it prompts that

[12:37:11.100] [server] Server setup complete
[12:37:11.103] [server] Starting server...
[12:37:11.103] [server] Starting server with command... Command { std: "/root/.vscode-server/cli/servers/Stable-e54c774e0add60467559eb0d1e229c6452cf8447/server/bin/code-server" "--connection-token=remotessh" "--accept-server-license-terms" "--start-server" "--enable-remote-auto-shutdown" "--socket-path=/tmp/code-82dd35e9-8fd4-4437-b294-7a04565d9951", kill_on_drop: false }
[12:37:11.205] [server] *
[12:37:11.207] [server] * Visual Studio Code Server
[12:37:11.207] [server] *
[12:37:11.208] [server] * By using the software, you agree to
[12:37:11.208] [server] * the Visual Studio Code Server License Terms (https://aka.ms/vscode-server-license) and
[12:37:11.208] [server] * the Microsoft Privacy Statement (https://privacy.microsoft.com/en-US/privacystatement).
[12:37:11.209] [server] *
[12:37:11.210] [server] Server bound to /tmp/code-82dd35e9-8fd4-4437-b294-7a04565d9951
[12:37:11.210] [server] Extension host agent listening on /tmp/code-82dd35e9-8fd4-4437-b294-7a04565d9951
[12:37:11.211] [server] parsed location: "/tmp/code-82dd35e9-8fd4-4437-b294-7a04565d9951"
[12:37:11.211] [server] 
[12:37:11.211] [server] Server started
[12:37:11.287] [server] [04:37:11] 
[12:37:11.288] [server] 
[12:37:11.288] [server] 
[12:37:11.288] [server] 
[12:37:11.358] [server] [04:37:11] Extension host agent started.
[12:37:11.409] [server] [04:37:11] Started initializing default profile extensions in extensions installation folder. file:///root/.vscode-server/extensions
[12:37:11.486] [server] [04:37:11] [<unknown>][7951b02f][ManagementConnection] New connection established.
[12:37:11.527] [server] [04:37:11] Completed initializing default profile extensions in extensions installation folder. file:///root/.vscode-server/extensions
[12:37:11.559] [server] [04:37:11] [<unknown>][c54c2799][ExtensionHostConnection] New connection established.
[12:37:11.610] [server] [04:37:11] [<unknown>][c54c2799][ExtensionHostConnection] <254> Launched Extension Host Process.
[12:37:13.865] Opening exec server for ssh-remote+linux.devpod
[12:37:14.332] Verified and reusing cached exec server for ssh-remote+linux.devpod
[12:37:14.333] No hints found in the recent session.
[12:37:14.904] "Update SSH_AUTH_SOCK" terminal command done
[12:37:43.897] Exec server for ssh-remote+linux.devpod closed (gracefully)
[12:37:43.898] > local-server-1> ssh child died, shutting down
[12:37:43.903] ------




[12:37:43.903] SSH Resolver called for "ssh-remote+linux.devpod", attempt 2, (Reconnection)
[12:37:43.905] Local server exit: 0
[12:37:43.909] SSH Resolver called for host: linux.devpod
[12:37:43.909] Setting up SSH remote "linux.devpod"
[12:37:43.911] Acquiring local install lock: /tmp/vscode-remote-ssh-e645d6ab-install.lock
[12:37:43.912] Looking for existing server data file at /home/yen/.config/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-e645d6ab-e54c774e0add60467559eb0d1e229c6452cf8447-0.117.1-es/data.json
[12:37:43.912] No existing data file
[12:37:43.913] Using commit id "e54c774e0add60467559eb0d1e229c6452cf8447" and quality "stable" for server
[12:37:43.913] Extensions to install: 
[12:37:43.917] Install and start server if needed
[12:37:43.923] askpass server listening on /run/user/1000/vscode-ssh-askpass-1355eb28b097cfcf0a18a27fe179b1bc398338ca.sock
[12:37:43.924] Spawning local server with {"serverId":2,"ipcHandlePath":"/run/user/1000/vscode-ssh-askpass-27f3eab9e0c9caa8cabf6a3cdf8c58a03ea27660.sock","sshCommand":"ssh","sshArgs":["-v","-T","-D","43353","-o","ConnectTimeout=1000","linux.devpod"],"serverDataFolderName":".vscode-server","dataFilePath":"/home/yen/.config/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-e645d6ab-e54c774e0add60467559eb0d1e229c6452cf8447-0.117.1-es/data.json"}
[12:37:43.924] Local server env: {"SSH_AUTH_SOCK":"/run/user/1000/keyring/ssh","SHELL":"/bin/bash","DISPLAY":":0","ELECTRON_RUN_AS_NODE":"1","SSH_ASKPASS":"/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/local-server/askpass.sh","VSCODE_SSH_ASKPASS_NODE":"/usr/share/code/code","VSCODE_SSH_ASKPASS_EXTRA_ARGS":"","VSCODE_SSH_ASKPASS_MAIN":"/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/askpass-main.js","VSCODE_SSH_ASKPASS_HANDLE":"/run/user/1000/vscode-ssh-askpass-1355eb28b097cfcf0a18a27fe179b1bc398338ca.sock"}
[12:37:43.931] Spawned 135928
[12:37:43.931] Using connect timeout of 1002 seconds
[12:37:44.014] > local-server-2> Running ssh connection command: ssh -v -T -D 43353 -o ConnectTimeout=1000 linux.devpod
[12:37:44.018] > local-server-2> Spawned ssh, pid=135936
[12:37:44.026] stderr> OpenSSH_9.6p1 Ubuntu-3ubuntu13.8, OpenSSL 3.0.13 30 Jan 2024
[12:37:56.013] stderr> 12:37:56 info Waiting for workspace to come up...
[12:38:10.743] stderr> 12:38:10 fatal DevPod workspace is stopped
[12:38:10.744] stderr> kex_exchange_identification: Connection closed by remote host
[12:38:10.744] stderr> Connection closed by UNKNOWN port 65535
[12:38:10.745] > local-server-2> ssh child died, shutting down
[12:38:10.752] Local server exit: 0
[12:38:10.752] Received install output: local-server-2> Running ssh connection command: ssh -v -T -D 43353 -o ConnectTimeout=1000 linux.devpod
local-server-2> Spawned ssh, pid=135936
OpenSSH_9.6p1 Ubuntu-3ubuntu13.8, OpenSSL 3.0.13 30 Jan 2024
12:37:56 info Waiting for workspace to come up...
12:38:10 fatal DevPod workspace is stopped
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
local-server-2> ssh child died, shutting down

[12:38:10.754] WARN: $PLATFORM is undefined in installation script output.  Errors may be dropped.
[12:38:10.754] Failed to parse remote port from server output
[12:38:10.756] Resolver error: Error: 
	at y.Create (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:739050)
	at t.handleInstallOutput (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:737131)
	at e (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:789804)
	at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
	at async /home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:812522
	at async t.withShowDetailsEvent (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:816213)
	at async /home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:786516
	at async T (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:784571)
	at async t.resolveWithLocalServer (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:786068)
	at async R (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:809398)
	at async t.resolve (/home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:813614)
	at async /home/yen/.vscode/extensions/ms-vscode-remote.remote-ssh-0.117.1/out/extension.js:2:1099250
[12:38:10.760] ------
``

When i do `devpod logs --debug`, i see these logs:
```bash
$ devpod logs --debug
11:44:52 debug execute inject script
11:44:52 debug download agent from https://github.com/loft-sh/devpod/releases/download/v0.6.13
11:44:52 debug Run command provider command: "${AZURE_PROVIDER}" command
11:45:12 debug done inject
11:45:12 debug done injecting
11:45:12 info Waiting for devpod agent to come up...
11:45:12 debug Inject Error: context deadline exceeded
11:45:12 debug done exec
11:45:15 debug execute inject script
11:45:15 debug download agent from https://github.com/loft-sh/devpod/releases/download/v0.6.13
11:45:15 debug Run command provider command: "${AZURE_PROVIDER}" command
11:45:35 debug done inject
11:45:35 debug done injecting
11:45:35 info Waiting for devpod agent to come up...
11:45:35 debug Inject Error: context deadline exceeded
11:45:35 debug done exec
11:45:38 debug execute inject script
11:45:38 debug download agent from https://github.com/loft-sh/devpod/releases/download/v0.6.13
11:45:38 debug Run command provider command: "${AZURE_PROVIDER}" command 

What did you expect to happen instead?
It should connect successfully to VSCode and OpenVSCode, given everything is authorized.

How can we reproduce the bug? (as minimally and precisely as possible)
Create a new Linux Ubuntu machine.

I face the same issue with example try-vscode-python-example and my custom devcontainer. My devcontainer.json:

{
  "name": "Python DevContainer Azure",
  "image": "my-custom-image",
  "runArgs": [
    "--privileged"
  ],
  "settings": {
    "python.pythonPath": "/usr/local/bin/python",
    "python.linting.enabled": true,
    "python.formatting.provider": "black"
  },
  "remoteUser": "root",
  "workspaceFolder": "/workspace",
  "extensions": [
    "ms-azuretools.vscode-docker",
    "ms-toolsai.jupyter",
    "ms-toolsai.jupyter-keymap",
    "ms-toolsai.jupyter-renderers",
    "ms-python.black-formatter",
    "ms-python.python",
    "ms-python.debugpy",
    "ms-python.vscode-pylance",
    "kevinrose.vsc-python-indent",
    "esbenp.prettier-vscode",
    "leanprover.lean4",
    "ruoz.coq",
    "maximedenes.vscoq",
    "ms-vscode-remote.remote-ssh",
    "ocamllabs.ocaml-platform"
  ]
}

Local Environment:

  • DevPod Version: 0.6.13
  • Operating System: linux
  • ARCH of the OS: AMD64

DevPod Provider:

  • Cloud Provider: azure
  • Kubernetes Provider: None
  • Local/remote provider: ssh
  • Custom provider: I use the default one but i am wondering if this actually effect

Anything else we need to know?

  1. When I use Docker as provider, it works, so in my hypothesis, maybe it is something related to Azure + VSCode.
  2. I even get the error using example repo, so I suppose it is not about devcontainer.json
  3. I do some other scanning, and I saw this. Error: kex_exchange_identification: Connection closed by remote host. The Azure-hosted DevPod workspace terminated the SSH connection abruptly Connection closed by UNKNOWN port 65535Port 65535 is often used for internal SSH reverse tunneling (ephemeral range ends here). fatal DevPod workspace is stopped: Confirms the workspace is crashing or failing to start fully.

It seems like port 22 is configured correctly, because it could SSH over. But it seems like it could do connection for other ports. When I check NSG on Azure, it has these configuration.

Image

So as I understand, VS Code needs more ports open for reverse tunneling and other communications. The previous advice was to add rules for ports 49152-65535 and 443. But in their current config, those rules aren't present. So the NSG is still blocking the necessary ports. This code is handle in https://github.com/loft-sh/devpod-provider-azure/blob/main/pkg/azure/create.go#L89. This is one suspect. But when I manually change the port to add more then reconnect, it seems like still does not work for me.

These are what I added.

Add these rules above the DenyAllInBound (priority 65500):

 

Priority Name Port Protocol Source Destination Action
110 vscode_ephemeral 49152-65535 TCP 0.0.0.0/0 0.0.0.0/0 Allow
120 vscode_http 80,443 TCP 0.0.0.0/0 0.0.0.0/0 Allow

Ensure outbound traffic for HTTP/HTTPS and ephemeral ports:

 

Priority Name Port Protocol Source Destination Action
110 outbound_http 80,443 TCP 0.0.0.0/0 Internet Allow
120 outbound_ephemeral 49152-65535 TCP 0.0.0.0/0 Internet Allow

I saw this issue in #848 and fixed in V0.5.0, but when I reverse to that version, the problem persists.

I also found this timeout implementation

if time.Since(lastMessage) > time.Second*5 {
  log.Infof("Waiting for devpod agent to come up...")
    lastMessage = time.Now()
}

so it means it will time out after 5 seconds then try again and again, then in case it connects more slowly, it keeps looping again until fail. This is my another suspect.

For more information, i could launch vscode-remote-try-node, but could not be sucessful with vscode-remote-try-python.

ngocyenhmbd avatar Feb 24 '25 05:02 ngocyenhmbd

Hi @ngocyenhmbd, can you confirm if your azure VM had internet access? Looking at the logs it seems to be failing to download the agent from https://github.com/loft-sh/devpod/releases/download/v0.6.13, so I wonder if the VM did not have outside internet access

bkneis avatar Mar 03 '25 09:03 bkneis

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] avatar May 03 '25 02:05 github-actions[bot]

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions[bot] avatar Jun 02 '25 02:06 github-actions[bot]