Linux Mesh Agent hangs post-TLS handshake during WebSocket upgrade (Worked Previously)
The MeshCentral agent on a specific Ubuntu 25.04 machine, which previously worked correctly, now fails to connect. The meshagent.service reports as running (or the agent can be run manually). However, the device never appears as connected in the MeshCentral server console. When run manually, the agent prints "Connecting to wss://[YOUR_MESH_SERVER_HOSTNAME]:443/agent.ashx" and then hangs indefinitely. Interrupting with Ctrl+C causes it to print "Connected." and then exit, but this does not reflect a true, stable connection.
However, other devices on the same local network (including other Linux devices) connect successfully to the same MeshCentral server. Furthermore, this specific agent on this machine used to work without issue and then suddenly stopped connecting, despite no known manual configuration changes to the client machine or the agent installation before the issue arose.
One thing of note is that occasionally, with no reliable reproducibility, stopping the service leads to the device showing up in MeshCentral, but only as offline. No other information is gathered for it, and the logs simply say, "Added device [name] to device group [x]". Another symptom is that I am unable to stop the MeshCentral service on this machine gracefully (only by killing it and disabling it, etc.), but I believe this is because the agent is stuck connecting to the WSS URI and does not accept a call to exit.
I am unsure how to reproduce this error, as my attempts to reproduce it on any machine other than the problematic one have not been fruitful.
Other Info
- DNS Resolution: `ping [YOUR_MESH_SERVER_HOSTNAME]` resolves correctly. `telnet [YOUR_MESH_SERVER_HOSTNAME] 443` connects successfully (TCP layer OK).
- TLS Handshake:
  - `openssl s_client -connect [YOUR_MESH_SERVER_HOSTNAME]:443 -servername [YOUR_MESH_SERVER_HOSTNAME]` completes successfully with `Verify return code: 0 (ok)`.
  - Server certificate is valid (Let's Encrypt) and trusted by the client system.
  - `ca-certificates` package is up-to-date. System time is correct.
- Local Firewall (`ufw`): Inactive.
- VPN/Proxy: Issue persists identically with VPN/proxy software completely disabled. VPN is not the cause.
- Agent Reinstallation: Multiple forceful removals (service files, all known agent directories: `/opt/meshagent/`, `/usr/local/mesh/`, `/usr/local/mesh_services/`, `/var/opt/meshagent/`) and reinstallations using the official installer script. Issue remains.
- Agent Configuration (`.msh` file): Correctly contains `MeshServer=wss://[YOUR_MESH_SERVER_HOSTNAME]:443/agent.ashx`, and is nearly identical except for relevant agent information to other working mesh agents on other devices.
- Systemd Service Configuration: Runs agent as root. `StandardOutput` was initially `null`, changed to `journal`, but logs then only showed the same "Connecting to..." message followed by the hang.
- `strace ./meshagent` (summary):
  - Completes extensive system/hardware information gathering via child processes.
  - Agent attempts to `openat()` several `.js` files (e.g., `linux-gnome-helpers.js`) resulting in `ENOENT`; understood to be non-critical for native agent core function.
  - Successfully resolves server hostname.
  - Prints "Connecting to..." message.
  - Establishes TCP connection (non-blocking `connect()` returns `EINPROGRESS`, later confirmed by `pselect6` socket becoming writable).
  - Successfully completes TLS handshake (sends Client Hello, exchanges TLS records with server).
  - Sends a final block of data (presumed WebSocket upgrade request / initial application data).
  - Hangs at this point, likely in `pselect6()` or `ppoll()` waiting for a response on the socket. `Ctrl+C` (SIGINT) interrupts this wait, triggering a cleanup sequence that misleadingly prints "Connected." before exit.
- DMI Information: Agent reads DMI info (e.g., "NO Asset Tag" for board asset tag) successfully before the network hang. I don't believe this would impact anything, but issues #141 and #272 lead me to think otherwise, or that some other piece of agent information may be causing the problem (except that in my case it does not do so continuously).
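Since the strace summary points at the step right after the TLS handshake, one way to narrow this down might be to send a plain WebSocket upgrade request from the affected machine with a timeout and see whether any HTTP response comes back. The following is a minimal sketch using only the Python standard library. It is not the agent's actual handshake: it assumes the server answers a generic RFC 6455 upgrade on `/agent.ashx`, and the function names (`probe`, `build_upgrade_request`, `expected_accept`) are made up for illustration.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server should send for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> bytes:
    """Build a minimal RFC 6455 client upgrade request (generic, not agent-specific)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(host: str, port: int = 443, path: str = "/agent.ashx",
          timeout: float = 10.0) -> str:
    """Return the HTTP status line of the server's reply.

    Raises a timeout error if nothing arrives within `timeout` seconds,
    which would mirror the hang seen in the agent.
    """
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    context = ssl.create_default_context()  # verifies against the system CA store
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path, key))
            response = b""
            while b"\r\n\r\n" not in response:
                chunk = tls.recv(4096)
                if not chunk:  # server closed the connection
                    break
                response += chunk
    return response.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `probe("[YOUR_MESH_SERVER_HOSTNAME]")` on both the affected machine and a working one could help separate the cases: if it times out only on the affected machine, something between that machine and the server is probably dropping the post-handshake traffic; if it returns a status line on both, the problem is more likely specific to what the agent itself sends.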
Can you verify whether the SSL certificate is RSA or ECDSA? This seems to be very common at the moment, but ECDSA isn't supported! The certificate must be RSA!
The certificate is RSA, as shown below:
Peer signature type: RSA-PSS
Server public key is 4096 bit
However, the issue I am having affects only one specific device. I have other Ubuntu machines on the same network that connect just fine.
Is this still an issue, or can it be closed?
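For anyone who wants to run the same RSA-vs-ECDSA check on their own server, the two `openssl s_client` lines quoted above can be pulled out of the full output with a small script. This is only a sketch: the function name `describe_server_key` is made up, and it assumes the output contains the `Peer signature type:` and `Server public key is ... bit` lines in the form shown in this thread.

```python
import re


def describe_server_key(s_client_output: str) -> dict:
    """Extract the peer signature type and key size from `openssl s_client` output."""
    sig = re.search(r"^Peer signature type:\s*(\S+)", s_client_output, re.MULTILINE)
    bits = re.search(r"^Server public key is (\d+) bit", s_client_output, re.MULTILINE)
    sig_type = sig.group(1) if sig else None
    return {
        "signature_type": sig_type,
        "key_bits": int(bits.group(1)) if bits else None,
        # RSA and RSA-PSS both indicate an RSA key; ECDSA would not match.
        "is_rsa": sig_type is not None and sig_type.upper().startswith("RSA"),
    }


# The two lines reported in this thread.
sample = "Peer signature type: RSA-PSS\nServer public key is 4096 bit\n"
print(describe_server_key(sample))
# {'signature_type': 'RSA-PSS', 'key_bits': 4096, 'is_rsa': True}
```

The output of `openssl s_client -connect host:443 -servername host </dev/null` can be passed to the function as a string; a result with `is_rsa` set to `False` would point at the ECDSA problem mentioned above.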