cloudflared icon indicating copy to clipboard operation
cloudflared copied to clipboard

🐛 2024.10.0 appears to have broken xterm.js (e.g. Proxmox WebUI) when served over CF tunnel

Open itsthejb opened this issue 1 year ago • 8 comments

Describe the bug

Prior to 2024.10.0 it was possible to serve Proxmox web UI over a CF tunnel. However since 2024.10.0 the default xterm.js CT/VM web console is failing with Connection failed (Error 501: ). The noVNC implementation continues to work without issues.

To Reproduce Steps to reproduce the behavior:

  • Create tunnel pointing to Proxmox web UI at port 8006
  • Serve tunnel using client version 2024.10.0
  • Attempt to connect to a LXC/VM console using the >_ Shell button
  • Fails w/ Connection failed (Error 501: )

Expected behavior

  • Possible to connect to the console as prior to 2024.10.0

Environment and versions

  • OS: Proxmox 8.2 (latest) / Debian Bookworm
  • Architecture: x86_64
  • Version: 2024.10.0

Logs and errors

  • I don't currently see any clear error log entry from journalctl for cloudflared when the failure happens

Additional context

  • ~~Would like to test a downgrade to 2024.9.1 to verify the problem exactly, but not sure how this is possible~~ Verified that the issue was introduced after 2024.9.1 by restoring to a backup w/ 2024.9.1, where everything works as before
  • In my case, cloudflared installed w/ apt in a Debian Bookworm LXC

itsthejb avatar Oct 15 '24 09:10 itsthejb

I confirm. I have the same problem. However, it appeared for me about two or three weeks ago. There are also reports that the problem also appeared at the turn of May and June: https://forum.proxmox.com/threads/error-connection-error-501-not-implemented.11767/post-95568

On my OpenWRT router I have Cloudflare version: 2024.4.1-2 and it causes the above problem The agent on Linux version 2024.9.1 is correct, but in version: 2024.10.0 is not correct.

OdoMarTus avatar Oct 15 '24 10:10 OdoMarTus

Confirmed that, at least in my case, the issue only began after 2024.9.1. I reverted to a Proxmox CT backup that had [email protected] and I'm able to work around the issue introduced in 2024.10.0. This forces me to hold the package, however...

itsthejb avatar Oct 15 '24 12:10 itsthejb

I can also confirm. I did not run into this issue until upgrading to 2024.10.0 recently.

q0Jop avatar Oct 15 '24 17:10 q0Jop

I confirm this issue as well.

kmendell avatar Oct 17 '24 15:10 kmendell

I think it is probably this commit: https://github.com/cloudflare/cloudflared/commit/e2064c820f32802f58baa027903043fe0ed052e0#diff-6ff9dc179d3cf0539f514218792ce1d93001e2b4835a695bde442ad45edf01deR516 @chungthuang

It looks like it used to be that if it is not a websocket (false), then req.Body is left as it is (not set to http.NoBody).

Now it is checking if RequestBodyHintEmpty.String() exists and if it does returns true, which sets req.Body to http.NoBody, which could be Websocket incompatible. Websockets would have empty bodies, so if the metadata is being applied, then this would now be configuring a NoBody, when before it didn't (I assume websockets were excluded from the logic for a reason).

I suspect this has wider implications on websockets more broadly beyond Proxmox.

I think it would be worth adding a websocket to the tests too. At the moment the tests in that commit only covers HTTP so things like this could get missed in the current test coverage.

maggie44 avatar Oct 17 '24 22:10 maggie44

Could someone try this PR out to confirm if it fixes the issue?

https://github.com/cloudflare/cloudflared/pull/1340

Clone it, run make (assuming you have Golang and make installed) and it will build a cloudflared binary. Run that built binary instead of the original and see if it works.

maggie44 avatar Oct 18 '24 08:10 maggie44

@maggie44 I'll try this out later

itsthejb avatar Oct 18 '24 08:10 itsthejb

I bisected the issue (https://github.com/cloudflare/cloudflared/pull/1340#issuecomment-2422660954) and found the breaking commit to be e2064c820f32802f58baa027903043fe0ed052e0

itsthejb avatar Oct 18 '24 14:10 itsthejb

ISSUE #1337 🔥 🤑

aidendotgg avatar Oct 21 '24 06:10 aidendotgg

Haha yes I also noticed this 🤣

Note there is now a fix for this ready to land hopefully soon

itsthejb avatar Oct 21 '24 07:10 itsthejb

Hey everyone! Thank you for letting us know of this problem and for the effort in identifying the cause of the issue. We decided to fully revert the change and will take some time to evaluate a better approach. Please let us know if you still see this issue after upgrading to 2024.10.1.

GoncaloGarcia avatar Oct 24 '24 10:10 GoncaloGarcia

Can confirm things are working as expected in 2024.10.1

itsthejb avatar Oct 24 '24 10:10 itsthejb

@GoncaloGarcia In spite of reverting the change, installing the new version it really feels like the interface is less reliable than previous. While the console loads, it frequently disconnects and I have to refresh the page. Do you think this could still be related to other changes in 2024.10.0?

Reopening the issue for the time being for visibility

itsthejb avatar Oct 26 '24 14:10 itsthejb

Fixed in the latest version 2024.10.1

OdoMarTus avatar Oct 28 '24 14:10 OdoMarTus

@itsthejb what do you mean by interface? Proxmox interface?

The code in 2024.10.0 had very minimal changes and the only one that could affect traffic was the one we reverted. so it should not be related. Please open a new issue with more details if you can reproduce it.

joliveirinha avatar Oct 29 '24 09:10 joliveirinha

@joliveirinha Yes, it's not the original issue, which is definitely fixed. I just noticed the web console stalling and losing the connection but differently. I could indeed be some different, so I just wanted to ask in case you thought there might be something else.

Sure, I'll leave the issue closed. I suspect it is something else, just wanted to ask why the topic is still warm.

Cheers

itsthejb avatar Oct 29 '24 09:10 itsthejb

FYI the disconnection error is:

Screenshot 2024-10-29 at 11 38 08

itsthejb avatar Oct 29 '24 11:10 itsthejb

can confirm that with 2024.10.1 the issue is not present anymore.

clemone210 avatar Nov 04 '24 17:11 clemone210