
First websocket message dropped

Open brimworks opened this issue 4 years ago • 6 comments

We have been trying to use cloudflared to expose a nats-server websocket port. However, we are seeing an issue with the first websocket message getting dropped. Note that the first message is sent from the nats-server to the client and always begins with the word "INFO ".

I've found that this issue can be reproduced pretty reliably if you have access to two computers.

[1] ssh to a remote computer and run these commands to download a configuration and server:

curl -sSL -o nats.conf 'https://gist.githubusercontent.com/brimworks/c5f2f6f25d932fef40eb17409724f4a5/raw/1e4525417ab38df82fd371fbe20972667c5a73ae/nats.conf'
curl -sSL https://github.com/nats-io/nats-server/releases/download/v2.2.0/nats-server-v2.2.0-linux-amd64.tar.gz | tar --strip-components=1 -zxv

You may need to update the second download URL so it matches your hardware. You can view the different nats-server releases here:

https://github.com/nats-io/nats-server/releases/tag/v2.2.0

[2] Run the nats server with this configuration as such:

./nats-server -c nats.conf

[3] (optional) port-forward the remote server's port to localhost:

ssh -L 8080:192.168.1.18:8080 remote-host.example.com

[4] Run cloudflared so it proxies to your nats server running on the remote machine. For example:

./cloudflared tunnel \
    --origincert cloudflared.pem \
    --no-autoupdate \
    --url http://localhost:8080 \
    --hostname your.cloudflare.url

[5] Use websocat to test (https://github.com/vi/websocat/releases/tag/v2.0.0-alpha0) ... or you can use Chrome developer tools in order to start a connection with the hostname used by cloudflared:

websocat wss://your.cloudflare.url

If successful, the "INFO " message from the server is printed immediately. If the bug is reproduced, the "INFO " message is NOT printed; instead, the 10 second timeout elapses and "-ERR 'Authentication Timeout'" is printed (this authentication timeout always fires after 10 seconds, regardless of whether the "INFO " message was printed).

Note that the issue does not reproduce on every attempt, so it may take ~12 tries before you see it. It seems to reproduce more reliably when the network latency between nats-server and cloudflared is higher, which is why the steps above use two computers.
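Since a single attempt may not hit the bug, repeating step [5] by hand gets tedious. Below is a minimal sketch of a helper that runs a command several times and counts the runs whose output has no line starting with "INFO ". The function name `count_missing_info` is made up for illustration. The websocat invocation in the comment assumes websocat exits on its own once the server closes the connection after the authentication timeout, which may not hold in every setup.

```shell
# count_missing_info N CMD [ARGS...]
# Hypothetical helper: runs CMD N times and prints how many runs produced
# no output line starting with "INFO ". Uses plain (global) shell variables.
count_missing_info() {
    attempts=$1
    shift
    missing=0
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Capture stdout and stderr; a non-zero exit status is not treated as fatal.
        output=$("$@" 2>&1) || true
        if ! printf '%s\n' "$output" | grep -q '^INFO '; then
            missing=$((missing + 1))
            echo "attempt $i: no INFO message" >&2
        fi
        i=$((i + 1))
    done
    echo "$missing"
}

# Assumed usage (each run waits for the 10 second authentication timeout):
#   count_missing_info 12 websocat wss://your.cloudflare.url
```

A printed count greater than 0 means at least one attempt reproduced the dropped message.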

brimworks avatar Mar 31 '21 21:03 brimworks

Thank you for reporting this, @brimworks. These steps are great; we'll look into reproducing it.

nmldiegues avatar Apr 02 '21 12:04 nmldiegues

Thanks Brian, we've reproduced this and we're working on a fix. Internal ticket is TUN-4168.

adamchalmers avatar Apr 02 '21 16:04 adamchalmers

This is caused by https://github.com/gorilla/websocket/issues/679

ipostelnik avatar Apr 02 '21 18:04 ipostelnik

After more discussion with gorilla/websocket devs, it looks like cloudflared misuses conn.UnderlyingConn. We're going to change how we proxy websocket connections.

ipostelnik avatar Apr 06 '21 15:04 ipostelnik

Hi! Release 2021.4.0 should fix this (see commit). @brimworks, could you please upgrade your cloudflared and tell us if the issue is fixed?

adamchalmers avatar Apr 07 '21 21:04 adamchalmers

Hi @adamchalmers, I've been noticing that the websocket connections are getting closed prematurely. It may be an issue on our end (need to investigate further), but that seems to be the only concern at this point.

brimworks avatar Apr 08 '21 15:04 brimworks