libvncserver webSocketsHandshake is wrongly detected over slow connection

Describe the bug

#define WEBSOCKETS_CLIENT_CONNECT_WAIT_MS 100

...

ret = rfbPeekExactTimeout(cl, bbuf, 4,
                               WEBSOCKETS_CLIENT_CONNECT_WAIT_MS);
if ((ret < 0) && (errno == ETIMEDOUT)) {
  rfbLog("Normal socket connection\n");
  return TRUE;
} else if (ret <= 0) {
  rfbErr("webSocketsHandshake: unknown connection error\n");
  return FALSE;
}

from https://github.com/LibVNC/libvncserver/blob/4d65ec4598883190accf858da191f03725034a50/libvncserver/websockets.c#L129

To Reproduce delay socket connection by 101ms --> connection is identified as a websocket connection.

Expected Behavior socket connection is not detected as websocket connection even if the link is slow

Aug 05 '20 15:08 matti

Thanks! Do you maybe actually mean "connect with ws/wss, delay by 101ms -> connection identified as normal if though it's not?"

Aug 05 '20 19:08 bk138

No, the opposite. Every time my normal socket client fails to send "RFB ..." in 100ms, it is detected as a wesocket client

On 5. Aug 2020, at 22.27, Christian Beier [email protected] wrote:

Thanks! Do you maybe actually mean "connect with ws/wss, delay by 101ms -> connection identified as normal if though it's not?"

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

Aug 06 '20 05:08 matti

I have a proxy which detects x11vnc server sent RFB 003.008\n and I add sleep(1) before the client sends its reply of RFB 003.008\n back, then x11vnc (libvncserver) thinks that the client is websockets.

Aug 06 '20 11:08 matti

I you could share some proof-of-concept code, that'd be nice! Thanks!

Aug 06 '20 13:08 bk138

Also, it would help which code path is actually taken, a backtrace or printf-debugging is sufficient.

Aug 06 '20 13:08 bk138

can you give your email and I'll share a screencast? that would be easiest for me

Aug 06 '20 17:08 matti

A log file, maybe annotated in between with what happened, would be most helpful. You can post that here :-)

Aug 07 '20 12:08 bk138

I don't understand how a logfile that says webSocketsHandshake failed from x11vnc helps here?

On 7. Aug 2020, at 15.43, Christian Beier [email protected] wrote:

A log file, maybe annotated in between with what happened, would be most helpful. You can post that here :-)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

Aug 07 '20 16:08 matti

I believe you that you encounter this bug, so no need for screencast :-) Open question: How would I be able to reproduce this reliably?

Aug 08 '20 09:08 bk138

Add sleep(1) between

client connects server accepts server sends RFB 3......\n sleep(1) client replies with RFB 3.....\n

because of the line of code I sent in the first post where there are timeouts to detect websockets, the client is detected as websockets and it fails becauee client is socket.

On 8. Aug 2020, at 12.02, Christian Beier [email protected] wrote:

I believe you that you encounter this bug, so no need for screencast :-) Open question: How would I be able to reproduce this reliably?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

Aug 08 '20 09:08 matti

Thanks! I reckon you have the sleep on the client side?

Aug 09 '20 06:08 bk138

Yes on client (or more specifically in the proxy) and to recap: the sleep is just repro this "slow client" problem

On 9. Aug 2020, at 9.51, Christian Beier [email protected] wrote:

Thanks! I reckon you have the sleep on the client side?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

Aug 09 '20 07:08 matti

this still happens

Dec 11 '20 11:12 matti

like isn't this a very straight forward bug? the handshake expects something to come in within 100ms or it changes protocol - on a slow network this always causes protocol change

Dec 11 '20 18:12 matti

It's very strait forward as I see it. The problem is that for normal RFB the server speaks first - for websocket the client speaks first.

Basically a working solution would be (if you know for sure the type of connection) to have a setting on the server to specify the expected connection type. AUTO or WS_WSS.

For AUTO the timeout should be kept low (maybe 100ms is too low in the first place), but for WS_WSS the timeout should be very long 10000ms or similar.

What do you think of a solution like that ?

Dec 16 '20 10:12 sgh

I have added pull request #455 - please comment.

Dec 16 '20 14:12 sgh

Will have a look ASAP, but I'm currently very busy with $$$-work, so expect some delay. Maybe during the holidays...

Dec 16 '20 14:12 bk138

I'm not sure if changing the delay from 100 to 500 is a good change.

Can't the server see otherwise that the connection is websockets, like read headers or something? This kind of timing detection is super fragile.

I made a tcp proxy wrapper to fool server (x11vnc) to detect client correctly: https://github.com/matti/x11vncfixer/blob/master/main.go#L47

It accepts the connection, then dials upstream (x11vnc), and sends a pre-set client protocol version to the server immediately when server sends the protocol version.

Dec 16 '20 14:12 matti

I believe the sequence is like this: For Websocket (ws/wss) - the client initiates the connection "GET /websockify ...... blah blah blah". The server sees the handshake and runs webSocketsHandshake to do the handshake.

For normal RFB - the server initiates the connection "RFB .....". The client sees the signature and respons with it's own signature.

The problem is that the server "breaks" if it receives a HTTP request. Also the Websocket client breaks if is receives a "RFB ...." So as I see it the only way the server can know the client type is to keep quiet until it knows for sure that the client is not a websocket client. Any tx on the socket will confuse a websocket client.

When we get to webSocketsHandshake we know for sure that it is not a RFB connection. So unless I am missing something we have to do the websocket handshake and therefore maxClientWait or rfbMaxClientWait should be ok.

That is my thoughts - I will happily add the them to the commit to clear things up.

Dec 17 '20 08:12 sgh

But can't we "pre-read" the connection and see if there is websocket stuff coming or if it's silent as an additional hint?

Dec 17 '20 13:12 matti

It is possible to peek a socket. But it still applies that if we conclude that the client is not a websocket client and we transmit "RFB ..." the client will disconnect. You can state it like this.

A RFB client will be quiet "forever" until the server sends it's signature. A websocket client will send it's HTTP websocket handshake immediately.

So - before the server "knows for sure" the connect procedure cannot continue. And we have no other option than to wait because the client type is auto detected. A much better design with be to specify the actual type of client - or have two ports open - one for wss/ws and one for normal RFB.

Dec 18 '20 06:12 sgh

A much better design with be to specify the actual type of client - or have two ports open - one for wss/ws and one for normal RFB.

Indeed. That's what probably will be done.

Dec 18 '20 08:12 bk138

Do you have any feedback in my pull request? since it now seems that we agree that specifying handshake type on the server is ok. I'm thinging that mayby 'expect_ws_handshake' should be changed to 'handshake_type' which can then be either RFB_HANDSHAKE_AUTO (default) or RFB_HANDSHAKE_WEBSOCKET

For RFB_HANDSHAKE_AUTO the current behavior will be preserved (increasing the default timeout to fx. 500) For RFB_HANDSHAKE_WEBSOCKET the long 20sec timeout will be used because we expect the client to talk first.

Does that sound alright ?

Dec 18 '20 08:12 sgh

FYI - I have updated my pull request.

Dec 21 '20 06:12 sgh