socket.io icon indicating copy to clipboard operation
socket.io copied to clipboard

Timeout error during handshake ambiguously reported as `websocket error`; stops re-connection attempts

Open Domiii opened this issue 4 years ago • 0 comments

Describe the bug

connect_error ambigously reported as websocket error if handshake times out (e.g. due to busy client).

This bug caused me quite some headache, since I did not even consider the possibility of a timeout problem.

This is the relevant stacktrace:

connect_error: websocket error
    at logCb (code\common\src\log\logger.js:232:17)
    at logTrace (code\common\src\log\logger.js:124:5)
    at Socket.<anonymous> (code\runtime\src\client\Client.js:114:18)
    at Socket.Emitter.emit (code\node_modules\component-emitter\index.js:145:1)
    at Socket.emitReserved (code\node_modules\socket.io-client\build\typed-events.js:59:1)
    at Socket.onerror (code\node_modules\socket.io-client\build\socket.js:177:1)
    at Manager.Emitter.emit (code\node_modules\component-emitter\index.js:145:1)
    at Manager.emitReserved (code\node_modules\socket.io-client\build\typed-events.js:59:1)
    at Socket.<anonymous> (code\node_modules\socket.io-client\build\manager.js:127:1)
    at Socket.Emitter.emit (code\node_modules\component-emitter\index.js:145:1)
    at Socket.onError (code\node_modules\engine.io-client\lib\socket.js:594:1)
    at WS.Emitter.emit (code\node_modules\component-emitter\index.js:145:1)
    at WS.onError (code\node_modules\engine.io-client\lib\transport.js:32:1)
    at WebSocket.ws.onerror (code\node_modules\engine.io-client\lib\transports\websocket.js:113:1)
    at WebSocket.onError (code\node_modules\ws\lib\event-target.js:140:1)
    at WebSocket.emit (node:events:394:28)
    at null.abortHandshake (code\node_modules\ws\lib\websocket.js:731:1)
    at WebSocket.close (code\node_modules\ws\lib\websocket.js:224:1)
    at WS.doClose (code\node_modules\engine.io-client\lib\transports\websocket.js:190:1)
    at WS.close (code\node_modules\engine.io-client\lib\transport.js:57:1)
    at Socket.onClose (code\node_modules\engine.io-client\lib\socket.js:619:1)
    at null.close (code\node_modules\engine.io-client\lib\socket.js:548:1)
    at Socket.close (code\node_modules\engine.io-client\lib\socket.js:579:1)
    at Timeout.<anonymous> (code\node_modules\socket.io-client\build\manager.js:146:1)
    at listOnTimeout (node:internal/timers:557:17)
    at processTimers (node:internal/timers:500:7)

Looking at the frame third from the bottom Timeout (source here) reveals:

const timer = setTimeout(() => {
  debug("connect attempt timed out after %d", timeout);
  openSubDestroy();
  socket.close();  // this is the line that triggers the `connect_error`
  socket.emit("error", new Error("timeout"));   // this `error` never gets reported
}, timeout);

To Reproduce

Please fill the following code example:

Socket.IO server version: 4+

Server

import { Server } from "socket.io";

const io = new Server(3000, {});

io.on("connection", (socket) => {
  console.log(`connect ${socket.id}`);

  socket.on("disconnect", () => {
    console.log(`disconnect ${socket.id}`);
  });
});

Socket.IO client version: 4+

Client

WARNING: This code contains a 20s busy loop.

import { io } from "socket.io-client";

const socket = io("ws://localhost:3000/", {});

const start = Date.now()
setTimeout(() => {
  // busy loop here -> handshake was initialized but not completed within timeout
  const start = Date.now()
  const i = 0;
  while (Date.now() - start < 20 * 1000) {
    ++i;
  }
  console.log('waited', Date.now() - start);
});

socket.on("connect", () => {
  console.log(`connect ${socket.id}`);
});

socket.on("error", (err) => {
  console.trace(`error -`, err);
});

socket.on("connect_error", (err) => {
  console.trace(`connect_error -`, err);
});

socket.on("disconnect", () => {
  console.log("disconnect");
});

Expected behavior

The connect_error should convey the timeout error message, or at least the error handler should trigger with the timeout message afterwards.

As shown above, this requires fixing the timeout handler.

Platform:

  • Device: PC
  • OS: Windows 10

Workaround

A naive way of working around this bug is to set client timeout option to a very high value (e.g. 1e6 or some rough estimate of the max time your client is busy for).

Domiii avatar Aug 24 '21 07:08 Domiii