hocuspocus icon indicating copy to clipboard operation
hocuspocus copied to clipboard

Server may not receive Sync message if express-ws handler yields before calling handleConnection

Open jordanstephens opened this issue 3 years ago • 2 comments

Description

Following the Installation docs for integrating Hocuspocus server with an Express.js app, I found a race condition during the process of responding to a request and setting up the connection.

My express-ws request handler looked like this:

app.ws("/path/:id", (ws, req, next) => {
  getContext().then((context) => {
    server.handleConnection(
      ws,
      req,
      req.params.id,
      context
    );
  });
});

This resulted in intermittent failure to resolve the connection attempt. Eventually awareness messages would be exchanged updating the HocuspocusProvider's status to Connected, but the document would be empty.

Eventually, I determined that the HocuspocusProvider's Sync message was not being reliably received by the Hocuspocus server. As far as I can tell, this happens because the websocket will emit the open event to the HocuspocusProvider before our express-ws route handler gets called. This sets up a race between the provider sending the first Sync message and the server setting up listeners to receive it. Success depended on how long it took for my getContext function to run.

I noticed that handleConnection sets up a queue so it can replay messages received before the listeners are established, but there is still a gap between the open event being emitted and handleConnection being called. That gap may only appear if the route handler performs async operations and yields control back to the event loop, but it is obviously fragile to depend on avoiding a context switch like this.

In my case, I think I have worked around it by building my own queue in the route handler to replay messages received while getContext runs, but perhaps a more robust solution would be to have the server initiate the sync process instead of the Provider. Currently, the Provider is unable to know when it is safe to send the first Sync message, so perhaps the first message should be the server indicating its readiness to handle messages.

Steps to reproduce the bug Steps to reproduce the behavior:

  1. Integrate Hocuspocus server into an Express.js app following the documentation
  2. Sleep in the event loop in the express-ws route handler before calling server.handleConnection
  3. Observer Hocuspocus provider failing to load document

Expected behavior I expect Hocuspocus server to be ready to receive Sync messages when HocuspocusProvider sends them.

Environment?

  • operating system: MacOS 12.6.1
  • browser: Firefox 107 and Chrome 108
  • desktop
  • hocuspocus version: v1.0.0-beta.6

Additional context

If you're interested in receiving contributions related to this issue, let me know. I'd be happy to volunteer if you can cast light on a direction worth exploring.

jordanstephens avatar Dec 07 '22 06:12 jordanstephens

hey @jordanstephens,

thanks for this! Indeed, the current implementation may cause issues in certain scenarios. We're currently planning a bigger protocol change to make yjs subdocs and multiplexing possible, I'll see how that can also solve this issue here.

janthurau avatar Jan 11 '23 17:01 janthurau

this is listed as open and I'm seeing docs that auth but don't sync, could this be the issue?

tobowers avatar Jul 28 '24 20:07 tobowers