stream-chat-js icon indicating copy to clipboard operation
stream-chat-js copied to clipboard

Race condition: Closing a connection can lead to hanging promises as methods continue to await a web socket which has already been closed

Open mfbx9da4 opened this issue 2 years ago • 1 comments

In multiple methods ie client.search(), client.queryChannels() and client.queryUsers(), the client waits for the web socket connection like so

await this.wsPromise

If await this.wsPromise is awaiting and some other code closes the connection at the right time, the client method, eg client.queryChannels, will hang forever. The promise neither resolves nor rejects. Even when a new connection is opened it doesn't resolve the original promise because a new promise is created and assigned to this.wsPromise. This hanging promise is especially problematic for us, as calls to queryChannels are in a queue and if one call doesn't resolve, effectively the whole app is in a bad state, and the list of channels won't update until the user force quits the app.

Reproduction

I have created a minimal reproduction demo to demonstrate the problem in action. The demo also includes a proposed fix.

To manually reproduce the bug, these steps must occur:

  1. One of the above listed methods, eg client.queryChannels, calls await this.wsPromise while the websocket is connecting
  2. Shortly after closeConnection() is called. For example, in our app, we call closeConnection whenever the React Native app transitions into background mode. The timing of calling closeConnection() is very important to reproduce this race condition. It must be called right after the new Websocket(url) is created and before onmessage is fired. This is more likely to occur with poor network conditions as it takes longer for the connection to be established.
  3. The await this.wsPromise from step 1 is not rejected at this point, instead it continues waiting until there is an active connection. I think this makes sense as a behaviour. The alternative would be to reject the promise. I think it's fine though to keep the promise waiting for an active connection as this will lead to the best DX/UX, IMO.
  4. A new connection is opened, eg once the app is reopened
  5. Instead of resolving the original promise a new promise is created and the client.queryChannels call is left hanging forever.

Solution

To keep the spirit of the intended original await this.wsPromise code and produce the least code changes to your codebase you can turn wsPromise into a getter on StreamChat. The getter will construct a new promise and it will resolve whenever there is a healthy websocket connection.

  get wsPromise() {
    const isConnected = (this.wsConnection?.isHealthy || this.wsFallback?.isHealthy()) && this._hasConnectionID();
    // Already connected so resolve
    if (isConnected) return Promise.resolve();

    return new Promise<void>((resolve) => {
      const { unsubscribe } = this.on('connection.changed', (event) => {
        if (event.online) {
          unsubscribe();
          resolve();
        }
      });
    });
  }

Let me know if you have any questions

mfbx9da4 avatar May 12 '23 12:05 mfbx9da4