graphql-redis-subscriptions icon indicating copy to clipboard operation
graphql-redis-subscriptions copied to clipboard

Race condition with simultaneous subscribe + quick unsubscribe

Open mlaflamm opened this issue 1 year ago • 8 comments

We ran into a tricky race condition when RedisPubSub is handling multiple simultaneous subscribe calls on the same channel. We have a client that sometimes initiates multiple subscribe and quickly unsubscribes to them to keep only the last subscription active (thank you react hook!). This led us to discover the race condition I am reporting here.

In the problematic scenario, no messages are propagated to the remaining subscription and an error There is no subscription of id "*" is raised when the client finally unsubscribes from that remaining subscription. That error sometimes crashes our server. I suspect the crash is related to error handling (or lack thereof) in graphql-subscriptions or gqphql-ws modules as it happens only when the websocket is closed abruptly by the client (e.g. closing the browser).

After intensive debugging sessions, I figured out the exact flow causing the race condition. Here is my attempt to summarize it.

  • Subscribe 0 No active sub. Initialize subsRefsMap, a map of Set tracking sub ids by channel. Capture a reference to the Set and return a new promise that initiates a new redis subscribe call.
  • Subscribe 1 Subscribe 0 redis call is not completed yet. Capture a reference to the same Set of sub ids and return a new promise that initiates a new redis subscribe call.
  • Subscribe 0 completes (redis callback) Add sub id to the Set and resolve the promise. The Set only contains sub id 0.
  • Unsubscribe 0 Lookup Set from subsRefsMap for the channel. The Set contains only sub id 0. The code assumes this is the last active subscription. Fire-and-forget unsubscribe call to redis and remove the Set from subsRefsMap.
  • Subscribe 1 completes (redis callback) Add sub id to the Set and resolve the promise. The Set is not referenced by subsRefsMap anymore.
  • OnMessage Channel messages are not delivered to sub 1 because the sub id is not referenced by subsRefsMap anymore. Unsubscribe 0 is likely to also prevent message delivery.
  • Unsubscribe 1 An error There is no subscription of id "1" is raised if no other subscriptions are active on the channel.

I'll submit a PR with a fix.

mlaflamm avatar Mar 20 '23 13:03 mlaflamm

We're seeing this sometimes as well. I've not been successful in reproducing it, so thanks for doing all the hard work.

jstaro avatar Mar 27 '23 08:03 jstaro

Also seeing this. Its kinda bad that its impossible to fix this from the app perspective, as it happens deeply in the library which results in an uncaught exception and thus a server restart.

Meemaw avatar Apr 04 '23 06:04 Meemaw

Following here 🙏

jalagar avatar Apr 05 '23 15:04 jalagar

@davidyaha Would be great if the PR by @mlaflamm could get merged into a new release 🙏🏽

pantajoe avatar May 04 '23 07:05 pantajoe

Could we have an update on this issue? @mlaflamm did you manage to craft a PR? If not do you want to share your thoughts on how would you fix it to see if I can pick it up?

trixobird avatar Jun 28 '23 09:06 trixobird

@trixobird There is an open PR (linked to in this issue). FWIW, we've been running this PR in production and it solved our issue.

jstaro avatar Jun 29 '23 07:06 jstaro

we also encountered this error in production. unlclear if this is the same use case, but would love to see this PR merged.

chkp-talron avatar Jul 13 '23 07:07 chkp-talron

having the same problem

BrainEno avatar Nov 08 '23 09:11 BrainEno

Fixed with #599

davidyaha avatar May 02 '24 07:05 davidyaha