Sometimes no response from other servers to fetchSockets or serverSideEmit
My team has a multi-server deployment, and we use the Redis Streams Adapter with Valkey (we previously used Redis and saw this issue there as well) for socket communication across those servers.
We occasionally get reports from users that they are not seeing the messages of other users. When looking into one such report, I noticed that the two members of the chat were each successfully connected, but to two different servers, which points to a potential problem with the streams adapter communication between the two servers.
I am able to reliably reproduce the issue by restarting the Valkey/Redis instance: after the restart, fetchSockets and serverSideEmit calls no longer receive any responses. However, the issue definitely occurs more often than just after a restart of the Valkey/Redis instance. Unfortunately, I have not figured out the exact conditions under which it happens.
The oddest part is that even though the socket servers won't respond, they are still sending a heartbeat into the stream.
In the image below, I had set up a separate process with an interval that would simply send a ping (via serverSideEmit) into the stream for my socket servers to respond to. The servers were at first not responding, so I restarted them, thus "resetting" the streams adapter connections.
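For anyone who wants to set up a similar probe, here is a minimal sketch of what such a ping loop might look like. It is not the exact script used above. The helper names `probeCluster` and `startProbe` are hypothetical, and the sketch assumes `io` is a Socket.IO `Server` that is attached to the same streams adapter and exposes `serverSideEmit(event, ack)`.

```javascript
// Hypothetical helper: emits one "ping" to the other servers and reports the outcome.
// `io` only needs to expose serverSideEmit(event, ack), as a Socket.IO Server does.
function probeCluster(io, report) {
  const startedAt = Date.now();
  io.serverSideEmit("ping", (err, responses) => {
    report({
      ok: !err, // false when the acknowledgement came back with an error
      responses: Array.isArray(responses) ? responses.length : 0,
      elapsedMs: Date.now() - startedAt,
      error: err ? err.message : null,
    });
  });
}

// Hypothetical wiring: probe on a fixed interval; returns a function that stops the probe.
function startProbe(io, intervalMs, report) {
  const timer = setInterval(() => probeCluster(io, report), intervalMs);
  return () => clearInterval(timer);
}

// On each socket server, a matching handler would answer the ping, for example:
//   io.on("ping", (cb) => cb("pong"));
```

Logging each result with a timestamp makes it easier to line up the moment responses stop with Valkey/Redis events such as restarts or failovers.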
@lexagon3 were you able to solve this issue? I am facing a similar problem, but with Redis.
@aishwaryawambule my team has not been able to resolve the issue. We were hoping we could revert to the Redis Pub/Sub adapter, which is unfortunate because it does not have connection state recovery. Is the Redis Pub/Sub adapter the adapter you are seeing issues with?
No, I am using the Redis Streams adapter and facing this issue. Same as in your case, I cannot switch to the Redis Pub/Sub adapter because it lacks connection state recovery. Do you have any plans or ideas on how you will try to solve it?
This sounds like a serious issue. Any thoughts from maintainers? Ping @darrachequesne
I'm planning a migration from redis-adapter to redis-streams-adapter and this issue worries me... I realise this library has not reached 1.0.0, but it is considered production-ready and supported/actively maintained by the Socket.io core team, right?
@fjeldstad yes it is. I'm looking into this.
Unfortunately, I wasn't able to reproduce the issue: https://github.com/socketio/socket.io-fiddle/tree/redis-streams-adapter
I simply tested with two clients, one connected to server 1 (port 3000) and the other to server 2 (port 3001).
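As a rough illustration of how cluster-wide and local counts can be compared, here is a minimal sketch. It is not the fiddle's actual code. The helper name `countSockets` is hypothetical, and `io` is assumed to be a Socket.IO `Server` (anything exposing `fetchSockets()` and `local.fetchSockets()` would do).

```javascript
// Hypothetical helper: compares the cluster-wide socket count with the local one.
async function countSockets(io) {
  // local.fetchSockets() only looks at sockets connected to this server
  const local = (await io.local.fetchSockets()).length;
  try {
    // fetchSockets() also queries the other servers through the adapter
    const cluster = (await io.fetchSockets()).length;
    return { cluster, local };
  } catch (err) {
    // rejected when an expected response from another server is missing
    return { cluster: null, local, error: err.message };
  }
}

// Possible usage, assuming `io` exists:
//   setInterval(async () => {
//     const { cluster, local } = await countSockets(io);
//     console.log(`# of connected sockets: cluster = ${cluster} local = ${local}`);
//   }, 1000);
```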
Server log:
server listening at http://localhost:3000
connect E3ymkBhBjmRXK51zAAAB
# of connected sockets: cluster = 2 local = 1
# of connected sockets: cluster = 2 local = 1
# of connected sockets: cluster = 2 local = 1
< Redis server is stopped >
fetchSockets error
# of connected sockets: cluster = 1 local = 1
# of connected sockets: cluster = 1 local = 1
# of connected sockets: cluster = 1 local = 1
# of connected sockets: cluster = 1 local = 1
< Redis server is restarted >
# of connected sockets: cluster = 2 local = 1
# of connected sockets: cluster = 2 local = 1
# of connected sockets: cluster = 2 local = 1
# of connected sockets: cluster = 2 local = 1
[...]
The "fetchSockets error" is due to a missing response from the other server.
The other server is then evicted after 10 seconds without a heartbeat, hence the "# of connected sockets: cluster = 1 local = 1".
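To make the eviction behaviour easier to picture, here is a simplified sketch of heartbeat-based tracking. It is an illustration only, not the adapter's actual code. The class name `HeartbeatTracker` and its methods are hypothetical, and the 10-second default simply mirrors the figure mentioned above.

```javascript
// Illustrative only: tracks when each peer server was last heard from.
class HeartbeatTracker {
  constructor(timeoutMs = 10_000) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map(); // nodeId -> timestamp (ms) of the last heartbeat
  }

  // Called whenever a heartbeat from `nodeId` is read from the stream.
  recordHeartbeat(nodeId, now = Date.now()) {
    this.lastSeen.set(nodeId, now);
  }

  // Removes peers that have been silent for longer than the timeout.
  evictStale(now = Date.now()) {
    const evicted = [];
    for (const [nodeId, seenAt] of this.lastSeen) {
      if (now - seenAt > this.timeoutMs) {
        this.lastSeen.delete(nodeId);
        evicted.push(nodeId);
      }
    }
    return evicted;
  }

  // Number of peers a request such as fetchSockets would wait for.
  expectedResponses() {
    return this.lastSeen.size;
  }
}
```

Under this model, a request fails while a silent peer is still counted as alive, and succeeds again (with a smaller cluster) once that peer has been evicted. A server that keeps sending heartbeats but never answers requests would not be evicted in this model, which would be consistent with the symptom described in the original report.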
Tested with [email protected]. Which Redis client are you using? Which version? I looked into https://github.com/redis/node-redis/releases, but wasn't able to find anything related.
Thank you so much for investigating!!
We are using Valkey 8.x, and I don't remember what version of Redis we had previously been using.
I pulled your fiddle, and also wasn't able to reproduce at all. IIRC from when I was investigating previously, I rarely was able to reproduce locally (I'm pretty sure I did at least once, though). But it was reliably reproducible on a deployed instance. I'll take a look again this weekend to see if there are any other insights I can gather that would be useful to share.