socket.io
socket.io copied to clipboard
Memory leak - adapter.rooms.size getting bigger and bigger
Describe the bug Room size is getting bigger while there are a lot of connections. I think problem is more obvious, when server is not able to handle connections - if there is a lot of connections in the same time
Second issue is with disconnect event - not called every time when connection is closed/killed (don't know if it's bug)
To Reproduce Just copy&paste&run the code under.
Socket.IO server version: 4.1.3
(also tried on 2.2.0
)
Server
import io from 'socket.io';
import express from 'express';
const app = express();
const server = app.listen(3000);
io(server, {});
userNamespace = socketIo.of('/');
let disconnectCount = 0;
userNamespace.on('connection', async (socket) => {
const roomId = Math.round(Math.random() * 10000000000);
socket.join(`Room - ${roomId}`);
socket.on('disconnect', () => { // SECOND BUG - not called every time when connection is closed
disconnectCount++;
//console.log('Room - ${roomId} - leave');
});
});
setInterval(() => {
console.log(`Room size ${userNamespace.adapter.rooms.size}, disconnect count ${disconnectCount}`);
}, 5000);
WS client
Client for server v.4.1.3
npm install -g artillery@latest
npm install artillery-engine-socketio-v3
400 connections (rooms)
test.yaml
config:
target: "http://localhost:3000/"
phases:
- duration: 10
arrivalRate: 40
engines:
socketio-v3: {}
scenarios:
- name: "A user that just lurks"
weight: 90
engine: "socketio-v3"
flow:
- emit: ["add user"]
- think: 1
Run client
artillery run test.yaml
Important is return codes - 395x of "0" which is OK. 5x unable to connect. Client output should be something like this:
(output only for - FIRST RUN)
Summary report @ 15:31:25(+0200) 2021-08-30
Scenarios launched: 400
Scenarios completed: 395
Requests completed: 395
Mean response/sec: 31.63
Response time (msec):
min: 0.2
max: 3.8
median: 0.3
p95: 0.6
p99: 1.6
Scenario counts:
A user that just lurks: 400 (100%)
Codes:
0: 395
Errors:
Error: xhr poll error: 5
// FIRST RUN (0 >> 38) - should be zero at the end
Codes:
0: 395
Errors:
Error: xhr poll error: 5
Room size 0
Room size 6
Room size 76
Room size 178
Room size 32
Room size 74
Room size 74
Room size 74
Room size 74
Room size 46
Room size 38
// SECOND RUN (38 >> 76)
Codes:
0: 400
Errors:
-
Room size 38
Room size 96
Room size 140
Room size 123
Room size 76
// THIRD RUN (74 >> 292)
Codes:
0: 379
Errors:
Error: xhr poll error: 21
Room size 74
Room size 98
Room size 156
Room size 264
Room size 217
Room size 292
Room size 292
Room size 292
Expected behavior Room size should be 0 at the end of tests, after close all connections.
Platform:
- Device: Chrome
- OS: Windows
- Socket.io 4.1.3 (also tried on 2.2.0)
also here - #4028 via #2427
I think what you are seeing is clients that establish an HTTP long-polling connection, and then get killed. The server should properly remove them when the heartbeat fails.
The room size should return to 0 after a delay (by default, 45 seconds, which is pingInterval
+ pingTimeout
).
Documentation: https://socket.io/docs/v4/server-initialization/#pingTimeout
You can also try be manually creating the clients: https://socket.io/docs/v4/load-testing/#Manual-client-creation
I agree, but it's not happening, even after 10 minutes the room size is still the same.. (default settings - socket.io 4.2.0) Those two maps:
userNamespace.adapter.rooms.size
userNamespace.adapter.sids.size
are same size.
It works perfectly normal when there is no pressure on the server (100 conn), for example with setting:
- duration: 10
arrivalRate: 10
the output is:
Room size 0, disconnect count 100
but with more connections (400)
- duration: 10
arrivalRate: 40
the output is (from artillery):
Errors:
Error: xhr poll error: 19
Error: xhr post error: 14
engine.socketio.emit: 367
So there should be Room size 0, disconnect count 367, but the output is (after few minutes):
Room size 84, disconnect count 131
I want to add a question related to disconnection and memory leaks. Socket.io removes the listener after detecting the disconnection? Or need to manually call events.forEach(eventName => socket.removeAllListeners(eventName));
when disconnect?
@matiaslopezd the socket
object should be properly GC'ed, so there's no need to manually call removeAllListeners()
You can test it by creating a lot of clients and check the heapdump after disconnection: https://socket.io/docs/v4/load-testing/
@BlackBacker apart from getting a warning from artillery, I can't reproduce:
Warning:
CPU usage of Artillery seems to be very high (pids: 24746)
which may severely affect its performance.
See https://artillery.io/docs/faq/#high-cpu-warnings for details.
All virtual users finished
Summary report @ 07:59:56(+0200) 2021-09-07
Scenarios launched: 4000
Scenarios completed: 4000
Requests completed: 4000
Mean response/sec: NaN
Response time (msec):
min: 0
max: 2.9
median: 0.1
p95: 0.2
p99: 0.2
Scenario counts:
A user that just lurks: 4000 (100%)
Codes:
0: 4000
engine.socketio.emit: 4000 # with an arrivalRate of 400
In your case, maybe artillery is spawning some kind of zombie clients? Could you please check the size of the io.of("/").sockets
Map?
My application is proprietary but I can confirm this is happening also. Over 12 hours I'm logging 27k connections and 24k disconnections. It's eventually causing my application to crash
@JustAnotherSoftwareDeveloper could you please provide a heap dump? Or anything that could help us find the culprit?