Possible websocket memory leak
I use a websocket to send large amounts of data to Android at a high frequency. After tornado has been running for a long time, it is killed by the system. What can I do?
It sounds like the server wasn't actually able to send all that data to the client, so most of what you sent was queued up in the server process memory. When the process used up too much memory for the system, it was killed (by the "OOM killer"). You need a way to pause while waiting for what you've already sent to actually get out of the server process - you can do that via await (or yield) of the write_message() function, which returns a Future.
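To make the queueing effect concrete, here is a minimal simulation using only the standard library's asyncio. SlowConnection and its write_message are hypothetical stand-ins, not Tornado classes; the only property borrowed from Tornado is that the write call returns something awaitable that completes once the data has left the process. The sketch compares peak buffered bytes when the sender ignores that awaitable with the peak when it waits on each one.

```python
import asyncio


class SlowConnection:
    """Hypothetical stand-in for a slow client; not a Tornado class."""

    def __init__(self, drain_delay=0.001):
        self.drain_delay = drain_delay
        self.buffered = 0        # bytes queued but not yet "on the wire"
        self.peak_buffered = 0

    def write_message(self, payload):
        # Queue the payload immediately and return an awaitable that
        # completes once the payload has drained.
        self.buffered += len(payload)
        self.peak_buffered = max(self.peak_buffered, self.buffered)
        return asyncio.ensure_future(self._drain(len(payload)))

    async def _drain(self, size):
        await asyncio.sleep(self.drain_delay)
        self.buffered -= size


async def send_without_waiting(conn, payload, count):
    # Fire off every write at once; nothing slows the producer down.
    pending = [conn.write_message(payload) for _ in range(count)]
    await asyncio.gather(*pending)   # only so the demo finishes cleanly
    return conn.peak_buffered


async def send_with_waiting(conn, payload, count):
    # Wait for each write to drain before queueing the next one.
    for _ in range(count):
        await conn.write_message(payload)
    return conn.peak_buffered


def run_demo(count=50, size=1000):
    payload = b"x" * size
    unthrottled = asyncio.run(send_without_waiting(SlowConnection(), payload, count))
    throttled = asyncio.run(send_with_waiting(SlowConnection(), payload, count))
    return unthrottled, throttled
```

In this model the unthrottled sender buffers every payload at once, while the waiting sender never holds more than one. A real connection drains in order and at network speed, so actual numbers will differ.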
Thanks! But I do use yield, and it is still killed.
@gen.coroutine
def send_message(self, message):
    yield self.write_message(message, True)
The client receives all the data from tornado. Even when the frequency is low, it is still killed (OOM).
Are you also using yield in whatever function is calling send_message? You have to use yield at every level to get appropriate flow control.
But if the client is getting all the data, flow control is probably not your problem. It sounds like there's a memory leak somewhere. Use a heap profiler to find it.
Also try turning on periodic pings. Especially with mobile clients, you may have clients that have gone away without cleanly closing their connection, and periodic pings will help your server identify and close those connections faster.
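For reference, periodic pings are enabled through application settings. The fragment below is a sketch: websocket_ping_interval and websocket_ping_timeout are Tornado application settings, but the numeric values and the PING_SETTINGS name are illustrative assumptions, and the dict is meant to be passed as keyword arguments to tornado.web.Application.

```python
# Keyword settings intended for tornado.web.Application(handlers, **PING_SETTINGS).
# The values are illustrative assumptions, not recommendations.
PING_SETTINGS = {
    "websocket_ping_interval": 30,  # seconds between server-initiated pings
    "websocket_ping_timeout": 20,   # close the connection if no pong arrives in this many seconds
}
```

The timeout is kept no longer than the interval here so that each ping is resolved before the next one is due.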
Thanks for the reply! My project uses a websocket to send data, and I have turned on periodic pings, so the connection stays open, which is what the project needs. I checked with a memory profiler and found that every call to the websocket send_message function uses more memory. Is the growth related to that function?
I can't answer that from the little information you've provided. You'll have to learn how to use your profiling tool to answer that question yourself.
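One way to narrow down where memory is retained is to compare two heap snapshots. The sketch below uses the standard library's tracemalloc, which requires Python 3.4 or later and so is not available on Python 2.7. top_growth and leaky_workload are hypothetical names for illustration; the leak is simulated with a module-level list that only grows.

```python
import tracemalloc


def top_growth(workload, limit=3):
    """Run workload() and return the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    workload()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Each StatisticDiff carries size_diff (bytes) and the allocating traceback;
    # compare_to() sorts the largest changes first.
    return after.compare_to(before, "lineno")[:limit]


# Hypothetical leak: a module-level list that is never cleared.
_retained = []


def leaky_workload():
    for _ in range(1000):
        _retained.append(bytearray(1024))


if __name__ == "__main__":
    for stat in top_growth(leaky_workload):
        print(stat)
```

In a server, the two snapshots would be taken some minutes apart while traffic is flowing, and the lines at the top of the diff are the first places to inspect.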
I wrote an example that is similar to my project.
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on Jun 21, 2018
@author: eaibot
'''
from tornado.websocket import WebSocketHandler
from memory_profiler import profile
from zips import ZipUtils
import yaml,json
from threading import Lock

class WsHandler(WebSocketHandler):
    client_id=0
    clients_connected=0
    wsClients = set()
    cmdWsId = ""

    def __init__(self , application, request, **kwargs):
        super(WsHandler, self).__init__(application, request, **kwargs)
        self.zipUtils=ZipUtils()
        self.handler_lock = Lock()

    def __del__(self):
        pass

    def open(self):
        cls = self.__class__
        cls.wsClients.add(self)
        try:
            cls.client_id +=1
            cls.clients_connected +=1
        except Exception as exc:
            print("Unable to connect , reason : %s"%exc)
        print("client %d connected ..."%(cls.client_id))

    def on_message(self, message):
        cls = self.__class__
        self.reqMsg=yaml.safe_load(message)
        self.reqType=self.reqMsg.get("type")
        self.doWithData(self.reqType)

    @profile
    def doWithData(self, type):
        cls = self.__class__
        count=0
        while(True):
            count=count+1
            if cls.clients_connected!=0 and count==30:
                count=0
                self.send_message(type)
            else:
                break

    def on_close(self):
        cls = self.__class__
        cls.wsClients.remove(self)
        cls.clients_connected -= 1
        cls.client_id -=1
        print("client disconnected, %d clients still linked ..."%(cls.clients_connected))

    def check_origin(self, origin):
        return True

    @profile
    def send_message(self, req_type):
        try:
            self.returnJson = {"type":req_type, "result": "error: please start up ros service firstly."}
            self.jsonStr = json.dumps(self.returnJson)
            with self.handler_lock:
                self.write_message(self.jsonStr, True)
        except Exception as exc :
            print("%s : %s"%(req_type , exc))
With memory_profiler, I found that the send_message function uses more memory on each call. I can't find the cause. Can you help me? In my project the sending frequency is not very high, but memory is still consumed quickly.
I don't think that code will ever call send_message() because the only caller is:
@profile
def doWithData(self, type):
    cls = self.__class__
    count=0
    while(True):
        count=count+1
        if cls.clients_connected!=0 and count==30:
            count=0
            self.send_message(type)
        else:
            break
- count = 0
- the loop starts
- count = 1
- the if condition count == 30 is false
- the else body is run, break, which exits the loop
- function ends
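The trace above can be checked by replaying the loop's control flow outside the handler. This is a sketch: count_sends is a hypothetical helper, the send is replaced by a counter, and the while loop is bounded so the function always returns.

```python
def count_sends(clients_connected, max_iterations=1000):
    """Replay the loop's control flow and count how often it would send."""
    sends = 0
    count = 0
    for _ in range(max_iterations):   # bounded stand-in for `while True`
        count += 1
        if clients_connected != 0 and count == 30:
            count = 0
            sends += 1                # stands in for self.send_message(type)
        else:
            break                     # taken on the very first pass, when count == 1
    return sends
```

Whatever the number of connected clients, the function returns 0, because the else branch fires before count can reach 30.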
Oh, sorry! I uploaded the wrong code. The right code is:
@profile
def doWithData(self, type):
    cls = self.__class__
    count=0
    while(True):
        count=count+1
        if count==30:
            count=0
            self.send_message(type)
        elif cls.clients_connected==0:
            break
Python is able to count up to 30 in a fraction of a millisecond. That loop is calling send_message() basically as fast as it can, and never giving any time for the ioloop to process events (like other clients connecting).
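The starvation can be reproduced without Tornado. The following sketch uses the standard library's asyncio; heartbeat, busy_sender, and cooperative_sender are hypothetical names, and the busy loop is bounded so the demo terminates. It counts how often a background task gets to run while each kind of sender is active.

```python
import asyncio


async def heartbeat(ticks):
    # Stands in for any other work the event loop should be doing.
    while True:
        ticks.append(1)
        await asyncio.sleep(0)


def busy_sender(iterations):
    # Same shape as the corrected loop, bounded so the demo terminates.
    sends = 0
    count = 0
    for _ in range(iterations):
        count += 1
        if count == 30:
            count = 0
            sends += 1
    return sends


async def cooperative_sender(iterations):
    sends = 0
    for _ in range(iterations // 30):
        sends += 1
        await asyncio.sleep(0)   # hand control back to the event loop
    return sends


async def demo(iterations=3000):
    ticks = []
    task = asyncio.ensure_future(heartbeat(ticks))
    await asyncio.sleep(0)               # let the heartbeat start
    start = len(ticks)
    blocking_sends = busy_sender(iterations)
    ticks_during_blocking = len(ticks) - start
    start = len(ticks)
    cooperative_sends = await cooperative_sender(iterations)
    ticks_during_cooperative = len(ticks) - start
    task.cancel()
    return (blocking_sends, ticks_during_blocking,
            cooperative_sends, ticks_during_cooperative)
```

Both senders perform the same number of sends, but the heartbeat records no ticks while the synchronous loop runs. In the handler, that means no other connections, pings, or socket writes are serviced until the loop exits.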
Consider this (assuming python3.5 or later):
def on_message(self, message):
    self.reqMsg=yaml.safe_load(message)
    self.reqType=self.reqMsg.get("type")
    ioloop.IOLoop.current().spawn_callback(self.doWithData, self.reqType)

async def doWithData(self, type):
    while self in self.__class__.wsClients:
        await gen.sleep(0.2)
        await self.send_message(type)

async def send_message(self, req_type):
    try:
        self.returnJson = {"type":req_type, "result": "error: please start up ros service firstly."}
        self.jsonStr = json.dumps(self.returnJson)
        with self.handler_lock:
            await self.write_message(self.jsonStr, True)
    except Exception as exc :
        print("%s : %s"%(req_type , exc))
Thanks for the reply! My project runs only on Python 2.7, so I have only tested on Python 2.7. I found that memory management in Python 2.7 is not as good as in Python 3, for example when processing file streams or in interfaces that use lists. I will test with Python 3 next.
For python2.7 you can translate my example code by replacing async and await with @coroutine and yield respectively
Yes, I ran the code on Python 2.7 with @coroutine and yield, but the result is the same: every call uses more memory. So for now I have only reduced the frequency.
With the above code, using python 2.7.15 and tornado 5.1.1, I get a constant 19 MiB of RSS memory usage. I connect one websocket client, send one message, get a constant stream of messages from the server for a minute or so, disconnect, reconnect, repeat.
(Final form: https://gist.github.com/ploxiln/18f0d53ac629604b088a404d69156aed)
You may be confused by the memory profiler: some memory is allocated during each send_message(), but should be freed after the ioloop is able to complete the send, which happens in a different context, while this asynchronous coroutine is yielding. (I'm also suspicious of how @profile interacts with @coroutine or async.)
I set 'websocket_ping_interval':50 to send pings, so the websocket does not disconnect. Even without the memory profiler, free -h in a shell terminal shows memory usage growing.
I just had to downgrade from 5.1.1 to 4.5.3, because calling write_message ended up claiming memory that was never returned.
I tested this with 5.1.1 by commenting out the write_message line, and the memory would remain the same. Removing the comment and it would increase. Doing the same test with 4.5.3 resulted in no memory increase.
@jbwdevries Oh, thanks! I changed the version in my project as you said and got the same result: 5.X.X increases memory, 4.X.X doesn't. @bdarnell @ploxiln, I hope this can be fixed, thank you!
@3shao and @jbwdevries , can you provide a sample program that demonstrates this issue? As @ploxiln 's code from Oct 19 shows, not all programs that use websockets result in this memory growth.
@bdarnell Here is a sample program: https://gist.github.com/3shao/178cba6ff29b7f865a39e29f649f3b95
And what about the client side?
In that program you're spawning a new doWithData task every time you receive a message. That task contains a loop, so after the first on_message you'll start sending 5 messages every second, after the second on_message you'll send 10 per second, etc. This will consume increasing amounts of memory unless your clients only send a single message per connection. It looks like you probably meant to spawn doWithData in open instead of on_message.
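The difference between spawning in on_message and spawning in open can be sketched with asyncio alone. Connection below is a hypothetical model of one handler, not a Tornado class, and the sleep stands in for the sleep-then-write loop from the sample program.

```python
import asyncio


class Connection:
    """Hypothetical model of one websocket handler; not a Tornado class."""

    def __init__(self, spawn_in_open):
        self.spawn_in_open = spawn_in_open
        self.is_open = False
        self.sender_tasks = []

    def _spawn_sender(self):
        self.sender_tasks.append(asyncio.ensure_future(self.do_with_data()))

    def open(self):
        self.is_open = True
        if self.spawn_in_open:
            self._spawn_sender()          # one loop per connection

    def on_message(self):
        if not self.spawn_in_open:
            self._spawn_sender()          # one more loop per received message

    async def do_with_data(self):
        while self.is_open:
            await asyncio.sleep(0.01)     # stands in for sleep + write_message

    async def close(self):
        self.is_open = False
        await asyncio.gather(*self.sender_tasks)


async def count_sender_loops(spawn_in_open, messages=5):
    conn = Connection(spawn_in_open)
    conn.open()
    for _ in range(messages):
        conn.on_message()
    await asyncio.sleep(0.03)
    running = sum(not task.done() for task in conn.sender_tasks)
    await conn.close()
    return running
```

After five incoming messages, the on_message variant has five sender loops running against the same connection, while the open variant has one.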
With that sample program there is only one connection from the client side. With tornado 5.1.1, memory usage increases after some time, but with 4.5.3 it doesn't. In my real project it eventually uses all the memory, so the system kills the program.
@3shao I have the same issue. The key to the memory leak is that write_message is called from a different thread or from inside a ThreadPoolExecutor. That's the root cause. @3shao, what memory profiler did you use? Indeed, downgrading to tornado 4.5.3 solves the whole issue.
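If write_message is indeed being called from a worker thread, the usual remedy is to hand the data to the event loop's thread and write from there. Tornado's documentation describes IOLoop.add_callback as the only IOLoop method that is safe to call from other threads. The sketch below shows the same handoff with the standard library's asyncio, where loop.call_soon_threadsafe plays that role. The names producer and outbox are assumptions for illustration, and the actual write is replaced by appending to a list.

```python
import asyncio
import threading


async def main(item_count=5):
    loop = asyncio.get_running_loop()
    outbox = asyncio.Queue()
    sent = []

    def producer():
        # Runs in a worker thread: never touch the connection here.
        for i in range(item_count):
            # Thread-safe handoff to the event loop's thread.
            loop.call_soon_threadsafe(outbox.put_nowait, "frame-%d" % i)
        loop.call_soon_threadsafe(outbox.put_nowait, None)   # end marker

    worker = threading.Thread(target=producer)
    worker.start()

    sender_thread_ids = set()
    while True:
        item = await outbox.get()
        if item is None:
            break
        # In a real handler, `await self.write_message(item)` would go here.
        sent.append(item)
        sender_thread_ids.add(threading.get_ident())

    worker.join()
    return sent, sender_thread_ids == {threading.get_ident()}
```

Every simulated write happens on the event loop's thread, and awaiting each write there also restores flow control.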
I also encountered this problem. Environment: Python3.7, tornado==6.2.0. When write_message() is called from a callback, memory keeps growing, and the websocket process is killed (OOM).
My use case is sending a large number of images to the front end for rendering. The problem occurs when an image is larger than 1M; with small images it's fine.
I finally solved this problem by using yield in every function that calls write_message. I hope my code is helpful to you.
def run(self):
    main_loop = tornado.ioloop.IOLoop.instance()
    main_loop.add_callback(self.send_message_task)
    main_loop.start()

@tornado.gen.coroutine
def send_message_task(self):
    while not self.exit_flag:
        nxt = tornado.gen.sleep(0.01)
        yield self.send_message()
        yield nxt

@tornado.gen.coroutine
def send_message(self):
    try:
        data = {
            "task_code": "A0001",
            "message": "Hello World"
        }
        for client in WebSocketHandler.connections.copy():
            if data["task_code"] == client.task_code:
                yield client.write_message(data, True)
    except Exception as e:
        log.error("client push websocket error : {}".format(e))