High RAM memory usage
Hello, I've tested multiple websocket libraries in Python to create clients. However, each time I create a websocket client, it allocates ~1.5MB of memory. Is this behaviour expected?
Running this code bumps memory from ~50MB to ~65MB on my machine (as reported by Task Manager).
import asyncio
import websockets
async def main():
    async def startWebsocket():
        async with websockets.connect("wss://echo.websocket.org") as ws:
            await asyncio.sleep(99999999)
    tasks = []
    for _ in range(10):
        tasks.append(startWebsocket())
    await asyncio.gather(*tasks)
asyncio.run(main())
This code keeps memory usage at ~50MB:
import asyncio
import websockets
async def main():
    await asyncio.sleep(99999)
asyncio.run(main())
import websockets barely imports anything because websockets includes an automatic, on-demand import mechanism to optimize memory usage. Accessing websockets.connect triggers the import of the asyncio client and all its dependencies. I suspect that the extra 15MB come (mostly) from there.
Can you try adding from websockets import connect to the second example and see if that lands you around 65MB?
I benchmarked memory use carefully so I'm quite confident that websockets is at least as sane as most other Python libraries. I expect each connection to use about 14kB of RAM as echo.websocket.org doesn't enable compression (I checked).
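For readers unfamiliar with on-demand imports, here is a minimal sketch of the general technique, using a module-level `__getattr__` hook. It is an illustration only, not the actual websockets implementation; `make_lazy_module` and `lazy_demo` are hypothetical names, and `json.dumps` stands in for an expensive dependency.

```python
import importlib
import types


def make_lazy_module(name, lazy_attrs):
    """Build a module whose attributes are imported on first access.

    lazy_attrs maps an attribute name to (source module name, source attribute).
    Hypothetical helper for illustration; not part of websockets.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            source_module, source_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(source_module), source_attr)
        setattr(module, attr, value)  # cache so the hook is not called again
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", {"dumps": ("json", "dumps")})
before = "dumps" in vars(lazy)   # nothing has been resolved yet
encoded = lazy.dumps([1, 2])     # first access triggers the real import
after = "dumps" in vars(lazy)    # the resolved attribute is now cached
print(before, encoded, after)
```

The cost of the import is paid on first access rather than at `import` time, which is why a bare import of such a package can look almost free in a memory benchmark.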
@aaugustin Thank you for the reply. I have actually tested with more websockets as well. For example, this code bumps memory to ~161MB (which is still roughly the same ~1.1MB per websocket):
import asyncio
import websockets
import random
async def main():
    async def startWebsocket():
        await asyncio.sleep(random.randint(0, 20))  # add some delay so I don't get timed out connecting to echo.websocket.org
        async with websockets.connect("wss://echo.websocket.org") as ws:
            await asyncio.sleep(99999999)
    tasks = []
    for _ in range(100):
        tasks.append(startWebsocket())
    await asyncio.gather(*tasks)
asyncio.run(main())
Then I tried changing the range of the for loop: for _ in range(200) yields ~264MB, and so on. So memory usage grows linearly with the number of websockets (~1MB per websocket).
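As a quick check of the "~1MB per websocket" estimate, the two figures reported above (100 connections at ~161MB, 200 connections at ~264MB) can be fitted with a straight line. This is only arithmetic on the Task Manager numbers quoted in this thread, so it inherits their imprecision.

```python
# connections -> process memory in MB, as reported above
measurements = {100: 161, 200: 264}

(n1, m1), (n2, m2) = sorted(measurements.items())

# Slope of the line through the two points: memory added per connection.
per_connection_mb = (m2 - m1) / (n2 - n1)

# Intercept: memory the process would use with zero connections.
baseline_mb = m1 - per_connection_mb * n1

print(f"{per_connection_mb:.2f} MB per connection, {baseline_mb:.0f} MB baseline")
# prints "1.03 MB per connection, 58 MB baseline"
```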
I believe a websocket should take no more than a few kB, as you pointed out. I tried removing all unnecessary imports and working only with the code I provided above, and the result is the same. Initially, the Python program takes ~10MB before any websockets are created, and after creating 100 websockets it grows by more than 100MB.
I also tried to inspect if the problem is with asyncio for some reason. And this code keeps memory at ~10MB so asyncio works perfectly fine.
import asyncio
async def main():
    async def startWebsocket():
        await asyncio.sleep(500)
    tasks = []
    for _ in range(100):
        tasks.append(startWebsocket())
    await asyncio.gather(*tasks)
asyncio.run(main())
OK, that's unexpected.
Note that benchmarking with task manager is quite crude: it tells you how much memory the system gave to Python, but not how much Python is actually using. To get reliable results, usually:
- I run gc.collect() before measuring memory usage to make sure I measure only live objects;
- I look at the exact memory allocations with tracemalloc; that's where the 14kB figure comes from.
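The two steps above can be combined into a small helper. This is a minimal sketch using only the standard library; `traced_growth` is a hypothetical name, and the list of `bytes` objects is just a stand-in workload.

```python
import gc
import tracemalloc


def traced_growth(allocate):
    """Return (result, bytes added) for the objects created by allocate().

    Hypothetical helper: collects garbage before each reading so that only
    live objects are counted, then reads tracemalloc's "current" counter.
    """
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    result = allocate()  # returned to the caller so the objects stay alive
    gc.collect()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, after - before


# Stand-in workload: 100 objects of about 1 KiB each.
chunks, growth = traced_growth(lambda: [bytes(1024) for _ in range(100)])
print(f"{len(chunks)} objects, {growth / 1024:.1f} KiB traced")
```

Unlike the figure shown by a task manager, this counts only memory that Python allocated and still holds while tracing is active.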
However, I'm not confident that's the explanation for a problem of the order of magnitude that you're reporting. It looks more like a large buffer gets allocated for each connection.
What OS are you running?
Here's a version instrumented with tracemalloc:
import asyncio
import random
import tracemalloc
from websockets import connect
async def main():
    async def startWebsocket():
        await asyncio.sleep(random.randint(0, 20))  # add some delay so I don't get timed out connecting to echo.websocket.org
        async with connect("wss://echo.websocket.org") as ws:
            await asyncio.sleep(99999999)
    tasks = []
    tracemalloc.start()
    snapshot1 = tracemalloc.take_snapshot()
    print("current, peak = ", tracemalloc.get_traced_memory())
    for _ in range(100):
        tasks.append(asyncio.create_task(startWebsocket()))
    await asyncio.sleep(22)  # wait until all tasks are started
    snapshot2 = tracemalloc.take_snapshot()
    print("current, peak = ", tracemalloc.get_traced_memory())
    stats = snapshot2.compare_to(snapshot1, 'lineno')
    for stat in stats[:10]:
        print(stat)
    await asyncio.gather(*tasks)
asyncio.run(main())
Here's what I'm getting on macOS:
current, peak = (656, 656)
current, peak = (28778742, 29040338)
/Users/myk/.pyenv/versions/3.13.1/lib/python3.13/asyncio/sslproto.py:278: size=25.0 MiB (+25.0 MiB), count=200 (+200), average=128 KiB
/Users/myk/dev/websockets/src/websockets/datastructures.py:110: size=231 KiB (+231 KiB), count=4381 (+4381), average=54 B
/Users/myk/.pyenv/versions/3.13.1/lib/python3.13/ssl.py:886: size=181 KiB (+181 KiB), count=3104 (+3104), average=60 B
/Users/myk/.pyenv/versions/3.13.1/lib/python3.13/asyncio/sslproto.py:347: size=155 KiB (+155 KiB), count=200 (+200), average=792 B
<frozen importlib._bootstrap_external>:784: size=93.4 KiB (+93.4 KiB), count=1316 (+1316), average=73 B
/Users/myk/dev/websockets/src/websockets/datastructures.py:111: size=83.8 KiB (+83.8 KiB), count=1503 (+1503), average=57 B
/Users/myk/dev/websockets/src/websockets/asyncio/messages.py:32: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
/Users/myk/dev/websockets/src/websockets/asyncio/connection.py:136: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
/Users/myk/.pyenv/versions/3.13.1/lib/python3.13/asyncio/sslproto.py:309: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
/Users/myk/.pyenv/versions/3.13.1/lib/python3.13/asyncio/selector_events.py:788: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
It looks like TLS is expensive: it adds 25MB on my machine; likely this is caused by allocating buffers. That's 90% of the memory usage — and I didn't expect it as I hadn't benchmarked with TLS.
Also I don't know why we're getting 200 allocations for many objects when we should have 100. That isn't the core of the problem here though.
Can you try running the instrumented version and post your results?
If you want to benchmark websockets vs. asyncio, you could open TLS connections to echo.websocket.org; probably you would get similar memory usage to what you're getting with websockets.
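A rough sketch of that comparison, using plain asyncio streams with TLS and no websockets involved. `measure_tls_connections` is a hypothetical helper; the host passed to it, the port, and the connection count are whatever you choose to test with, and the result depends on the Python version and platform.

```python
import asyncio
import gc
import ssl
import tracemalloc


async def measure_tls_connections(host, count, port=443):
    """Open `count` idle TLS connections and return the traced bytes they add."""
    context = ssl.create_default_context()
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    # Each open_connection() call performs the TCP connect and TLS handshake.
    connections = await asyncio.gather(
        *(asyncio.open_connection(host, port, ssl=context) for _ in range(count))
    )
    gc.collect()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # Close everything; ignore errors raised while shutting down TLS.
    for _reader, writer in connections:
        writer.close()
    await asyncio.gather(
        *(writer.wait_closed() for _reader, writer in connections),
        return_exceptions=True,
    )
    return after - before
```

For example, `asyncio.run(measure_tls_connections("echo.websocket.org", 100))` would return the number of bytes that tracemalloc attributes to 100 idle TLS connections, which can then be set against the websockets figures above.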
@aaugustin Confirming your findings: I got very similar results. The TLS connection is the source of the high memory usage. If we use wss rather than ws, the memory grows as I described. Feel free to test with, for example, this ws server: ws://websocket-echo-intustack.koyeb.app. With it, there is no +25.0 MiB for SSL with 100 websocket connections, and no +50.0 MiB for SSL with 200 websocket connections.
This "issue" is not specific to this library. Memory grows rapidly even when using https rather than http with any popular request library when you create a request session and don't close it.
I have also tested other websocket libraries, and when I use wss instead of ws, the memory grows rapidly. I guess it is the price we have to pay for a TLS connection.
I consider this issue resolved, thank you for your help @aaugustin :)
Taking a fresh look at the tracemalloc output this morning, I located the culprit.
asyncio's SSLProtocol pre-allocates a read buffer: https://github.com/python/cpython/blob/46cbdf967ada11b0286060488b61635fd6a2bb23/Lib/asyncio/sslproto.py#L278
The default size of that buffer is 256kB: https://github.com/python/cpython/blob/46cbdf967ada11b0286060488b61635fd6a2bb23/Lib/asyncio/sslproto.py#L264
256kB / connection explains the 25MiB for 100 connections that you observed.
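The arithmetic can be checked with stand-in buffers. This sketch does not touch asyncio; it only allocates one 256 KiB `bytearray` per simulated connection, assuming that size matches the read buffer described above.

```python
import sys

BUFFER_SIZE = 256 * 1024  # bytes per connection, per the default cited above
CONNECTIONS = 100

# Expected total if every connection holds exactly one such buffer.
expected_mib = BUFFER_SIZE * CONNECTIONS / 2**20

# Stand-in allocation: one bytearray per simulated connection.
buffers = [bytearray(BUFFER_SIZE) for _ in range(CONNECTIONS)]
actual_mib = sum(sys.getsizeof(buffer) for buffer in buffers) / 2**20

print(f"expected {expected_mib:.1f} MiB, allocated {actual_mib:.2f} MiB")
```

The expected total is 25.0 MiB, which matches the size reported for sslproto.py:278 in the tracemalloc output.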
This warrants updating the documentation of memory usage. Tagging as a documentation issue.
Memory usage is rarely a concern for clients; usually they open 1 connection :-)
For servers, you can avoid this cost by terminating TLS in a reverse-proxy, before reaching Python.
One more thing I noticed on Windows is the default use of proactor_events, which causes 32.0 KiB of additional memory allocation per connection.
By contrast, macOS uses selector_events, which causes only 380 B of additional memory allocation per connection.
Here are my results on Windows for 100 connections:
current, peak = (624, 624)
current, peak = (35179612, 35442416)
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:278: size=25.0 MiB (+25.0 MiB), count=200 (+200), average=128 KiB
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\proactor_events.py:191: size=6406 KiB (+6406 KiB), count=200 (+200), average=32.0 KiB
C:\Files\venv\Lib\site-packages\websockets\datastructures.py:110: size=221 KiB (+221 KiB), count=4181 (+4181), average=54 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\ssl.py:877: size=181 KiB (+181 KiB), count=3103 (+3103), average=60 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:347: size=155 KiB (+155 KiB), count=200 (+200), average=792 B
<frozen importlib._bootstrap_external>:753: size=85.8 KiB (+85.8 KiB), count=1299 (+1299), average=68 B
C:\Files\venv\Lib\site-packages\websockets\datastructures.py:111: size=79.6 KiB (+79.6 KiB), count=1429 (+1429), average=57 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\messages.py:32: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\connection.py:134: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:309: size=74.2 KiB (+74.2 KiB), count=200 (+200), average=380 B
And these are the results on Windows for 200 connections:
current, peak = (624, 624)
current, peak = (70419144, 70680884)
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:278: size=50.0 MiB (+50.0 MiB), count=400 (+400), average=128 KiB
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\proactor_events.py:191: size=12.5 MiB (+12.5 MiB), count=400 (+400), average=32.0 KiB
C:\Files\venv\Lib\site-packages\websockets\datastructures.py:110: size=458 KiB (+458 KiB), count=8672 (+8672), average=54 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\ssl.py:877: size=362 KiB (+362 KiB), count=6204 (+6204), average=60 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\connection.py:1007: size=309 KiB (+309 KiB), count=400 (+400), average=792 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:347: size=309 KiB (+309 KiB), count=400 (+400), average=792 B
C:\Files\venv\Lib\site-packages\websockets\datastructures.py:111: size=170 KiB (+170 KiB), count=3049 (+3049), average=57 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\messages.py:32: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\connection.py:134: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:309: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
As you can see, the proactor_events allocation grows linearly and adds some memory overhead for each connection. But this only applies to Windows. As far as I have read, macOS does not use the Proactor event loop; it uses selector_events, which is much more memory-efficient.
If you want to use selector_events on Windows, simply add this line of code: asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()). For reference, here are the results for 200 connections after switching to selector_events on Windows:
current, peak = (624, 624)
current, peak = (57395194, 57654951)
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:278: size=50.0 MiB (+50.0 MiB), count=400 (+400), average=128 KiB
C:\Files\venv\Lib\site-packages\websockets\datastructures.py:110: size=463 KiB (+463 KiB), count=8776 (+8776), average=54 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\ssl.py:877: size=362 KiB (+362 KiB), count=6201 (+6201), average=60 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\connection.py:1007: size=309 KiB (+309 KiB), count=400 (+400), average=792 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:347: size=309 KiB (+309 KiB), count=400 (+400), average=792 B
C:\Files\venv\Lib\site-packages\websockets\datastructures.py:111: size=172 KiB (+172 KiB), count=3091 (+3091), average=57 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\messages.py:32: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
C:\Files\venv\Lib\site-packages\websockets\asyncio\connection.py:134: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\sslproto.py:309: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
C:\Users\user\AppData\Local\Programs\Python\Python312\Lib\asyncio\selector_events.py:798: size=148 KiB (+148 KiB), count=400 (+400), average=380 B
Interesting. That being said, unless you are really memory constrained, I recommend sticking with the default event loop on Windows, as it's expected to have better I/O performance.
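For anyone who does need the selector loop despite that trade-off, here is a hedged sketch of selecting it for a single `asyncio.run()` call. It assumes Python 3.12 or later for the `loop_factory` parameter and falls back to the policy call quoted earlier in this thread on older versions; `main` is a placeholder coroutine.

```python
import asyncio
import sys


async def main():
    # Placeholder workload: report which event loop class is running.
    return type(asyncio.get_running_loop()).__name__


if sys.version_info >= (3, 12):
    # loop_factory selects the loop for this run only, without a global policy.
    loop_name = asyncio.run(main(), loop_factory=asyncio.SelectorEventLoop)
else:
    if sys.platform == "win32":
        # Older Pythons: switch the global policy, as shown earlier in the thread.
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    loop_name = asyncio.run(main())

print(loop_name)
```

On macOS and Linux this changes nothing in practice, since the default loop there is already selector-based.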