websockets client 2x slower than synchronous websocket-client 80K/s vs 40K/s

Open sasha-id opened this issue 3 years ago • 6 comments

If I run it over the network, it is twice as slow as the sync client, but it is faster if I run it on the same machine. The sync client has the same throughput on localhost or over the network. 🤷‍♂️ Is there any way to improve performance to be in line with the sync client?

Test server:

import asyncio
import time

import uvloop
import websockets

# uvloop is advertised as making asyncio 2-4x faster; it made no difference here.
uvloop.install()


async def echo(websocket):
    while True:
        await websocket.send(f'{{"ev": "T", "sym": "MSFT", "x": 4, "i": "12345", "z": 3, "p": 114.125, "s": 100, "c": [ 0, 12 ], "t": {time.time_ns()/1000}, "q": 3681328}}')

async def main():
    async with websockets.serve(echo, "0.0.0.0", 8765):
        await asyncio.Future()  # run forever

asyncio.run(main())

Sync client: https://github.com/websocket-client/websocket-client Throughput: ~80K/s

import websocket
import time
from threading import Thread

messages=[]
def count():
    global messages
    while True:
        time.sleep(1)
        print(len(messages))
        messages =[]

def on_message(ws, message):
    global messages
    messages.append(message)
    # print(message)

ws = websocket.WebSocketApp("ws://rocket:8765", on_message=on_message)
t1 = Thread(target=count)
t1.start()
ws.run_forever(skip_utf8_validation=True)

asyncio client. Throughput: ~40K/s. Throughput for byte messages: ~49K/s (UTF-8 validation is disabled).

import asyncio

import uvloop
import websockets


class Client:
    def __init__(self) -> None:
        self.messages = []

    async def count(self):
        while True:
            await asyncio.sleep(1)
            print(len(self.messages))
            self.messages=[]

    async def hello(self):
        async with websockets.connect("ws://rocket:8765", max_queue=100) as websocket:
            while True:
                msg = await websocket.recv()
                self.messages.append(msg)
                # print(msg)
                

    async def run(self):
        # Run the per-second counter and the receiver concurrently.
        await asyncio.gather(self.count(), self.hello())

    def start(self):
        uvloop.install()
        asyncio.run(self.run())


Client().start()

sasha-id avatar Jun 24 '22 16:06 sasha-id

Could you alter the server as follows:

PAYLOAD = 1024 * 1024 * b"a"

async def echo(websocket):
    while True:
        await websocket.send(PAYLOAD)
        await asyncio.sleep(0)  # ensure we don't block the event loop by sending continuously

then run the benchmark over the network (not locally) and tell me if websockets is still 2x slower than websocket-client?

aaugustin avatar Jun 25 '22 15:06 aaugustin

With a 1 MB payload it's actually twice as fast, but in my case the payload is ~150 bytes. Also, with a 1 KB payload the sync client is 2x faster: 70K vs 30K.

sasha-id avatar Jun 28 '22 00:06 sasha-id

OK - that's what happens when you benchmark a library that implements compression vs. a library that doesn't. What if you disable compression in websockets with connect(compression=None)?
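For a rough feel of what per-message compression costs on small payloads, here is a minimal stdlib-only sketch. It assumes raw DEFLATE with a sync flush after every message, which only approximates what the permessage-deflate extension does; it is not the websockets code path. The names `deflate_roundtrip` and `per_message_cost_us` are illustrative, and the sample message mirrors the one in the test server with a fixed timestamp.

```python
import time
import zlib

# Roughly the ~150-byte JSON message from the test server (fixed timestamp).
MESSAGE = (
    b'{"ev": "T", "sym": "MSFT", "x": 4, "i": "12345", "z": 3, '
    b'"p": 114.125, "s": 100, "c": [0, 12], "t": 1656086760000000.0, "q": 3681328}'
)


def deflate_roundtrip(messages):
    """Compress, then decompress, each message with a shared raw-DEFLATE context."""
    # wbits=-15 selects raw DEFLATE (no zlib header or checksum).
    encoder = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    decoder = zlib.decompressobj(-15)
    result = []
    for message in messages:
        # A sync flush makes every message decodable as soon as it arrives.
        compressed = encoder.compress(message) + encoder.flush(zlib.Z_SYNC_FLUSH)
        result.append(decoder.decompress(compressed))
    return result


def per_message_cost_us(count=20_000):
    """Average time, in microseconds, to compress and decompress one message."""
    messages = [MESSAGE] * count
    start = time.perf_counter()
    deflate_roundtrip(messages)
    return (time.perf_counter() - start) / count * 1e6
```

Note that this measures both directions in one process, whereas in the benchmark the server compresses and the client decompresses. Comparing `per_message_cost_us()` with the time budget per message (12.5 µs at 80K messages/s) gives an idea of how much compression can matter at these rates.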

There's a bunch of other factors at play. Here you challenge websockets; but what's the overhead of asyncio? If you run an equivalent benchmark on a plain TCP connection, what happens?
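A baseline along those lines could be sketched with plain asyncio streams, as below. This is only an illustration under stated assumptions: fixed 150-byte messages with no framing, server and client in the same process on 127.0.0.1 (for a meaningful comparison you would run the two halves on separate machines), and a hypothetical helper name `measure_tcp_rate`.

```python
import asyncio
import time

# Fixed-size payload, same order of magnitude as the ~150-byte messages above.
MESSAGE = b"x" * 150


async def measure_tcp_rate(duration=1.0):
    """Return messages per second received over a plain TCP connection."""
    stop = asyncio.Event()

    async def blast(reader, writer):
        # Server side: send messages back to back until told to stop.
        try:
            while not stop.is_set():
                writer.write(MESSAGE)
                await writer.drain()
                await asyncio.sleep(0)  # let other tasks run between sends
        except ConnectionError:
            pass  # the client went away
        finally:
            writer.close()

    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(blast, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)

    count = 0
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        await reader.readexactly(len(MESSAGE))
        count += 1

    stop.set()
    writer.close()
    await writer.wait_closed()
    server.close()  # any handler still running is cancelled when the loop exits
    return count / duration
```

`asyncio.run(measure_tcp_rate())` returns the rate for a one-second run. Whatever number it yields is an upper bound of sorts for a WebSocket client built on the same event loop, since it does no framing, no UTF-8 validation and no compression.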

What if you discarded 8 bytes from the connection (WebSocket header) and read the next 150 bytes? I'm pretty sure you could make this work and the benchmark would be faster.
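To make the "skip the header, read the payload" idea concrete, here is a minimal sketch of decoding a frame header as laid out in RFC 6455. `parse_frame_header` is an illustrative name, not a websockets API, and the sketch ignores the RSV bits, fragmentation and control frames. Under RFC 6455, an unmasked frame carrying a 150-byte payload has a 4-byte header (2 base bytes plus a 16-bit extended length), so the exact number of bytes to skip depends on payload size and masking.

```python
import struct


def parse_frame_header(data: bytes):
    """Return (fin, opcode, masked, payload_length, header_length) for one frame.

    Raises ValueError or struct.error if `data` is too short to hold the header.
    """
    if len(data) < 2:
        raise ValueError("a frame header is at least 2 bytes long")
    first, second = data[0], data[1]
    fin = bool(first & 0x80)      # final fragment flag
    opcode = first & 0x0F         # 0x1 = text, 0x2 = binary, ...
    masked = bool(second & 0x80)  # set on client-to-server frames
    length = second & 0x7F
    offset = 2
    if length == 126:
        # 16-bit extended payload length, network byte order
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:
        # 64-bit extended payload length, network byte order
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if masked:
        offset += 4  # the masking key follows the length
    return fin, opcode, masked, length, offset
```

A "dumb" reader would call this once, then repeatedly discard `header_length` bytes and read `payload_length` bytes, which only works as long as every message has the same size and compression is off.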

Surely this sounds unreasonable; however, the level of functionality provided by websocket-client is halfway between this dumb implementation and what websockets does.

In short: benchmarking is tricky :-)

The default settings of websockets are considered reasonable for a wide range of use cases. They aren't micro-optimized for a single client connection receiving tens of thousands of messages per second — a pretty extreme use case compared to the sort of application WebSocket was designed for — basically, online chat.

Probably max_queue=100 helps if you care more about throughput than latency. Setting ping_interval=None might help. Tuning asyncio buffers might help too.
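Put together, a client with these features turned off might look like the sketch below. `compression`, `ping_interval` and `max_queue` are existing `websockets.connect()` arguments; the `TUNED_OPTIONS` name and the `receive_forever` helper are illustrative only, and whether each option helps depends on the workload.

```python
# Options discussed above, collected in one place.
TUNED_OPTIONS = {
    "compression": None,    # skip permessage-deflate negotiation
    "ping_interval": None,  # no keepalive pings
    "max_queue": 100,       # allow more buffered incoming messages
}


async def receive_forever(uri, on_message):
    """Connect with the tuned options and pass every message to `on_message`."""
    # Imported here so the options above can be inspected without the
    # third-party dependency installed.
    import websockets

    async with websockets.connect(uri, **TUNED_OPTIONS) as websocket:
        async for message in websocket:
            on_message(message)
```

With the setup from the original report, this would be started with something like `asyncio.run(receive_forever("ws://rocket:8765", messages.append))`, where `messages` is a list.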

Finally, if you want to remove the sync I/O vs. async I/O difference — probably async incurs a performance penalty for your single threaded use case — you could try this branch: https://github.com/aaugustin/websockets/pull/885

This, plus disabling all features that websocket-client doesn't provide (and perhaps that you don't need), would provide a more meaningful benchmark.

aaugustin avatar Jun 28 '22 06:06 aaugustin

Thank you for your help Aymeric! With compression=None and ping_interval=None your client is faster, I'm getting ~90K/s 🚀

sasha-id avatar Jun 28 '22 13:06 sasha-id

Well I'm relieved to hear this.

I was bracing for "cool, bro, now you're doing 42k/s instead of 40k/s; websocket-client still at 80k/s" :-)

aaugustin avatar Jun 28 '22 14:06 aaugustin

Flagging as a documentation issue for adding to the FAQ.

aaugustin avatar Jun 28 '22 14:06 aaugustin