daphne
TCP bufferbloat if WebSocket server keeps pushing data quickly to a slow client
This might be solved by registering a Twisted producer: when Twisted calls pauseProducing, Daphne could simply disconnect the client.
Autobahn exports this interface through WebSocket protocols.
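A minimal sketch of that idea, assuming Twisted's IPushProducer interface and an Autobahn WebSocketServerProtocol (the class names, the onOpen registration point, and the abortConnection choice are all illustrative, not Daphne's actual code):

```python
from zope.interface import implementer
from twisted.internet.interfaces import IPushProducer
from autobahn.twisted.websocket import WebSocketServerProtocol


@implementer(IPushProducer)
class DisconnectOnBackpressure:
    """Illustrative producer: drop a client that cannot keep up."""

    def __init__(self, proto):
        self.proto = proto

    def pauseProducing(self):
        # Twisted calls this when the transport's send buffer is full,
        # i.e. the client is reading too slowly: close the connection
        # immediately instead of buffering without bound.
        self.proto.transport.abortConnection()

    def resumeProducing(self):
        pass

    def stopProducing(self):
        pass


class ExampleServerProtocol(WebSocketServerProtocol):
    def onOpen(self):
        # Register the producer so Twisted signals us on backpressure
        # (the transport may already have a producer registered that
        # would need unregistering first).
        self.transport.registerProducer(DisconnectOnBackpressure(self), True)
```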
Could you flesh this out a bit more with how you detected this? That would be valuable for whoever picks this up, so they can verify a fix.
I just managed to reproduce this issue (a rough sketch of the setup follows the list):
- let the worker send many messages to the reply channel
- let the client get into an infinite loop upon the first event, so it never reads anything further
- no load balancer or anything else in between
- set Daphne's ping timeout to a bigger value
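Purely for illustration, something like this, written here with a present-day Channels-style consumer and the websockets client library (the original setup used a worker pushing to the reply channel; every name below is made up):

```python
import asyncio

import websockets
from channels.generic.websocket import AsyncWebsocketConsumer


# Server side (sketch): a consumer that keeps pushing large messages.
class FloodConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.accept()
        payload = "x" * (1024 * 1024)  # ~1 MB per message
        for _ in range(1000):
            # With a client that never reads, these bytes pile up in Daphne's
            # outgoing buffers and its memory usage keeps growing.
            await self.send(text_data=payload)


# Client side (sketch): connect and then never read, like the infinite loop above.
async def stuck_client():
    async with websockets.connect("ws://localhost:8000/ws/flood/") as ws:
        await asyncio.sleep(3600)  # never call ws.recv()
```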
I observed that Daphne's memory usage kept growing (by megabytes). Reducing the ping timeout may help? But that assumes memory usage won't grow too much before the timeout fires.
Also, the ping timeout won't work, because `last_data` is updated even for server-side sends!
I think that's another bug: the server sending data doesn't mean the client/connection is healthy.
I tried to install a push producer and captured the pauseProducing call from Twisted. One trick: I had to unregister the previous producer (the HTTPChannel) first:
```python
# in onConnect:
self.transport.unregisterProducer()
self.registerProducer(PushProducer(self), True)
```
Ideally, Daphne should forward pauseProducing and resumeProducing to the worker.
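For reference, a minimal sketch of what the PushProducer used above might look like; it only records the backpressure signal, and the comments mark where forwarding to the worker (or disconnecting) would go. Entirely illustrative:

```python
from zope.interface import implementer
from twisted.internet.interfaces import IPushProducer


@implementer(IPushProducer)
class PushProducer:
    """Illustrative push producer that records Twisted's backpressure signals."""

    def __init__(self, protocol):
        self.protocol = protocol
        self.paused = False

    def pauseProducing(self):
        # Called by Twisted when the client's TCP send buffer is full.
        self.paused = True
        # Here Daphne could tell the worker to stop sending, or drop the client.

    def resumeProducing(self):
        # Called when the buffer has drained and sending may continue.
        self.paused = False

    def stopProducing(self):
        self.paused = True
```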
I've had what I think is a similar/same problem when testing https://github.com/django/django/pull/16384, generating lots of data from Django (which we do in a project, generating files on the fly).
Carlton had some code: https://github.com/django/django/pull/16384#issue-1496410480
Having a view such as:
```python
import asyncio

from django.http import StreamingHttpResponse


async def generate():
    gb_to_send = 5
    chunk_size = 5 * 1024 * 1024
    total_sent = 0
    count = 0
    while total_sent < gb_to_send * 1024 * 1024 * 1024:
        data = f"{count % 10}" * chunk_size
        total_sent += len(data)
        count += 1
        await asyncio.sleep(0.000001)  # change it to make slower / faster
        yield data


async def a_streaming_view(request):
    return StreamingHttpResponse(generate())
```
Then, using curl and suspending it (Control+Z on Linux/Mac shells) or even quitting it (Control+C): data keeps being generated, using lots of RAM. I can provide a better example if needed / useful.
Hey @cpina - yes please. If you're able to focus in on what's happening here that would be amazing. (Current plan is to swing back here after Django 4.2a1, so any work before then would be extra handy 🎁)
:+1: I will prepare a self-contained example and write up my findings on the Daphne/Twisted code; hopefully that helps!
Self contained example to see memory increase: https://gist.github.com/cpina/fe1e3fa982d09997a5957441b97c5d0c
This is the first time I've dived into Daphne and Twisted, so take the following hypothesis with a pinch of salt!
It's possible to see what I think is the size of the data still waiting to be sent to the client in Daphne via (horrible) `self.channel.transport._tempDataLen`, inspected in daphne/http_protocol.py at line 265, just before `http.Request.write(self, message.get("body", b""))`.
Also, it seems that Twisted would like to stop the producer: in twisted/internet/abstract.py the method `_maybePauseProducer` is executed, and if `self._isSendBufferFull()` returns True it calls `self.producer.pauseProducing()` (twisted/web/http.py, `HTTPChannel.pauseProducing`), but that cannot actually stop the producer... I don't know at this point what the "producer" should be, how it should be stopped, how Daphne should set it, or whether this is a red herring or the right path.
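To make that mechanism visible outside of Daphne, here is a small self-contained Twisted sketch (all names are made up, nothing here is Daphne's code): it registers a push producer, writes far more than a stalled client will read, and Twisted duly calls pauseProducing(), but nothing honours the pause, which is exactly the problem described above.

```python
import sys

from twisted.internet import protocol, reactor
from twisted.internet.interfaces import IPushProducer
from zope.interface import implementer


@implementer(IPushProducer)
class LoggingProducer:
    """Only logs the backpressure callbacks so the mechanism is visible."""

    def pauseProducing(self):
        print("send buffer full -> pauseProducing()", file=sys.stderr)

    def resumeProducing(self):
        print("buffer drained -> resumeProducing()", file=sys.stderr)

    def stopProducing(self):
        print("stopProducing()", file=sys.stderr)


class Flood(protocol.Protocol):
    def connectionMade(self):
        # Register a streaming (push) producer, then write much more than a
        # slow client will read; Twisted pauses the producer once its internal
        # send buffer passes the high-water mark, but this loop ignores the
        # pause, so the buffered data just keeps growing.
        self.transport.registerProducer(LoggingProducer(), True)
        for _ in range(1000):
            self.transport.write(b"x" * 65536)


if __name__ == "__main__":
    # Connect with e.g. `nc localhost 9000` and suspend it (Control+Z)
    # to watch pauseProducing() fire.
    reactor.listenTCP(9000, protocol.Factory.forProtocol(Flood))
    reactor.run()
```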
Hopefully this helps somehow! I'm happy to test any possible changes or try to fix it (I need to familiarise myself with the related code first).