treq
treq copied to clipboard
Timeout whilst getting response's body
Hi,
I've started using treq on unstable network and I realized that there is no way to define timeout while getting response content. So sometimes I can't process request as it is hanged.
My idea to solve that is to pass request_kwargs to _Response and _BufferedResponse and collect function should look like:
def collect(response, request_kwargs, collector):
"""
Incrementally collect the body of the response.
This function may only be called **once** for a given response.
:param IResponse response: The HTTP response to collect the body from.
:param collector: A callable to be called each time data is available
from the response body.
:type collector: single argument callable
:rtype: Deferred that fires with None when the entire body has been read.
"""
if response.length == 0:
return succeed(None)
d = Deferred()
response.deliverBody(_BodyCollector(d, collector))
if request_kwargs.get('timeout'):
delayedCall = default_reactor(request_kwargs.get('reactor')).callLater(
request_kwargs.get('timeout'), d.cancel)
def gotResult(result):
if delayedCall.active():
delayedCall.cancel()
return result
d.addBoth(gotResult)
return d
Guys, can you confirm that issue makes sense? If yes, I'll prepare more proper fix with tests.
Cheers!
Twisted since 16.5 has deferred.addTimeout function that can be used instead of above code. For example, I've a function that wraps getting response that adds timeout to the returned deferred. This ensures that total time taken to get request status and content does not exceed a limit.
But not everyone will be on 16.5 and do you want to make that a requirement?
I have seen some connection issues too where we define a timeout on the request's deferred but it does not drop the connection. From the looks of it, this does nothing different. If you cancel the returned deferred it will not tell the response to stop downloading or close the socket connection. It will raise a defer.CancelledError in the caller, but it won't clean up any resources. This was a recent problem for my company with using treq/twisted and we are still trying to figure out how to solve it properly.
I think we need to call loseConnection or something like that on the transport. Still trying to figure that part out.
We figured it out. There is a connection leak possible during the response content download phase. Treq does not abort the connection and close the underlying socket.
Look at readBody from twisted: https://github.com/twisted/twisted/blob/trunk/src/twisted/web/client.py#L2207 It checks the transport and calls abortConnection on it if the deferred cancels.
Treq implements something similar to readBody, but never takes this into account: https://github.com/twisted/treq/blob/master/src/treq/content.py#L63
We noticed this problem in a major way when when one of our external APIs started to hang during content download. Even though we timed out and canceled the deferred, the connections remained open. In turn, we hit the maximum open file/connection limit on the OS and that stopped new connections from being made.
I would consider this a bug in collect or _BodyCollector. Upon cancellation, it needs to try to close the underlying connection.
Note that due to https://twistedmatrix.com/trac/ticket/8227 / https://twistedmatrix.com/trac/ticket/8929 that abortConnection call is never made. So Twisted has this bug too :frowning_face:
Oh man. Thank you for pointing that out.
This is what happens when your abstractions have abstractions.