tornado
Create an AsyncHTTPClient request with an infinite timeout
Is there a defined way to create a request using AsyncHTTPClient with an infinite timeout? I am currently using the simple_httpclient implementation (this should not change), and I am requesting a page that streams an infinite amount of data. I am using streaming_callback to process the chunks of data.
With Python 3.6.4 and Tornado 5.0.2 I set request_timeout=0 on the fetch method call, which worked. However, I recently upgraded to Python 3.7.4 and the newest Tornado 6. When I set request_timeout=0 here, the future also never completes, but the client does not even send an HTTP request to the destination. An example of this is the following:
from tornado.httpclient import AsyncHTTPClient
import asyncio

URL = 'http://localhost:8000/test'

async def test():
    ahc = AsyncHTTPClient()
    result = await ahc.fetch(URL, streaming_callback=chunker, request_timeout=0)

def chunker(data):
    print(data)

asyncio.get_event_loop().run_until_complete(test())
The program never completes, but on the local server there is no incoming request, either, and consequently chunker is never called. I can work around this by setting request_timeout=float('inf'), but this feels really hacky and I fear that a timeout may still fire at some point in the future anyway. So the question is whether there is a way to make the fetch request never time out.
There's not currently a supported way to set an infinite timeout. If there were, it would be some special value (maybe None) instead of zero. If the Future doesn't resolve with a timeout of zero, that's a bug; you should get an immediate timeout error.
I'm kind of surprised that float('inf') works, but setting a large finite timeout (a day? a year?) is fine, and that's what I'd do.
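A sketch of that workaround, using a day as the "large but finite" value (the URL and callback are whatever your application already uses):

```python
ONE_DAY = 24 * 60 * 60  # 86400 seconds: large but finite

async def fetch_stream(url, chunker):
    # tornado imported inside the function so this sketch stays
    # self-contained even where tornado is not installed
    from tornado.httpclient import AsyncHTTPClient
    client = AsyncHTTPClient()
    # request_timeout bounds the entire request, including the
    # streaming body; a day is effectively forever for most streams
    await client.fetch(url, streaming_callback=chunker,
                       request_timeout=ONE_DAY)
```

Unlike float('inf'), a finite value is guaranteed to behave the same way in any arithmetic the timeout code does with it.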
Is there interest in adding this kind of infinite timeout? I assume changing the meaning of None for the timeout now would break a lot of existing code. Would you consider a pull request that introduces an additional parameter that disables the request timeout completely?
While setting the timeout to a large value would be a workaround, I am worried about limitations of the underlying operating system in that regard. After all, testing a year-long timeout is not practically possible, so having a way to disable the timeout completely seems like the cleanest solution to me.
The relevant code in master (and the 6.0 branch) for simple_httpclient:
timeout = min(self.request.connect_timeout, self.request.request_timeout)
if timeout:
    self._timeout = self.io_loop.add_timeout(
        self.start_time + timeout,
        functools.partial(self._on_timeout, "while connecting"),
    )
stream = await self.tcp_client.connect(
There is no else: branch for the if timeout:, which is why a timeout of 0 causes nothing to happen: the truthiness check treats 0 as false, so the timeout handling is simply skipped. I'd call that a bug.
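For illustration, here is the shape such an else: branch could take. This is a sketch only, not actual Tornado code; schedule_connect_timeout is an invented name standing in for the surrounding method:

```python
import functools


def schedule_connect_timeout(self):
    # Mirrors the structure of the snippet above, with an added
    # else: branch so a timeout of 0 fires immediately instead of
    # silently skipping the scheduling step.
    timeout = min(self.request.connect_timeout, self.request.request_timeout)
    if timeout:
        self._timeout = self.io_loop.add_timeout(
            self.start_time + timeout,
            functools.partial(self._on_timeout, "while connecting"),
        )
    else:
        # timeout == 0: fail fast, matching the documented intent
        # that a zero timeout means an immediate timeout error
        self._on_timeout("while connecting")
```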
Currently you can't set None; it will raise an exception from min(). So there is currently no meaning for None that we have to worry about, it just causes exceptions. We are free to add a meaning.
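Concretely, this is what happens today if None reaches that min() call (the 20.0 stands in for the other timeout's default):

```python
# Python 3 refuses to order None against a number, so the min()
# shown above raises TypeError before any timeout is scheduled.
try:
    min(None, 20.0)
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```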
There's also another place in the code where it is clear that None is not allowed (but you won't run into it unless you have more than max_clients, default 10, concurrent requests):
if not len(self.active) < self.max_clients:
    assert request.connect_timeout is not None
    assert request.request_timeout is not None
    timeout_handle = self.io_loop.add_timeout(
        self.io_loop.time()
        + min(request.connect_timeout, request.request_timeout),
        functools.partial(self._on_timeout, key, "in request queue"),
    )
> Currently you can't set None, it will raise an exception from min(). So there is currently no meaning for None that we have to worry about, it just causes exceptions. We are free to add a meaning.
HTTPRequest(url, request_timeout=None) does have a meaning: it means "use the default". The logic around defaults and None is tricky, and I don't think we can overload None with a different meaning without causing confusion.
We could create a special singleton object that would mean "no timeout", but I'm not sure it's worth it. Just setting a large but finite timeout seems like a better approach to me.
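For the record, the sentinel idea would look roughly like this. NO_TIMEOUT and resolve_timeout are invented names for illustration, not an actual Tornado API:

```python
class _NoTimeout:
    """Hypothetical singleton meaning "disable the timeout entirely"."""
    def __repr__(self):
        return "NO_TIMEOUT"

NO_TIMEOUT = _NoTimeout()

def resolve_timeout(connect_timeout, request_timeout):
    # Return the effective timeout in seconds, or None meaning "never".
    # Filtering the sentinel out first keeps min() away from values
    # it cannot order, avoiding the TypeError discussed above.
    values = [t for t in (connect_timeout, request_timeout)
              if t is not NO_TIMEOUT]
    return min(values) if values else None
```

The scheduling code could then skip add_timeout entirely when resolve_timeout returns None, avoiding both the magic zero and overloading the user-facing None.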