tornado icon indicating copy to clipboard operation
tornado copied to clipboard

Create a AsyncHTTPClient request with infinite timeout

Open polygon opened this issue 4 years ago • 4 comments

Is there a defined way to create a request using AsyncHTTPClient with an infinite timeout? I am currently using the simple httpclient implementation (should also not change) and I am requesting a page that streams an infinite amount of data. I am using streaming_callback to process the chunks of data.

With Python 3.6.4 and Tornado 5.0.2 I set request_timeout=0 on the fetch method call which worked. However, I recently upgraded to Python 3.7.4 and newest Tornado 6. When I set request_timeout=0 here, the future also never completes. However, the client will not even send a HTTP request to the destination. An example of this is the following:

from tornado.httpclient import AsyncHTTPClient
import asyncio

URL='http://localhost:8000/test'

async def test():
    ahc = AsyncHTTPClient()
    result = await ahc.fetch(URL, streaming_callback=chunker, request_timeout=0)

def chunker(data):
    print(data)

asyncio.get_event_loop().run_until_complete(test())

The program will not complete, but on the local server, there will not be a request, either. And, subsequently, chunker will also never be called. I can work around this by setting request_timeout=float('inf') but this feels really hacky and I fear that there may be a timeout at some point in the future anyways.

So question is if there is a way to never timeout the fetch request.

polygon avatar Sep 12 '19 14:09 polygon

There's not currently a supported way to set an infinite timeout. If there were, it would be some special value (maybe None) instead of zero. If the Future doesn't resolve with a timeout of zero, that's a bug - you should get an immediate timeout error.

I'm kind of surprised that float('inf') works, but setting a large finite timeout (a day? a year?) is fine, and that's what I'd do.

bdarnell avatar Sep 13 '19 13:09 bdarnell

Is there an interest to add this kind of infinite timeout? I assume changing the meaning of None for the timeout now would break a lot of existing code. Would you consider a pull request that introduces an additional parameter that disables the request timeout completely?

While setting the timeout to a large value would be a workaround, I am afraid of limitations of the underlying operating system in that regard. After all, testing a year-long timeout is not practically possible, so having a way to disable the timeout completely seems like the cleanest solution to me.

polygon avatar Sep 17 '19 16:09 polygon

The relevant code in master (and the 6.0 branch) for simple_httpclient:

            timeout = min(self.request.connect_timeout, self.request.request_timeout)
            if timeout:
                self._timeout = self.io_loop.add_timeout(
                    self.start_time + timeout,
                    functools.partial(self._on_timeout, "while connecting"),
                )
                stream = await self.tcp_client.connect(

There is no else: branch to the if timeout:, so that's why timeouts set to 0 cause nothing to happen, it just skips the stuff. I'd call that a bug.

Currently you can't set None, it will raise an exception from min(). So there is currently no meaning for None that we have to worry about, it just causes exceptions. We are free to add a meaning.

There's also another place in the code where it is clear that None is not allowed (but you won't run into this unless you have more than max_clients - default 10 - concurrent requests):

        if not len(self.active) < self.max_clients:
            assert request.connect_timeout is not None
            assert request.request_timeout is not None
            timeout_handle = self.io_loop.add_timeout(
                self.io_loop.time()
                + min(request.connect_timeout, request.request_timeout),
                functools.partial(self._on_timeout, key, "in request queue"),
            )

ploxiln avatar Sep 18 '19 01:09 ploxiln

Currently you can't set None, it will raise an exception from min(). So there is currently no meaning for None that we have to worry about, it just causes exceptions. We are free to add a meaning.

HTTPRequest(url, request_timeout=None) does have a meaning - it means "use the default". The logic around defaults and None is tricky and I don't think we can overload None with a different meaning without causing confusion.

We could create a special singleton object that would mean "no timeout", but I'm not sure it's worth it. Just setting a large but finite timeout seems like a better approach to me.

bdarnell avatar Sep 27 '19 15:09 bdarnell