Further timeouts and resource limiting.
- [ ] --timeout-request-start (Timeout unless request headers complete within this time. ~10s. See the sketch after this list.)
- [ ] --timeout-request-complete (Timeout if request body is still being read after this time. ~60s)
- [x] ~~--limit-request-line-size (4096, see gunicorn)~~
- [x] ~~--limit-request-header-size (8190, see gunicorn)~~
- [x] ~~--limit-request-header-count (100, see gunicorn)~~
- [x] --limit-ws-message-size
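A sketch of the intended --timeout-request-start semantics in plain asyncio (neither proposed flag exists yet; `read_request_headers` is a hypothetical helper, not uvicorn code):

```python
import asyncio

async def read_request_headers(reader: asyncio.StreamReader) -> bytes:
    # --timeout-request-start (sketch): give up unless the client finishes
    # sending its request headers within ~10 seconds.
    try:
        return await asyncio.wait_for(reader.readuntil(b"\r\n\r\n"), timeout=10.0)
    except asyncio.TimeoutError:
        raise ConnectionError("request headers not received within 10s")
```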
@Kludex notes:
- We are not going to implement --limit-request-header-count. The header size limitation is enough.
- Should we consider a timeout on startup and shutdown lifespans?
Hello,
Apologies for chiming in on this, but my use case could really use this feature. I have a very large SQL query that uses SQLAlchemy (so it is not async, from the start). This is an internal "big data" app, so there is no pressure for super low latency per request. I'm streaming lines of JSON with a Starlette StreamingResponse from a FastAPI application.
However, the query takes about 90-120 seconds before it actually starts producing results. uvicorn appears to time out, whether I run it directly or under gunicorn. Daphne has a timeout arg, so I'm using that for now. I'd really like to use uvicorn as a worker in gunicorn, to get easy multi-process serving.
Anyway, I just wanted to bump this. If you pointed me to the code where the timeout actually occurs, I might take a stab at a PR, if you'd be open to that (at least for the request-start timeout, and the others if they fall out easily).
Thanks!
I guess https://github.com/encode/uvicorn/pull/995 is for --limit-ws-message-size, but we called it --ws-max-size instead.
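For reference, a sketch of setting that limit programmatically (assuming ws_max_size is the keyword uvicorn.run() accepts for this flag, in bytes, and "app:app" is a placeholder import string for your ASGI application):

```python
import uvicorn

# Cap incoming WebSocket messages at 1 MiB.
uvicorn.run("app:app", ws_max_size=1024 * 1024)
```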
For a timeout when a response is not returned within X seconds, does simply --timeout sound good? I haven't looked deeply into the codebase yet, but if implementing a timeout for HTTP requests will require its own code separate from WS requests, then we could do --http-timeout to be similar to uWSGI: https://uwsgi-docs.readthedocs.io/en/latest/Options.html#http-timeout
@hoopes I could be mistaken but from my initial reading of the codebase, there's no hardcoded timeout for http requests. It's probably some other issue that's causing uvicorn to not return a response.
@caleb15 Ok! Sorry for not following up here; my work has taken me in other directions. My apologies for dirtying up the thread.
So I implemented a timeout using a middleware in our Django codebase:
```python
import asyncio

import sentry_sdk
from asgiref.typing import ASGI3Application, ASGIReceiveCallable, ASGISendCallable, Scope
from django.conf import settings


class TimeoutMiddleware:
    """
    Times out a request after X seconds
    """

    def __init__(self, app: ASGI3Application) -> None:
        self.timeout = settings.HTTP_TIMEOUT
        self.app = app

    async def __call__(
        self, scope: Scope, receive: ASGIReceiveCallable, send: ASGISendCallable
    ) -> None:
        # TimeoutError? If this happens often you should either optimize
        # the code or route the endpoint to the long pods, which
        # have a higher timeout.
        try:
            await asyncio.wait_for(self.app(scope, receive, send), self.timeout)
        except asyncio.TimeoutError as e:
            sentry_sdk.capture_exception(e)
            # Note: if the app already started sending a response before the
            # timeout fired, these sends will fail with a protocol error.
            await send({'type': 'http.response.start', 'status': 504, 'headers': []})
            await send({
                'type': 'http.response.body',
                'body': b'504 Error - Your request took too long and was cancelled.',
            })
```
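For anyone wanting to try it, wiring it up in a project's asgi.py looks roughly like this (a sketch; "myproject" is a placeholder):

```python
# asgi.py (sketch)
import os

from django.core.asgi import get_asgi_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")

# Wrap the Django ASGI app so every request is subject to the timeout.
application = TimeoutMiddleware(get_asgi_application())
```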
With initial testing this appeared to work fine. However, when the code got pushed to production, a very large number of timeout errors started happening. Upon further testing I realized that the timeout includes the time a request spends waiting behind other requests before it is processed. This unfairly penalizes requests that were stuck waiting for a long time. For example, with a timeout of 6 seconds, a request that waited for 5 seconds and took only 2 seconds to process would get cancelled, because 5 + 2 > 6.
I think I would need to know more about uvicorn to implement this properly. How does the "queuing" logic work? I'm in the dark as to how requests are distributed among threads and where they are queued up when a thread is busy.
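In the meantime, a rough way to observe that waiting time is to measure event-loop lag (a sketch, not anything from uvicorn itself):

```python
import asyncio
import time

async def monitor_loop_lag(interval: float = 1.0) -> None:
    # If sleep(interval) consistently wakes up late, other requests are
    # monopolizing the event loop, and that waiting time counts against
    # the asyncio.wait_for() budget of every in-flight request.
    while True:
        start = time.monotonic()
        await asyncio.sleep(interval)
        lag = time.monotonic() - start - interval
        print(f"event loop lag: {lag * 1000:.1f} ms")
```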
> I think I would need to know more about uvicorn to implement this properly. How does the "queuing" logic work? I'm in the dark as to how requests are distributed among threads and where they are queued up when a thread is busy.
I'm not sure I understand the question about threads: everything happens in the main thread, and the event loop schedules the work on it.
To get a better sense of the "queuing" logic, you may want to read the protocols code, @caleb15: https://github.com/encode/uvicorn/blob/master/uvicorn/protocols/http/httptools_impl.py
In particular, the RequestResponseCycle class.
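If it helps, here is a minimal ASGI app that prints the messages flowing through one request/response cycle (the application side only; a sketch, not uvicorn internals):

```python
# Run with: uvicorn tracer:app  (the module name "tracer" is arbitrary)
async def app(scope, receive, send):
    assert scope["type"] == "http"
    while True:
        message = await receive()           # e.g. {"type": "http.request", ...}
        print("received:", message["type"])
        if not message.get("more_body"):
            break
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})
```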
--limit-request-header-count is implemented in ~#1682~ #1683.
I think --limit-request-header-size can be considered equivalent to h11-max-incomplete-event-size in the h11 implementation, which we already have.
EDIT: Actually, the h11-max-incomplete-event-size is a replacement for both --limit-request-header-size and --limit-request-line-size:
> max_incomplete_event_size (int) – The maximum number of bytes we’re willing to buffer of an incomplete event. In practice this mostly sets a limit on the maximum size of the request/response line + headers. If this is exceeded, then next_event() will raise RemoteProtocolError.
Ref.: https://h11.readthedocs.io/en/latest/api.html#h11.Connection
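A small demonstration of that h11 behaviour (the 1 KiB limit here is deliberately tiny, just to trigger the error):

```python
import h11

# Server-side connection with a small incomplete-event buffer.
conn = h11.Connection(our_role=h11.SERVER, max_incomplete_event_size=1024)

# Feed a request whose header block exceeds that buffer.
request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"X-Big: " + b"a" * 2048 + b"\r\n\r\n"
)
conn.receive_data(request)
try:
    conn.next_event()
except h11.RemoteProtocolError as exc:
    print("rejected:", exc)  # e.g. "Receive buffer too long"
```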
httptools does have a mechanism similar to max-incomplete-event-size, but we can't change the value.