`RuntimeError("No response returned")` in `BaseHTTPMiddleware`
Previously, a "No response returned"-error has been discussed in this discussion: https://github.com/encode/starlette/discussions/1527, and was supposedly fixed in this PR: https://github.com/encode/starlette/pull/1715
However, the bug is still present. It's a race condition, so quite tough to reproduce, but the following reproducer works.
Reproducer
Use anyio 4.2.0. The error also happens with anyio 3.0.0, but it occurs much more regularly on 4.2.
First, apply the following patch to the Starlette codebase, so that we have a breakpoint in case it happens:
--- a/starlette/middleware/base.py
+++ b/starlette/middleware/base.py
@@ -134,7 +134,8 @@ class BaseHTTPMiddleware:
            async def send_no_error(message: Message) -> None:
                try:
@@ -163,6 +164,7 @@ class BaseHTTPMiddleware:
            except anyio.EndOfStream:
                if app_exc is not None:
                    raise app_exc
+               breakpoint()
                raise RuntimeError("No response returned.")
            assert message["type"] == "http.response.start"
Then run this code:
import asyncio

from hypercorn.asyncio import serve
from hypercorn.config import Config
from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware, RequestResponseEndpoint
from starlette.requests import Request
from starlette.responses import JSONResponse, Response
from starlette.routing import Route


class DummyMiddleware(BaseHTTPMiddleware):
    async def dispatch(
        self, request: Request, call_next: RequestResponseEndpoint
    ) -> Response:
        return await call_next(request)


async def homepage(request):
    return JSONResponse({})


app = Starlette(
    routes=[
        Route("/", homepage),
    ],
    middleware=[Middleware(DummyMiddleware) for i in range(100)],  # Yes, 100x the same middleware.
)

config = Config.from_mapping({})
config.bind = ["127.0.0.1:8000"]

asyncio.run(serve(app, config))
Open either Firefox or Chrome, navigate to http://localhost:8000, and hold down Cmd-R (refresh) so the browser repeatedly reloads the page. After a couple of seconds, we'll hit the breakpoint.
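For what it's worth, the browser refresh loop can also be replaced by a small client-side script. The following is only a sketch, assuming the app above is listening on 127.0.0.1:8000: it repeatedly opens a connection, sends a request, and drops the socket almost immediately so the server sees client disconnects; the iteration count and sleep are arbitrary and may need tuning.
import asyncio


async def hammer(n: int = 500) -> None:
    # Open a connection, send a request, and close the socket before reading
    # the response, so the server observes a client disconnect mid-request.
    for _ in range(n):
        try:
            _reader, writer = await asyncio.open_connection("127.0.0.1", 8000)
            writer.write(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
            await writer.drain()
            await asyncio.sleep(0.001)  # give the server a moment to start handling
            writer.close()              # disconnect before the response is read
            await writer.wait_closed()
        except OSError:
            pass  # connection resets are expected while hammering


asyncio.run(hammer())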
(Feel free to turn this issue into a discussion if that's more appropriate. I'm also looking for a fix, but I have a hard time understanding BaseHTTPMiddleware.)
edit: I updated the above snippet and added the missing imports.
This PR contains a possible fix: https://github.com/encode/starlette/pull/2519
versions:
python=3.11
starlette=0.37.1
uvicorn=0.27.1
@jonathanslenders this also happens for me, and seems to also be due to client disconnects.
It started to happen heavily after an upgrade of the Starlette version (previously we never got this exception). I have just checked our upgrade, and it was from 0.27.0 to 0.36.3, so quite a few versions.
I have modified your example a bit, and interestingly, in my example it consistently works with 3 middlewares and breaks with 4 or more.
import asyncio

from starlette.applications import Starlette
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response
from starlette.routing import Route


class DummyMiddleware(BaseHTTPMiddleware):
    def __init__(self, **kwargs) -> None:
        self._version = kwargs.pop("version")
        super().__init__(**kwargs)

    async def dispatch(self, request: Request, call_next) -> Response:
        print(self._version, id(self), "STARTED")
        try:
            res = await call_next(request)
            print(self._version, id(self), "COMPLETED")
            return res
        except Exception as exc:
            print(self._version, id(self), "FAILED", exc)
            return Response(b"")


async def sleepy(request: Request):
    sleep_ms = int(request.query_params.get("sleep", 200))
    await asyncio.sleep(sleep_ms / 1000)
    return Response(b"")


app = Starlette(
    routes=[Route("/", sleepy)],
    middleware=[Middleware(DummyMiddleware, version=_ + 1) for _ in range(4)],
)
Serve the app with Uvicorn:
uvicorn app:app --reload
Make a curl request where you can modify the sleep time (in milliseconds) that asyncio will wait inside the app, and the maximum execution time (in seconds) that curl will allow via -m 0.2:
curl http://127.0.0.1:8000?sleep=250 -m 0.2
OUTPUT
1 4392769744 STARTED
2 4392769616 STARTED
3 4392769232 STARTED
4 4392768208 STARTED
4 4392768208 COMPLETED
3 4392769232 COMPLETED
2 4392769616 COMPLETED
1 4392769744 FAILED No response returned.
Changing the number of middlewares to 3:
1 4368290960 STARTED
2 4368290768 STARTED
3 4368290576 STARTED
3 4368290576 COMPLETED
2 4368290768 COMPLETED
1 4368290960 COMPLETED
- It seems to make no difference whether you place the sleep inside the middleware or inside the endpoint function.
- It seems to consistently work with 3 middlewares and consistently break with 4 or more.
- The middlewares seem to be initialised in the order they are passed but resolved in reverse order; it is consistently the first middleware passed that fails, even though the other middlewares pass through correctly before it.
- With 3 middlewares, it also does not seem possible to provoke the error with any combination of different sleep times.
After some further debugging, this seems to appear exactly from starlette>=0.28.0 and seems to be related to the introduction of _CachedRequest. I'd have to dive a bit deeper to understand it more specifically.
However, I have tried various combinations of anyio versions and that doesn't seem to have any impact; only when I downgrade to starlette 0.27.0 does it work, regardless of the anyio version.
Not surprisingly, this change in Starlette makes the issue disappear, though I don't imagine it is actually the solution:
- message = await wrap(wrapped_receive)
+ message = await wrap(receive)
@Kludex would you have any clue why it happens with 4 or more middlewares but not with 3 or fewer? Or is it just a coincidence that this behaviour happens?
Hi, seems like we encountered a similar issue here.
We are actually using FastAPI, and at first we suspected that we were facing https://github.com/tiangolo/fastapi/discussions/8187. However, we noticed that the hang can only be reproduced when more than two people keep refreshing the page together. So it looks more like a race condition, i.e. probably not related to https://github.com/tiangolo/fastapi/discussions/8187, which is a stream double-consume issue and hence should be reproducible by a single user's refresh, unconditionally, which is not our case.
Our Starlette version is 0.36, and when two users refresh the page at a rate of roughly once every 2 seconds, we can reproduce the hang within about 30 seconds.
👋🏻 We think we're seeing this too, with 5 middlewares, on:
fastapi==0.109.2
starlette==0.36.3
anyio==4.3.0
We were seeing this as well on py3.12, dep versions:
fastapi==0.110.2
starlette==0.37.2
anyio==4.4.0
I'm downgrading us to:
fastapi==0.107.0
starlette==0.28.0
anyio==3.7.1
Which seems like it'll resolve the issue, but I'll report back after letting it run for a week.
Unfortunately, we can't bump versions of fastapi (and therefore starlette or anyio) in good conscience until this issue is resolved.
I would recommend you all try using pure ASGI middleware instead of BaseHTTPMiddleware. I'm confident it will solve the issues you're seeing.
I would recommend you all try using pure ASGI middleware instead of BaseHTTPMiddleware. I'm confident it will solve the issues you're seeing.
On our side at least, we're not explicitly choosing to use BaseHTTPMiddleware. Rather, fastapi (IMO, reasonably?) makes this decision for us in fastapi.applications.FastAPI.middleware.
Maybe this is a personal weakness, but I feel more comfortable downgrading to a known-good version of these dependencies and waiting for the issue to be fixed than I do subclassing FastAPI and overriding middleware to use a pure ASGI middleware.
Regardless, thank you! It's good to know that's a possible recourse if we want to upgrade fastapi at some future time and this hasn't been resolved by then.
You don't have to subclass FastAPI. Just follow the Starlette docs here or use FastAPI dependencies instead.
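For reference, a pure ASGI middleware is just a callable class that wraps the app, as the Starlette docs describe. The sketch below is roughly equivalent to the DummyMiddleware above but without BaseHTTPMiddleware; the class name is illustrative, not taken from the docs.
from starlette.types import ASGIApp, Receive, Scope, Send


class PureASGIDummyMiddleware:
    def __init__(self, app: ASGIApp) -> None:
        self.app = app

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] != "http":
            # Pass lifespan/websocket traffic straight through.
            await self.app(scope, receive, send)
            return
        # Wrap receive/send here if the middleware needs to inspect or modify messages.
        await self.app(scope, receive, send)
It is registered the same way as before, e.g. middleware=[Middleware(PureASGIDummyMiddleware)].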
Hey Starlette team! First of all, thanks for the great ASGI framework you've created. I wanted to chime in on the conversation, as we're seeing these issues as well when using 4 or more middlewares, as @mikkelduif mentioned. I was digging a little bit inside the code, and it seems to be an issue with the stream() AsyncGenerator consumed inside the _CachedRequest class. I've noticed that the AsyncGenerator is initialised inside the class as self._wrapped_rc_stream (https://github.com/encode/starlette/blob/5a1bec33f8d6a669a3670f51034de83292d19408/starlette/middleware/base.py#L33), yet we consume self.stream() instead of self._wrapped_rc_stream (https://github.com/encode/starlette/blob/5a1bec33f8d6a669a3670f51034de83292d19408/starlette/middleware/base.py#L83-L84).
However, as @adriangb suggested, we will move to using pure ASGI middlewares and remove all of our BaseHTTPMiddleware usage, since from what I'm reading in the discussions it will improve performance as well.
We're seeing the issue as well, forcing us to use fewer middlewares.
Please fix.
Hi folks, please try https://github.com/encode/starlette/pull/2620 and see if it fixes your issue. I do still recommend you avoid BaseHTTPMiddleware.
I checked out #2620, and it solved it! Thanks so much!