starlette icon indicating copy to clipboard operation
starlette copied to clipboard

Preserve changes to contexvars made in threadpools

Open adriangb opened this issue 3 years ago โ€ข 14 comments

I think this addresses https://github.com/tiangolo/fastapi/issues/953.

Thank you @smagafurov for the link in https://github.com/tiangolo/fastapi/issues/953#issuecomment-730302314

What does this do for users?

This fixes a confusing behavior where their endpoint works differently depending on if it is sync or async. And in FastAPI (which for better or worse re-uses run_in_threadpool), it impacts dependencies as well.

This is the current behavior:

from typing import List
from contextvars import ContextVar

from starlette.applications import Starlette
from starlette.routing import Route
from starlette.requests import Request
from starlette.responses import Response
from starlette.middleware import Middleware
from starlette.testclient import TestClient
from starlette.types import ASGIApp, Scope, Receive, Send, Message

user: ContextVar[str] = ContextVar("user")

log: List[str] = []


class LogUserMiddleware:
    def __init__(self, app: ASGIApp) -> None:
        self.app = app
    
    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        async def send_wrapper(message: Message) -> None:
            if message["type"] == "http.response.start":
                log.append(user.get())
            await send(message)
        await self.app(scope, receive, send_wrapper)


def endpoint_sync(request: Request) -> Response:
    user.set(request.query_params["user"])
    return Response()


async def endpoint_async(request: Request) -> Response:
    return endpoint_sync(request)


app = Starlette(
    routes=[
        Route("/sync", endpoint_sync),
        Route("/async", endpoint_async),
    ],
    middleware=[Middleware(LogUserMiddleware)]
)


client = TestClient(app, raise_server_exceptions=False)


resp = client.get("/async", params={"user": "adrian"})
assert resp.status_code == 200, resp.content
assert log == ["adrian"]

resp = client.get("/sync", params={"user": "adrian"})
assert resp.status_code == 500, resp.content

I wouldn't expect a user to write this themselves, but some library they are using very well could use contextvars to set something that then gets picked up by a middleware (for example, some sort of tracing library). Since this is an implementation detail, and the failure mode is pretty tricky, it would be difficult for a user to figure out what happened.

adriangb avatar Aug 04 '21 04:08 adriangb

Accidental close. Sorry.

I merged in master, made the extra method private and cleaned up tests / coverage. Tests and linting are passing locally now.

adriangb avatar Dec 07 '21 15:12 adriangb

This should have been fixed by anyio 3.4.0: https://anyio.readthedocs.io/en/stable/versionhistory.html

Can you confirm? @adriangb

Kludex avatar Dec 16 '21 17:12 Kludex

It seems like they implemented context propagation, but not capturing of changes in the context. You can verified this by taking this branch, replacing starlette/concurrency.py with the version from master and seeing that the new tests fail.

In other words, we can probably remove the contextvar stuff here: https://github.com/encode/starlette/blob/a11dbe0e9ff75d518f7ab957594ba4adb9dc69e9/starlette/concurrency.py#L32-L35. But that should probably be a separate PR.

adriangb avatar Dec 16 '21 17:12 adriangb

Are there any cons of doing this? Are we going to break anything with this? Is this really necessary? Is it possible to do what the PR proposes without adding it to Starlette itself, but a Starlette user can do it on their own?

Kludex avatar Apr 24 '22 06:04 Kludex

The last question there is the important one for me. The answer is yes and no:

  1. If users are relying on Starlette (or FastAPI) to do the wrapping of sync functions in a thread, then no they can't implement this since they have no control over it.
  2. If they are doing the wrapping themselves, then yes, but it's super clunky and unlikely that an end user would want to do it.

Ultimately I think what should happen is that Starlette deprecates auto wrapping of sync functions.

adriangb avatar Apr 24 '22 06:04 adriangb

It's a pretty highly upvoted issue.

I think we should either move forward with this or officially state that we don't plan to support this, maybe even that we want to deprecate auto-wrapping of things and run_in_threadpool itself in the future.

I personally prefer the latter, even if I'm the author of this PR.

adriangb avatar May 06 '22 06:05 adriangb

I've asked about this on anyio's Gitter channel: https://gitter.im/python-trio/AnyIO?at=628b22247df86c141e9b566a

As Gitter history is deleted after some time, I'll paste the conversation for the record:

Conversation screenshots image image

Kludex avatar May 23 '22 06:05 Kludex

@agronholm opened a discussion on discuss.python.org: https://discuss.python.org/t/back-propagation-of-contextvar-changes-from-worker-threads/15928

adriangb avatar May 23 '22 19:05 adriangb

Looks like in 3.11 we might be able to use this for async tasks as well: https://github.com/python/cpython/pull/31837

adriangb avatar May 26 '22 15:05 adriangb

Looks like in 3.11 we might be able to use this for async tasks as well: python/cpython#31837

Can you clarify? I'm not sure how this addition will help Starlette.

agronholm avatar May 26 '22 15:05 agronholm

We have a similar issue with BaseHTTPMiddleware: https://github.com/encode/starlette/blob/master/tests/middleware/test_base.py#L192-L223

I also think that in that case it's better to push people away from using it instead of trying to fix it, but we may be able to also try to fix it to make the situation better for users that can't / won't drop BaseHTTPMiddleware.

So in theory in 3.11 I think we should be able to do:

ctx = copy_context()
with anyio.create_task_group() as tg:
    tg.start_soon(..., context=ctx)   # assuming you add this to anyio
_restore_context(ctx)  # this hack

adriangb avatar May 26 '22 15:05 adriangb

Not sure if this is appropriate to post here, but I am interested in this PR for exactly the use case @adriangb mentions in the OP. I am trying to implement AWS X-Ray tracing SDK into a FastAPI application but am blocked because the SDK stores the traces in the thread's context, which is lost when the controller endpoint thread is ran in the threadpool.

hs-sean-grant avatar Jun 14 '22 00:06 hs-sean-grant

Yup that is appropriate! User feedback is always appreciated. Could you try this branch and see if it solves things for you?

adriangb avatar Jun 14 '22 00:06 adriangb

Quite relevant: https://github.com/django/asgiref/issues/267

So there are some edge cases with this implementation. In Django this is a problem because there are two different threads, in Starlette we just have one, so that usage doesn't make sense in Starlette and hence I don't think we'd run into this issue. Besides, it is currently also broken, this makes it no better or worse.

adriangb avatar Jun 26 '22 18:06 adriangb

@graingert would you mind sharing why we shouldn't go with this? Just to have the record here. ๐Ÿ˜Ÿ ๐Ÿ™

Kludex avatar Sep 03 '22 07:09 Kludex

I do not know what @graingert would say, but I would also opine that, on the balance of it, restoring contextvars is not a game that we should play, as the only guarantee given by contextvars is that it will flow downstream - there is no guarantee about either the scope of the context (is it the process? A HTTP request? Something else?), or that any variable set will flow upstream to the calling function. Implementing upstream contextvars propagation in generic functions is, in my opinion, likely to bite us in the a** later.

In other words, I currently think that โ€œsetting a context variable so that it can be picked up by upstream codeโ€ is a misuse of the contextvars API.

I believe that for the issues highlighted here, alternative solutions can be implemented, among which:

  • setting a mutable object in a context variable near the level where the variable will be read;
  • using explicit context.run in places where someone wants __enter__ and __exit__ to share a context;
  • If we think we should do it (with the appropriate thinking about race conditions), starlette could provide some kind of explicitly request-scoped context;

jhominal avatar Sep 03 '22 11:09 jhominal

as the only guarantee given by contextvars is that it will flow downstream

I think the issue is that for a FastAPI user this is downstream:

def dep1() -> None:
    ctx.set("foo")

def dep2(d1: None = Depends()) -> None:
    ctx.get()

This should be modeled as dep2(dep1()) -> dep2 is downstream of dep1 and should see changes to context variables. But they don't because executing dep1 in threads means that any context variables set are erased. I know that we all probably understand why that is and think that the current behavior makes sense, but for FastAPI users who don't understand these things (and perhaps aren't even interacting with context variables directly) the behavior will seem wrong.

Implementing upstream contextvars propagation in generic functions is, in my opinion, likely to bite us in the a** later

FWIW I don't disagree with this. I have a gut feeling that there will be some issues with implementing this, even if it's been in asgiref for years and hasn't caused any issues.

If we think we should do it (with the appropriate thinking about race conditions), starlette could provide some kind of explicitly request-scoped context

This won't solve things for users that are just using a library that uses context vars, but I think providing an explicit context would be great. Flask does it. This would allow FastAPI to manipulate the context as it sees fit to make it work with the dependency injection system. Starlette could use this context to make propagating data from lifespans to requests less akward/not use globals. I already proposed this in asgiref: https://github.com/django/asgiref/issues/322

adriangb avatar Sep 03 '22 16:09 adriangb

I think the issue is that for a FastAPI user this is downstream:

def dep1() -> None:
    ctx.set("foo")

def dep2(d1: None = Depends()) -> None:
    ctx.get()

This should be modeled as dep2(dep1()) -> dep2 is downstream of dep1 and should see changes to context variables. But they don't because executing dep1 in threads means that any context variables set are erased. I know that we all probably understand why that is and think that the current behavior makes sense, but for FastAPI users who don't understand these things (and perhaps aren't even interacting with context variables directly) the behavior will seem wrong.

The definition of what I mean by "downstream" for contextvars is quite restrictive - that is, g is downstream of f if the upstream call stack for g has f (either the real call stack in the case of synchronous calls, or the "logical call stack" in the case of asynchronous calls).

In this sense, dep2 is not downstream of dep1 - the dep1 function has long finished executing when dep2 is called.

However, I would rather avoid in-depth discussion of how to solve FastAPI's dependency issue in this repository - FastAPI's dependency system, and, in particular, any runtime/execution guarantees that FastAPI wishes to provide to its users, need to be part of FastAPI's feature set.

In this case, I sincerely believe that it is possible for FastAPI to use the contextvars.Context.run function to provide the guarantees that FastAPI wishes to provide to its users.

In order to enable FastAPI to support these cases more easily, I would propose that run_in_threadpool gets a new kwarg argument of type contextvars.Context, that will use the context.run method to ensure that the synchronous function being run is run in the provided context instance.

This won't solve things for users that are just using a library that uses context vars, but I think providing an explicit context would be great. Flask does it. This would allow FastAPI to manipulate the context as it sees fit to make it work with the dependency injection system. Starlette could use this context to make propagating data from lifespans to requests less akward/not use globals. I already proposed this in asgiref: https://github.com/django/asgiref/issues/322

I already knew that. :)

jhominal avatar Sep 03 '22 23:09 jhominal

I would rather avoid in-depth discussion of how to solve FastAPI's dependency issue in this repository - FastAPI's dependency system, and, in particular, any runtime/execution guarantees that FastAPI wishes to provide to its users, need to be part of FastAPI's feature set

I do agree on this ๐Ÿ˜„, I think FastAPI should not be using these functions from Starlette (I think I even remember Tom saying we should have made these private from the get-go).

Implementing upstream contextvars propagation in generic functions is, in my opinion, likely to bite us in the a** later

This is something I keep thinking about. Although this has been in asgiref for a long time and as far as I can remember from all of these discussions no one has pointed out a concrete issue with the idea or implementation, I can't help but agree with you that it just feels like there's going to be some edge cases if we go down this route.

To be honest I'm not enthused with this solution, it's not a problem I personally deal with and there doesn't seem to be much support amongst our team for it. I'm going to close the PR for now and if things change in the future we can re-evaluate.

adriangb avatar Sep 05 '22 23:09 adriangb