cashews icon indicating copy to clipboard operation
cashews copied to clipboard

LookupError when using cache decorator across async/thread boundaries in 7.4.2

Open tcinbis opened this issue 3 months ago • 12 comments

Cashews 7.4.2 raises LookupError: <ContextVar name='template_context'> when cache decorators are used in code that crosses async/thread boundaries via asyncio.run_in_executor(). Commit a47f8ce changed the ContextVar initialization.

#Before (working)
_template_context: ContextVar[dict[str, Any]] = ContextVar("template_context", default={_REWRITE:
False})

# After (broken)
_template_context: ContextVar[dict[str, Any]] = ContextVar("template_context")
_template_context.set({_REWRITE: False})

I think this breaks because:

  • ContextVar(..., default=value) provides a fallback for .get() in all threads/contexts
  • ContextVar.set(value) only sets the value in the current context - new threads don't inherit it
  • When run_in_executor() spawns a new thread, it has no context value and no default → LookupError

Here is a minimal script to reproduce the issue (provided under the MIT license):

import asyncio
from concurrent.futures import ThreadPoolExecutor
from cashews import cache

cache.setup("mem://")

@cache(ttl=60, key="item:{item_id}")
async def get_item(item_id: int):
    return f"item_{item_id}"

def sync_get_item(item_id: int):
    # Sync wrapper that calls async code
    return asyncio.run(get_item(item_id))

async def main():
    loop = asyncio.get_event_loop()
    with ThreadPoolExecutor() as executor:
        # This raises LookupError in 7.4.2
        result = await loop.run_in_executor(executor, sync_get_item, 123)
        print(result)

asyncio.run(main())

tcinbis avatar Oct 10 '25 09:10 tcinbis

We are seing this as well.

kthy avatar Oct 10 '25 10:10 kthy

Hello, thanks for reporting this issue. I thought that threads will inherit context value of a main thread (via ContextVar.set )

I’ll thy to fix it ASAP, stay tuned

Krukov avatar Oct 10 '25 21:10 Krukov

Thinking of solutions to this, should template_context be a singleton with thread safe ops? There should only be one instance for all event loops, threads ect?

SoundsSerious avatar Oct 10 '25 21:10 SoundsSerious

template_context is already a "thread safe" singleton (thread/coro local) I moved the default value for template_context because of RUFF. After upgrading, ruff noticed that the default value can be changed, so I moved it from initialization to SET, not knowing it's not the same thing/behavior

I didn't look, but it turns out it's in the ruff documentation already have some right solutions https://docs.astral.sh/ruff/rules/mutable-contextvar-default/
I mean the solution is to handle default value differently

Krukov avatar Oct 11 '25 21:10 Krukov

Should be fine on 7.4.3, Please help me verify it )

Krukov avatar Oct 11 '25 21:10 Krukov

I am still seeing the error on 7.4.3. 🙁

If there is any debug output that would be helpful for you, please let me know.

kthy avatar Oct 14 '25 08:10 kthy

Does anyone know what a safe version of the library to roll back to is?

In leu of a fix, I am wondering if we can encapsulate the use of this complex state introspection with a flag or other key-lookup input condition? This ContextVar feature seems to be state introspective in an interesting but potentially dangerous way. I feel if it would be safer if we could isolate this from the main path of cache access?

SoundsSerious avatar Oct 15 '25 04:10 SoundsSerious

@SoundsSerious we're happy on 7.4.1. (We're also happy on 7.4.3 as long as we use a SQLite disk cache instead of memory.)

kthy avatar Oct 17 '25 07:10 kthy

I am still seeing the error on 7.4.3. 🙁

If there is any debug output that would be helpful for you, please let me know.

@kthy Please share an output ( traceback or something ) )

Krukov avatar Oct 18 '25 13:10 Krukov

The raw traceback output I'm getting is almost 1700 lines, so I've tried paring it down (as well as removing various PII and proprietary info) and posted it here: https://gist.github.com/kthy/29ccacfb00741ef8eccd1e212938444a

This is a FastAPI application using structlog as a logging middleware, which fiddles with contextvars:

from structlog.contextvars import bind_contextvars, clear_contextvars

class LoggingMiddleware:
    async def __call__(
        self, request: Request, call_next: Callable[[Request], Awaitable[Response]]
    ) -> Response:
        clear_contextvars()
        # These context vars will be added to all log entries emitted during the request
        request_id = correlation_id.get()
        bind_contextvars(request_id=request_id)

        start_time = perf_counter_ns()
        # If the call_next raises an error, we still want to return our own 500 response,
        # so we can add headers to it (process time, request ID...)
        response = Response(status_code=500)
        try:
            response = await call_next(request)
        except Exception:
            error("Uncaught exception", exc_info=1)
            raise
        finally:
            process_time = perf_counter_ns() - start_time
            status_code = response.status_code
            # Ref. <https://github.com/encode/starlette/discussions/1513#discussion-3876765>
            url = get_path_with_query_string(request.scope)  # type: ignore[arg-type]
            client_host = maybe(request.client).host
            client_port = maybe(request.client).port
            http_method = request.method
            http_version = request.scope["http_version"]
            # Recreate the Uvicorn access log format, but add all parameters as structured information
            get_logger("api.access").info(
                (
                    f"""{client_host}:{client_port} - "{http_method} {url} """
                    f"""HTTP/{http_version}" {status_code}"""
                ),
                http={
                    "url": str(request.url),
                    "status_code": status_code,
                    "method": http_method,
                    "request_id": request_id,
                    "version": http_version,
                },
                network={"client": {"ip": client_host, "port": client_port}},
                duration=process_time,
            )
            response.headers["X-Process-Time"] = str(process_time / 10**9)
            # Intentional return in finally to ensure response returned even if error occurs
            return response  # noqa: B012

The weirdest thing is that this happens when cashews is disabled, but we can run the app with a SQLite diskcache just fine. 🤷‍♂️

(All this code worked fine with 7.4.1.)

kthy avatar Oct 20 '25 14:10 kthy

Actually, looking at that wall of text I think you just need to scroll to the bottom for the important part:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/wrapper/disa  │
│ ble_control.py:54 in is_full_disable                                                             │
│                                                                                                  │
│   51 │                                                                                           │
│   52 │   @property                                                                               │
│   53 │   def is_full_disable(self) -> bool:                                                      │
│ ❱ 54 │   │   return all(backend.is_full_disable for backend in self._backends.values())          │
│   55                                                                                             │
│                                                                                                  │
│ ╭──────────────────────── locals ─────────────────────────╮                                      │
│ │ self = <cashews.wrapper.Cache object at 0x73428feb6930> │                                      │
│ ╰─────────────────────────────────────────────────────────╯                                      │
│                                                                                                  │
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/wrapper/disa  │
│ ble_control.py:54 in <genexpr>                                                                   │
│                                                                                                  │
│   51 │                                                                                           │
│   52 │   @property                                                                               │
│   53 │   def is_full_disable(self) -> bool:                                                      │
│ ❱ 54 │   │   return all(backend.is_full_disable for backend in self._backends.values())          │
│   55                                                                                             │
│                                                                                                  │
│ ╭────────────────────────────── locals ───────────────────────────────╮                          │
│ │      .0 = <dict_valueiterator object at 0x73421af03010>             │                          │
│ │ backend = <cashews.backends.memory.Memory object at 0x73421d0b0110> │                          │
│ ╰─────────────────────────────────────────────────────────────────────╯                          │
│                                                                                                  │
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/backends/int  │
│ erface.py:214 in is_full_disable                                                                 │
│                                                                                                  │
│   211 │   def is_full_disable(self) -> bool:                                                     │
│   212 │   │   if not self._control_set:                                                          │
│   213 │   │   │   return not self.enable_by_default                                              │
│ ❱ 214 │   │   return self._disable == ALL                                                        │
│   215 │                                                                                          │
│   216 │   def disable(self, *cmds: Command) -> None:                                             │
│   217 │   │   if not cmds:                                                                       │
│                                                                                                  │
│ ╭───────────────────────────── locals ─────────────────────────────╮                             │
│ │ self = <cashews.backends.memory.Memory object at 0x73421d0b0110> │                             │
│ ╰──────────────────────────────────────────────────────────────────╯                             │
│                                                                                                  │
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/backends/int  │
│ erface.py:190 in _disable                                                                        │
│                                                                                                  │
│   187 │                                                                                          │
│   188 │   @property                                                                              │
│   189 │   def _disable(self) -> set[Command]:                                                    │
│ ❱ 190 │   │   return self.__disable.get()                                                        │
│   191 │                                                                                          │
│   192 │   def _set_disable(self, value: set[Command]) -> None:                                   │
│   193 │   │   self.__disable.set(value)                                                          │
│                                                                                                  │
│ ╭───────────────────────────── locals ─────────────────────────────╮                             │
│ │ self = <cashews.backends.memory.Memory object at 0x73421d0b0110> │                             │
│ ╰──────────────────────────────────────────────────────────────────╯                             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
LookupError: <ContextVar name='126727792296208' at 0x73421d076d90>

kthy avatar Oct 20 '25 14:10 kthy

Hope it fixed now (7.4.4) , can somebody confirm it?

Krukov avatar Dec 06 '25 22:12 Krukov

I can confirm that 7.4.4 fixes the issue for us. Thanks!

kthy avatar Dec 08 '25 08:12 kthy

@kthy @Krukov Just want to report that I observed this on version 7.4.1 which was previously considered "fine" , I will try 7.4.4

I am thinking that the ContextVar design is not a good one since it violates a couple rules in the zen of python. I appreciate that this is a cool bit of code but it seems prone to problems, especially if you have async code changing context all of the time. Is there a way that we could allow ContextVar through composition?

Explicit is better than implicit.
Simple is better than complex.
Flat is better than nested.
In the face of ambiguity, refuse the temptation to guess.

SoundsSerious avatar Dec 16 '25 17:12 SoundsSerious