LookupError when using cache decorator across async/thread boundaries in 7.4.2
Cashews 7.4.2 raises LookupError: <ContextVar name='template_context'> when cache decorators are used in code that crosses async/thread boundaries via asyncio.run_in_executor(). Commit a47f8ce changed the ContextVar initialization.
# Before (working)
_template_context: ContextVar[dict[str, Any]] = ContextVar("template_context", default={_REWRITE: False})

# After (broken)
_template_context: ContextVar[dict[str, Any]] = ContextVar("template_context")
_template_context.set({_REWRITE: False})
I think this breaks because:
- ContextVar(..., default=value) provides a fallback for .get() in all threads/contexts
- ContextVar.set(value) only sets the value in the current context - new threads don't inherit it
- When run_in_executor() spawns a new thread, it has no context value and no default → LookupError
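To make the last point concrete, here is a minimal sketch (not cashews code; the variable names are made up) showing that a value stored with set() in the main thread is invisible to a new thread, while a constructor default still works:

import threading
from contextvars import ContextVar

# Mirrors the old (working) style: the default is part of the ContextVar itself.
with_default: ContextVar[dict] = ContextVar("with_default", default={"rewrite": False})

# Mirrors the new (broken) style: set() only affects the current context.
set_only: ContextVar[dict] = ContextVar("set_only")
set_only.set({"rewrite": False})


def worker() -> None:
    # A new thread starts with an empty context, so only the constructor default is visible.
    print("with_default:", with_default.get())  # -> {'rewrite': False}
    try:
        set_only.get()
    except LookupError:
        print("set_only: LookupError - the set() in the main thread is not visible here")


t = threading.Thread(target=worker)
t.start()
t.join()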
Here is a minimal script to reproduce the issue (provided under the MIT license):
import asyncio
from concurrent.futures import ThreadPoolExecutor

from cashews import cache

cache.setup("mem://")


@cache(ttl=60, key="item:{item_id}")
async def get_item(item_id: int):
    return f"item_{item_id}"


def sync_get_item(item_id: int):
    # Sync wrapper that calls async code
    return asyncio.run(get_item(item_id))


async def main():
    loop = asyncio.get_event_loop()
    with ThreadPoolExecutor() as executor:
        # This raises LookupError in 7.4.2
        result = await loop.run_in_executor(executor, sync_get_item, 123)
        print(result)


asyncio.run(main())
We are seeing this as well.
Hello, thanks for reporting this issue. I thought that threads would inherit the context value of the main thread (set via ContextVar.set).
I'll try to fix it ASAP, stay tuned.
Thinking of solutions to this: should template_context be a singleton with thread-safe ops? There should only be one instance for all event loops, threads, etc.?
template_context is already a "thread safe" singleton (thread/coro local). I moved the default value for template_context because of ruff: after upgrading, ruff flagged that the default value is mutable, so I moved it from initialization to set(), not realizing that it's not the same behavior.
I didn't look at first, but it turns out the ruff documentation for that rule already suggests the right solutions:
https://docs.astral.sh/ruff/rules/mutable-contextvar-default/
I mean, the solution is to handle the default value differently.
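For example, one way to do that (just a sketch of the pattern from the ruff docs, not necessarily the exact change that will land; _REWRITE here is a stand-in for the real constant) is to give the ContextVar an immutable default and build the mutable dict lazily in an accessor:

from contextvars import ContextVar
from typing import Any

_REWRITE = "rewrite"  # placeholder for the real cashews constant

# An immutable default (None) satisfies the mutable-contextvar-default rule and is
# visible in every thread/context; the mutable dict is created on first access.
_template_context: ContextVar[dict[str, Any] | None] = ContextVar("template_context", default=None)


def _get_template_context() -> dict[str, Any]:
    value = _template_context.get()
    if value is None:
        value = {_REWRITE: False}
        _template_context.set(value)
    return value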
Should be fixed in 7.4.3, please help me verify it )
I am still seeing the error on 7.4.3. 🙁
If there is any debug output that would be helpful for you, please let me know.
Does anyone know which version of the library is safe to roll back to?
In lieu of a fix, I am wondering if we could encapsulate this complex state introspection behind a flag or other key-lookup input condition. This ContextVar feature seems to be state-introspective in an interesting but potentially dangerous way. I feel it would be safer if we could isolate it from the main path of cache access.
@SoundsSerious we're happy on 7.4.1. (We're also happy on 7.4.3 as long as we use a SQLite disk cache instead of memory.)
@kthy Please share the output (a traceback or something) )
The raw traceback output I'm getting is almost 1700 lines, so I've tried paring it down (as well as removing various PII and proprietary info) and posted it here: https://gist.github.com/kthy/29ccacfb00741ef8eccd1e212938444a
This is a FastAPI application using structlog as a logging middleware, which fiddles with contextvars:
from structlog.contextvars import bind_contextvars, clear_contextvars


class LoggingMiddleware:
    async def __call__(
        self, request: Request, call_next: Callable[[Request], Awaitable[Response]]
    ) -> Response:
        clear_contextvars()
        # These context vars will be added to all log entries emitted during the request
        request_id = correlation_id.get()
        bind_contextvars(request_id=request_id)
        start_time = perf_counter_ns()
        # If the call_next raises an error, we still want to return our own 500 response,
        # so we can add headers to it (process time, request ID...)
        response = Response(status_code=500)
        try:
            response = await call_next(request)
        except Exception:
            error("Uncaught exception", exc_info=1)
            raise
        finally:
            process_time = perf_counter_ns() - start_time
            status_code = response.status_code
            # Ref. <https://github.com/encode/starlette/discussions/1513#discussion-3876765>
            url = get_path_with_query_string(request.scope)  # type: ignore[arg-type]
            client_host = maybe(request.client).host
            client_port = maybe(request.client).port
            http_method = request.method
            http_version = request.scope["http_version"]
            # Recreate the Uvicorn access log format, but add all parameters as structured information
            get_logger("api.access").info(
                (
                    f"""{client_host}:{client_port} - "{http_method} {url} """
                    f"""HTTP/{http_version}" {status_code}"""
                ),
                http={
                    "url": str(request.url),
                    "status_code": status_code,
                    "method": http_method,
                    "request_id": request_id,
                    "version": http_version,
                },
                network={"client": {"ip": client_host, "port": client_port}},
                duration=process_time,
            )
            response.headers["X-Process-Time"] = str(process_time / 10**9)
            # Intentional return in finally to ensure response returned even if error occurs
            return response  # noqa: B012
The weirdest thing is that this happens when cashews is disabled, but we can run the app with a SQLite diskcache just fine. 🤷‍♂️
(All this code worked fine with 7.4.1.)
Actually, looking at that wall of text I think you just need to scroll to the bottom for the important part:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/wrapper/disa │
│ ble_control.py:54 in is_full_disable │
│ │
│ 51 │ │
│ 52 │ @property │
│ 53 │ def is_full_disable(self) -> bool: │
│ ❱ 54 │ │ return all(backend.is_full_disable for backend in self._backends.values()) │
│ 55 │
│ │
│ ╭──────────────────────── locals ─────────────────────────╮ │
│ │ self = <cashews.wrapper.Cache object at 0x73428feb6930> │ │
│ ╰─────────────────────────────────────────────────────────╯ │
│ │
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/wrapper/disa │
│ ble_control.py:54 in <genexpr> │
│ │
│ 51 │ │
│ 52 │ @property │
│ 53 │ def is_full_disable(self) -> bool: │
│ ❱ 54 │ │ return all(backend.is_full_disable for backend in self._backends.values()) │
│ 55 │
│ │
│ ╭────────────────────────────── locals ───────────────────────────────╮ │
│ │ .0 = <dict_valueiterator object at 0x73421af03010> │ │
│ │ backend = <cashews.backends.memory.Memory object at 0x73421d0b0110> │ │
│ ╰─────────────────────────────────────────────────────────────────────╯ │
│ │
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/backends/int │
│ erface.py:214 in is_full_disable │
│ │
│ 211 │ def is_full_disable(self) -> bool: │
│ 212 │ │ if not self._control_set: │
│ 213 │ │ │ return not self.enable_by_default │
│ ❱ 214 │ │ return self._disable == ALL │
│ 215 │ │
│ 216 │ def disable(self, *cmds: Command) -> None: │
│ 217 │ │ if not cmds: │
│ │
│ ╭───────────────────────────── locals ─────────────────────────────╮ │
│ │ self = <cashews.backends.memory.Memory object at 0x73421d0b0110> │ │
│ ╰──────────────────────────────────────────────────────────────────╯ │
│ │
│ /home/kristian/src/fastapi-test-cashews/.venv/lib/python3.12/site-packages/cashews/backends/int │
│ erface.py:190 in _disable │
│ │
│ 187 │ │
│ 188 │ @property │
│ 189 │ def _disable(self) -> set[Command]: │
│ ❱ 190 │ │ return self.__disable.get() │
│ 191 │ │
│ 192 │ def _set_disable(self, value: set[Command]) -> None: │
│ 193 │ │ self.__disable.set(value) │
│ │
│ ╭───────────────────────────── locals ─────────────────────────────╮ │
│ │ self = <cashews.backends.memory.Memory object at 0x73421d0b0110> │ │
│ ╰──────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
LookupError: <ContextVar name='126727792296208' at 0x73421d076d90>
Hope it's fixed now (7.4.4), can somebody confirm it?
I can confirm that 7.4.4 fixes the issue for us. Thanks!
@kthy @Krukov Just want to report that I observed this on version 7.4.1, which was previously considered "fine". I will try 7.4.4.
I am thinking that the ContextVar design is not a good one, since it violates a couple of rules in the Zen of Python. I appreciate that this is a cool bit of code, but it seems prone to problems, especially if you have async code changing context all the time. Is there a way we could access the ContextVar through composition instead? (Roughly what I have in mind is sketched after the quotes below.)
Explicit is better than implicit.
Simple is better than complex.
Flat is better than nested.
In the face of ambiguity, refuse the temptation to guess.
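For the record, here is a rough sketch of what I mean by composition. It is purely hypothetical - ScopedState and its methods are made-up names, not the cashews API. The ContextVar is hidden behind a small object with explicit get/set and a guaranteed fallback, so callers never see a LookupError:

from contextvars import ContextVar
from typing import Any


class ScopedState:
    """Explicit wrapper around a ContextVar with a safe fallback (hypothetical)."""

    def __init__(self, name: str, fallback: dict[str, Any]):
        self._var: ContextVar[dict[str, Any] | None] = ContextVar(name, default=None)
        self._fallback = dict(fallback)

    def get(self) -> dict[str, Any]:
        value = self._var.get()
        # Fall back explicitly instead of raising in fresh threads/contexts.
        return value if value is not None else dict(self._fallback)

    def set(self, value: dict[str, Any]) -> None:
        self._var.set(value)


template_context = ScopedState("template_context", {"rewrite": False})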