databases
databases copied to clipboard
Using ContextVar to hold connections does not guarantee they are always unique to tasks.
I think there's a fairly fundamental problem with the way that database connections are managed using ContextVar.
Database.connection() is designed to return a Connection object that is unique to an asyncio Task, but because of the way that context is copied during Task creation, this is not always the case.
Consider the following code:
import asyncio
from databases import Database
database = Database("postgresql://example:password@localhost/example")
async def t1():
await database.connect()
c1 = database.connection()
async def inner():
c2 = database.connection()
assert c1 is not c2 # this fails
await asyncio.gather(inner())
if __name__ == "__main__":
loop = asyncio.get_event_loop()
loop.run_until_complete(t1())
This raises an assertion, because when gather creates a new task, it copies the current context into the newly created coroutine.
Tasks support the contextvars module. When a Task is created it copies the current context and later runs its coroutine in the copied context.
- https://docs.python.org/3/library/asyncio-task.html#task-object
This is the fundamental problem behind https://github.com/encode/databases/issues/134 and possibly several other issues. In general it makes it impossible to safely use this library in any code that uses gather or create_task.
Unfortunately I'm not sure what the best way of fixing this is. The only idea I've had so far is storing the connection as an attribute of asyncio.current_task() rather than in a ContextVar, but that feels a bit hacky.
This is (almost) a duplicate of https://github.com/encode/databases/issues/176 I posted some workarounds there. I am opening new connections in my prod and that works very well; my opinion is that this ContextVar feature is harmful and should be removed. It does not save much performance (I py-spy regularly) and leads to many problems, some of them are performance-critical.
@trvrm This is probably fixed by #328 . Please re-open if it still exists.
@aminalaee unfortunately this does not fix this issue. The sample code above still fails.
The problem is still that although the code comments in core.py say "Connections are stored as task-local state", the way that python contextvars work is that when you create a new task, the context variables from the parent task are copied to the child.
Unfortunately I don't think I have permission to re-open this issue - is that something you can do?
@trvrm Thanks, I'll take a look into it.
Thanks for the great write up @trvrm - I agree about how connections should not be inherited to child tasks, but rather new connections should be acquired. In #546, I think I have a solution. Each database instance stores a WeakKeyDictionary[asyncio.Task, databases.core.Connection] so that each task can have it's own connection, and we still get a neat bit of cleanup by using weakrefs on the tasks. Hopefully this helps address the concurrent woes that keep coming up across quite a few issues in this repo.