di
feat: automatic scope inference
General idea: in most cases, only a couple of dependencies have a well-defined scope (for example, an HTTP request is tied to the "request" scope). For the rest, users don't care what the scope is, and explicitly declaring it is cumbersome and error-prone (for example, having to make sure that a dependency is always declared with the same scope). This change removes that burden by letting the framework figure out the scope for each dependency. The main rule is: assign the highest (outermost) valid scope. So in the web framework example, we assign the "app" scope by default unless the dependency depends on an HTTP request, in which case it gets the "request" scope.
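The rule above can be sketched in a few lines. This is a toy model, not di's implementation: the `SCOPES` ordering, the `infer_scope` helper, and the dependency names are all hypothetical. A dependency with an explicit scope keeps it; anything else gets the outermost scope that is not outer to any of its own dependencies.

```python
from typing import Dict, List

# Scopes ordered from outermost (longest lived) to innermost.
SCOPES: List[str] = ["app", "request"]


def infer_scope(
    name: str,
    graph: Dict[str, List[str]],
    explicit: Dict[str, str],
) -> str:
    """Return the outermost scope that is valid for `name` (toy model)."""
    if name in explicit:
        return explicit[name]
    # Start at the outermost scope and move inward only as far as
    # the dependencies of `name` require.
    index = 0
    for dep in graph.get(name, []):
        index = max(index, SCOPES.index(infer_scope(dep, graph, explicit)))
    return SCOPES[index]


# Hypothetical graph: each key depends on the names in its list.
graph = {
    "http_request": [],
    "config": [],
    "current_user": ["http_request", "config"],
}
# Only the HTTP request has a scope declared by hand.
explicit = {"http_request": "request"}

inferred = {name: infer_scope(name, graph, explicit) for name in graph}
print(inferred)
```

In this sketch `config` ends up in the "app" scope, while `current_user` is pulled into "request" because it depends on the request object.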
Codecov Report
Merging #77 (503b4a9) into main (bea5ace) will decrease coverage by 0.03%. The diff coverage is 96.42%.
@@ Coverage Diff @@
## main #77 +/- ##
==========================================
- Coverage 97.23% 97.20% -0.04%
==========================================
Files 70 71 +1
Lines 2131 2179 +48
Branches 397 409 +12
==========================================
+ Hits 2072 2118 +46
- Misses 51 52 +1
- Partials 8 9 +1
Impacted Files | Coverage Δ | |
---|---|---|
docs_src/default_scope.py | 94.44% <94.44%> (ø) | |
di/_utils/types.py | 100.00% <100.00%> (ø) | |
di/api/scopes.py | 100.00% <100.00%> (ø) | |
di/container/_container.py | 100.00% <100.00%> (ø) | |
di/container/_solving.py | 100.00% <100.00%> (ø) | |
tests/docs/test_scopes_example.py | 100.00% <100.00%> (ø) | |
One issue / breaking change with this proposal is that in Xpresso dependencies get the "connection" scope by default, and now they would get the "app" scope (unless they depend on an HTTP request object). I think it is pretty common to have:
def endpoint(db: Annotated[Connection, Depends(get_db_connection)]):
    ...
Here users would expect db to have a "connection" scope, but with this change they'd have to specify that scope manually. The main issues with this are:
- FastAPI compatibility
- Intuitiveness for users. In some situations the "request" scope is safer.
On the other hand, Spring Boot (arguably the most widely used DI framework out there) uses "singleton" as its default scope: https://www.tutorialspoint.com/spring/spring_bean_scopes.htm
After mulling it over a bit, I think this would be a bad change as currently proposed.
The main case that demonstrates this is defining a database connection:
async def get_db() -> Any:
    ...

DBConnection = Annotated[Any, Marker(get_db)]

async def endpoint(db: DBConnection) -> None:
    ...
Here get_db does not depend on the Request object or anything like that, so it would be assigned the "app" scope, which means every single request would share the same database connection! This is really bad default behavior and would be terrible to debug.
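To make the problem concrete, here is a small simulation of scope-based caching. It does not use di at all: `handle_requests` and the two caches are hypothetical stand-ins for how a value cached in the "app" scope outlives individual requests, while a "request" scoped value is rebuilt each time.

```python
import itertools
from typing import Callable, Dict, List

_ids = itertools.count(1)


def get_db() -> str:
    # Stand-in for opening a real connection; every call yields a new id.
    return f"connection-{next(_ids)}"


def handle_requests(scope: str, n: int) -> List[str]:
    """Simulate n requests where get_db is cached under `scope`."""
    app_cache: Dict[Callable[[], str], str] = {}
    seen: List[str] = []
    for _ in range(n):
        # A fresh cache per request models the "request" scope.
        request_cache: Dict[Callable[[], str], str] = {}
        cache = app_cache if scope == "app" else request_cache
        if get_db not in cache:
            cache[get_db] = get_db()
        seen.append(cache[get_db])
    return seen


app_scoped = handle_requests("app", 3)
request_scoped = handle_requests("request", 3)
print(app_scoped)
print(request_scoped)
```

With the "app" scope all three simulated requests see the same connection; with the "request" scope each one gets its own.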
So, despite what Spring Boot does, it makes a lot more sense to follow what FastAPI does and make the "default" scope bound to the Request.
But we can still use some of the ideas here: we can add a parameter default_scope: Optional[Scope] = None to Container.solve so that this can be set on a per-DAG basis. Then when solving lifespans/startup events the default scope can be "app", while for endpoints it can be "connection".
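A rough sketch of the per-DAG default, under loud assumptions: the `solve` function below is a toy stand-in written for this comment, not the real Container.solve, and the graph is a plain dict of hypothetical names. It only shows the intended effect: the same unscoped dependency picks up a different scope depending on which DAG it is solved in.

```python
from typing import Dict, List, Optional


def solve(
    root: str,
    graph: Dict[str, List[str]],
    explicit: Dict[str, str],
    default_scope: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Toy solver: walk one DAG and pick a scope for every dependency."""
    resolved: Dict[str, Optional[str]] = {}

    def visit(name: str) -> None:
        if name in resolved:
            return
        for dep in graph.get(name, []):
            visit(dep)
        # An explicit scope wins; otherwise use this DAG's default.
        resolved[name] = explicit.get(name, default_scope)

    visit(root)
    return resolved


# Hypothetical graph: get_connection is used by both DAGs, with no scope set.
graph = {"lifespan": ["get_connection"], "endpoint": ["get_connection"]}

startup = solve("lifespan", graph, explicit={}, default_scope="app")
per_request = solve("endpoint", graph, explicit={}, default_scope="connection")
print(startup["get_connection"], per_request["get_connection"])
```

The point of the design is that the caller who knows the context (startup vs. request handling) chooses the default, instead of every dependency declaring it.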
Startup event before:
async def lifespan(conn: Annotated[Connection, Marker(get_connection, scope="app")]) -> AsyncIterator[None]:
    yield
scope="app" is necessary just so that Connection gets the same scope as lifespan itself.
With this proposal:
async def lifespan(conn: Annotated[Connection, Marker(get_connection)]) -> AsyncIterator[None]:
    yield
Now we no longer need that boilerplate!
Endpoints would mostly stay the same, but also get a bit better.
In Xpresso we have 2 Request scopes:
- "connection": teardown is run after the response is sent and the connection is closed
- "endpoint": teardown is run right after the endpoint, before sending the response to the client
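The ordering of the two scopes can be illustrated with nested context managers. This is only a model of the behavior described in the list above, not Xpresso code; `scoped` and `handle_request` are hypothetical names.

```python
from contextlib import contextmanager
from typing import Iterator, List

events: List[str] = []


@contextmanager
def scoped(name: str) -> Iterator[None]:
    # Record when a scope is entered and when its teardown runs.
    events.append(f"enter {name}")
    try:
        yield
    finally:
        events.append(f"teardown {name}")


def handle_request() -> None:
    with scoped("connection"):
        with scoped("endpoint"):
            events.append("run endpoint")
        # The "endpoint" teardown has already run here, so an error
        # raised during it could still affect the response.
        events.append("send response")
    # The "connection" teardown runs only after the response is sent.


handle_request()
print(events)
```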
The "endpoint" scope is good for things that might fail during teardown, where the failure needs to be propagated to the client (e.g. committing a database transaction). But currently, if you have something "endpoint" scoped, you have to declare everything that depends on it with the "endpoint" scope!
async def get_transaction() -> AsyncIterator[None]:
    yield

Transaction = Annotated[None, Marker(get_transaction, scope="endpoint")]

@dataclass
class Repo:
    transaction: Transaction

def endpoint(repo: Annotated[Repo, Marker(scope="endpoint")]) -> None:
    ...
With this change this would become:
async def get_transaction() -> AsyncIterator[None]:
    yield

Transaction = Annotated[None, Marker(get_transaction, scope="endpoint")]

@dataclass
class Repo:
    transaction: Transaction

def endpoint(repo: Repo) -> None:
    ...
The outermost valid scope for Repo is "endpoint", so it gets that automatically instead of forcing the user to specify it manually.
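As a sanity check on that reasoning, here is a toy version of the resolution rule combined with a per-DAG default. None of this is di's actual code: `SCOPES`, `resolve_scope`, and the dict-based graph are assumptions made for illustration. An unscoped dependency starts at the default scope and is pushed inward only if one of its dependencies lives in a more inner scope.

```python
from typing import Dict, List

# Outermost first; "endpoint" is the innermost scope.
SCOPES: List[str] = ["app", "connection", "endpoint"]


def resolve_scope(
    name: str,
    graph: Dict[str, List[str]],
    explicit: Dict[str, str],
    default_scope: str,
) -> str:
    """Toy rule: default scope, narrowed by the innermost dependency."""
    if name in explicit:
        return explicit[name]
    index = SCOPES.index(default_scope)
    for dep in graph.get(name, []):
        dep_scope = resolve_scope(dep, graph, explicit, default_scope)
        index = max(index, SCOPES.index(dep_scope))
    return SCOPES[index]


# Mirrors the example above; get_settings is an extra hypothetical
# dependency with no sub-dependencies and no explicit scope.
graph = {
    "get_transaction": [],
    "Repo": ["get_transaction"],
    "get_settings": [],
}
explicit = {"get_transaction": "endpoint"}

repo_scope = resolve_scope("Repo", graph, explicit, "connection")
settings_scope = resolve_scope("get_settings", graph, explicit, "connection")
print(repo_scope, settings_scope)
```

Repo is pulled into "endpoint" by its transaction, while the unrelated `get_settings` stays at the "connection" default.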