asgiref tweak lifespan spec to require propagating exceptions thrown from lifespan tasks after `'lifespan.startup.complete'`

currently in quart-trio a failure in the app.nursery is swallowed and the web server continues unabated

https://gitlab.com/pgjones/quart-trio/-/issues/21

once an ASGI app acknowledges a lifespan request (with a lifespan.startup.complete event) the server should treat failures in that task as fatal.

Note: Now that Starlette supports trio, this is currently making writing a Starlette TaskGroup middleware a challenge

Jun 30 '21 16:06 graingert

some other ideas:

A new event

Perhaps there should be an additional event - 'lifespan.startup.acknowledge' which can be sent at any time that signals acceptance of the lifespan scope and that any failures are fatal

A sentinel BaseException

or there could be a:

class LifespanFailure(BaseException):
   pass

and lifespan tasks could:

try:
    with anyio.create_task_group() as app.tg:
        ...
except Exception as e:
    raise asgiref.LifespanFailure from e

New asgi servers could extract the .__cause__ attribute of the LifespanFailure and raise that instead. this would be backwards compatible because current asgi servers propagate BaseExceptions already

Jun 30 '21 16:06 graingert

I do agree something needs to happen here - it's important to have explicit failure - but not a huge fan of the acknowledge event model, that just seems a little much, along with the weird exception handling, so propagating exceptions after complete seems reasonable?

Jun 30 '21 18:06 andrewgodwin

I'd like to be able to propagate exceptions before startup complete too, but that's a bigger spec change. Eg if my task is cancelled I won't be able to send "lifespan.startup.failed"

Jun 30 '21 18:06 graingert

So the specific behaviour you would want to change is that the ASGI server swallows any exceptions during startup:

If an exception is raised when calling the application callable with a lifespan.startup message or a scope with type lifespan, the server must continue but not send any lifespan events.

To have a specific exception that does stop the server? I think we can very reasonably support that in a backwards-compatible way, then.

Jun 30 '21 18:06 andrewgodwin

starlette 0.16.0 is now slightly violating the asgi spec here: https://github.com/encode/starlette/blob/e45c5793611bfec606cd9c5d56887ddc67f77c38/starlette/routing.py#L628 and will send "lifespan.shutdown.complete" or "lifespan.shutdown.failed" regardless of how await receive() completes.

Eg if await receive() is completed with any value starlette will reply with "lifespan.shutdown.complete".

importantly: if await receive() throws a CancelledError - if that cancellation was initiated by a lifespan_context TaskGroup or CancelScope starlette will reply with "lifespan.shutdown.complete" even through no shutdown has actually been initiated. additionally if that cancellation was initiated by an exception being thrown into a lifespan_context TaskGroup, starlette propagates it with "lifespan.shutdown.failed"

Jul 19 '21 07:07 graingert

here's an issue where this spec violation bites:

https://gitlab.com/pgjones/hypercorn/-/merge_requests/56/diffs

Jul 27 '21 08:07 graingert

I've been looking into this issue recently for Litestar and it would be great to continue the conversation.

Like others, we have a lifespan context manager that is entered on "lifespan.startup", and exited on "lifespan.shutdown". User's can do anything within this context, including spinning up asynchronous tasks. We even document it for that purpose (ref):

This can be useful when dealing with long running tasks

and starlette seems to promote it for similar (ref):

Consider using anyio.create_task_group() for managing asynchronous tasks.

If an exception is raised within that context but after we have sent "lifespan.startup.complete", should we raise it, emit "lifespan.shutdown.failed", or catch it, log it and keep the lifespan task running? We currently send "lifespan.shutdown.failed" and then re-raise the exception, however servers continue to serve after this (tested this against uvicorn, granian, hypercorn).

Apr 19 '24 06:04 peterschutt

asgiref asgiref copied to clipboard

tweak lifespan spec to require propagating exceptions thrown from lifespan tasks after `'lifespan.startup.complete'`

A new event

A sentinel BaseException

asgiref
asgiref copied to clipboard