trio icon indicating copy to clipboard operation
trio copied to clipboard

AsyncExitStack doesn't preserve open_nursery's behaviour around __context__

Open dphoyes opened this issue 4 years ago • 13 comments

Consider this program:

import trio
import contextlib

async def background():
    assert False

async def main():
    async with contextlib.AsyncExitStack() as stack:
        n = await stack.enter_async_context(trio.open_nursery())
        n.start_soon(background)
        await trio.sleep_forever()

trio.run(main)

We get an AssertionError as expected, but the traceback looks like this:

Traceback (most recent call last):
  File "/Users/dphoyes/devel/experiment.py", line 11, in main
    await trio.sleep_forever()
  File "/Users/dphoyes/devel/venv/lib/python3.9/site-packages/trio/_timeouts.py", line 40, in sleep_forever
    await trio.lowlevel.wait_task_rescheduled(lambda _: trio.lowlevel.Abort.SUCCEEDED)
  File "/Users/dphoyes/devel/venv/lib/python3.9/site-packages/trio/_core/_traps.py", line 166, in wait_task_rescheduled
    return (await _async_yield(WaitTaskRescheduled(abort_func))).unwrap()
  File "/Users/dphoyes/devel/venv/lib/python3.9/site-packages/outcome/_impl.py", line 138, in unwrap
    raise captured_error
  File "/Users/dphoyes/devel/venv/lib/python3.9/site-packages/trio/_core/_run.py", line 1172, in raise_cancel
    raise Cancelled._create()
trio.Cancelled: Cancelled

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/dphoyes/devel/experiment.py", line 13, in <module>
    trio.run(main)
  File "/Users/dphoyes/devel/venv/lib/python3.9/site-packages/trio/_core/_run.py", line 1932, in run
    raise runner.main_task_outcome.error
  File "/Users/dphoyes/devel/experiment.py", line 11, in main
    await trio.sleep_forever()
  File "/opt/homebrew/Cellar/[email protected]/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 651, in __aexit__
    raise exc_details[1]
  File "/opt/homebrew/Cellar/[email protected]/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 634, in __aexit__
    cb_suppress = await cb(*exc_details)
  File "/Users/dphoyes/devel/venv/lib/python3.9/site-packages/trio/_core/_run.py", line 815, in __aexit__
    raise combined_error_from_nursery
  File "/Users/dphoyes/devel/experiment.py", line 5, in background
    assert False
AssertionError

In other words, the AssertionError's __context__ is being set to that of the Cancelled exception. If we swap the use of AsyncExitStack out for the equivalent async with trio.open_nursery() as n:, we get the much simpler traceback that we'd expect.

This is admittedly a relatively benign issue, and it's not clear to me whether it's actually Trio at fault here or contextlib. In my real app though, there's enough nesting of cancellations to make this turn a 46 line traceback into a 42227 line one, which makes debugging a bit "interesting". I've worked around it by eliminating all usage of AsyncExitStack, but I thought it worth sharing anyway.

dphoyes avatar May 08 '21 12:05 dphoyes

Uh, it may be benign but it's really blowing up our tracebacks. This is a comparatively simple example, you can easily build one where the real problem is buried in a veritable avalance of cancellations.

smurfix avatar May 08 '21 15:05 smurfix

That output sucks, yeah.

I'm not sure if there's anything we can do about it, short of like, monkeypatching contextlib somehow... I think this is the line where the __context__ is getting added:

  File "/opt/homebrew/Cellar/[email protected]/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 651, in __aexit__
    raise exc_details[1]

Maybe contextlib would be up for fixing this upstream? The hack would be something like:

    saved_context = exc_details[1].__context__
    try:
        raise exc_details[1]
    finally:
        exc_details[1].__context__ = saved_context

njsmith avatar May 08 '21 18:05 njsmith

Is the next step to file something to bpo? (I don't think I could describe this well myself)

Isn't this obviously a contextlib bug-- I mean shouldn't enter_async_context() be equivalent to an explicit async with?

Maybe contextlib would be up for fixing this upstream? The hack would be something like:

what makes it a hack?

belm0 avatar Jul 08 '21 04:07 belm0

This should probably be in documentation somewhere right? It's gotten me several times when dealing with frameworks that started out as asyncio but were modified to work for trio and therefore have separate async setup and teardown methods.

I was using AsyncExitStack to manage multiple separate sub-components within this framework (quart-trio). Now it makes sense why I never once got anything helpful in a traceback and had to start putting log.exception('') in wrapped try-blocks everywhere in the sub-components.

I later avoided the setup/teardown hooks altogether and I do all my setup outside of asgi before even calling into hypercorn. Works perfectly.

parity3 avatar Jul 09 '21 16:07 parity3

I've tried opening a bpo issue: https://bugs.python.org/issue44594

belm0 avatar Jul 10 '21 00:07 belm0

   saved_context = exc_details[1].__context__
   try:
       raise exc_details[1]
   finally:
       exc_details[1].__context__ = saved_context

how about changing the existing code to raise from None?

belm0 avatar Jul 12 '21 00:07 belm0

Maybe contextlib would be up for fixing this upstream? The hack would be something like:

    saved_context = exc_details[1].__context__
    try:
        raise exc_details[1]
    finally:
        exc_details[1].__context__ = saved_context

This doesn't seem to fix the problem (at least in my naive attempt to apply it to contextlib.py)-- __context__ still ends up as Cancelled.

https://github.com/python/cpython/pull/27089

belm0 avatar Jul 12 '21 04:07 belm0

To clarify the problem case, I believe Trio is doing this:

exc = foo()
try:
    raise exc
finally:
    exc.__context__ = None

Neither ExitStack nor AsyncExcitStack are preserving the None context.

class MyException(Exception):
    pass

@contextmanager
def my_cm():
    try:
        yield
    except BaseException:
        exc = MyException()
        try:
            raise exc
        finally:
            exc.__context__ = None

print('\n=== `with` statement ===')
try:
    with my_cm():
        assert False
except BaseException as e:
    traceback.print_exc()

print('\n=== enter_context() ===')
try:
    with ExitStack() as stack:
        stack.enter_context(my_cm())
        assert False
except BaseException as e:
    traceback.print_exc()
=== `with` statement ===
Traceback (most recent call last):
  File "exit_stack_test.py", line 251, in <module>
    assert False
  File "/.../python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "exit_stack_test.py", line 244, in my_cm
    raise exc
MyException

=== enter_context() ===
Traceback (most recent call last):
  File "exit_stack_test.py", line 240, in my_cm
    yield
  File "exit_stack_test.py", line 259, in <module>
    assert False
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "exit_stack_test.py", line 259, in <module>
    assert False
  File "/.../python3.7/contextlib.py", line 524, in __exit__
    raise exc_details[1]
  File "/.../python3.7/contextlib.py", line 509, in __exit__
    if cb(*exc_details):
  File "/.../python3.7/contextlib.py", line 377, in _exit_wrapper
    return cm_exit(cm, exc_type, exc, tb)
  File "/.../python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "exit_stack_test.py", line 244, in my_cm
    raise exc
MyException

belm0 avatar Jul 12 '21 06:07 belm0

So back to this fix, it doesn't seem to work because __context__ is already not None before raising.

    saved_context = exc_details[1].__context__
    try:
        raise exc_details[1]
    finally:
        exc_details[1].__context__ = saved_context

There may be an issue with _fix_exception_context().

https://github.com/python/cpython/blob/e14d5ae5447ae28fc4828a9cee8e9007f9c30700/Lib/contextlib.py#L667-L682

And I think the existing contextlib code is already trying to override __context__:

            try:
                fixed_ctx = exc_details[1].__context__
                raise exc_details[1]
            except BaseException:
                exc_details[1].__context__ = fixed_ctx
                raise

belm0 avatar Jul 12 '21 06:07 belm0

Looking at _fix_exception_context(), there is not much wiggle room. Ultimately it's always going to overwrite __context__ of new_exc unless __context__ is already set to the old exception.

The only way out seems to modify the None handling, from:

if exc_context is old_exc:
    # Context is already set correctly (see issue 20317)
    return
if exc_context is None or exc_context is frame_exc:
    break

to

if exc_context is old_exc or exc_context is None:
    # Context is already set correctly (see issue 20317)
    return
if exc_context is frame_exc:
    break

but I'm not sure whether that change can be rationalized. I suspect that __context__ will already be filled in by the time ExitStack gets the exception, so the implementation's current handling of None is not necessary, and not handling the case they thought it was.

belm0 avatar Jul 12 '21 07:07 belm0

I'll try being histrionic:

this is destroying our developer experience, killing tooling such as aggregation of unhandled exceptions by Sentry, and undoing all the effort of https://vorpus.org/blog/beautiful-tracebacks-in-trio-v070/

cancelled

belm0 avatar Sep 30 '21 06:09 belm0

we're literally having 150,000 line backtraces in our app

belm0 avatar Oct 01 '21 05:10 belm0

contextlib fix is merged to cpython head for python 3.11, and will be backported to 3.10 and 3.9

For earlier versions (which are retired or only accepting security patches), monkeypatch of AsyncExitStack.__aexit__ or derived class like the following will work (confirmed on Python 3.8).

class FixedAsyncExitStack(AsyncExitStack):
    """
    AsyncExitStack patched for https://bugs.python.org/issue44594,
    which causes excessive tracebacks in Trio.
    """
    async def __aexit__(self, *exc_details):
        received_exc = exc_details[0] is not None

        frame_exc = sys.exc_info()[1]

        def _fix_exception_context(new_exc, old_exc):
            while 1:
                exc_context = new_exc.__context__
                if exc_context is None or exc_context is old_exc:
                    return
                if exc_context is frame_exc:
                    break
                new_exc = exc_context
            new_exc.__context__ = old_exc

        suppressed_exc = False
        pending_raise = False
        while self._exit_callbacks:
            is_sync, cb = self._exit_callbacks.pop()
            try:
                if is_sync:
                    cb_suppress = cb(*exc_details)
                else:
                    cb_suppress = await cb(*exc_details)

                if cb_suppress:
                    suppressed_exc = True
                    pending_raise = False
                    exc_details = (None, None, None)
            except:
                new_exc_details = sys.exc_info()
                _fix_exception_context(new_exc_details[1], exc_details[1])
                pending_raise = True
                exc_details = new_exc_details
        if pending_raise:
            try:
                fixed_ctx = exc_details[1].__context__
                raise exc_details[1]
            except BaseException:
                exc_details[1].__context__ = fixed_ctx
                raise
        return received_exc and suppressed_exc

belm0 avatar Oct 05 '21 05:10 belm0