How to safely make sure that main() exit stops the loop?

Open sersorrel opened this issue 4 months ago • 3 comments

Some context: I use aiorun.run(main(), stop_on_unhandled_errors=True) to make sure that main crashing will exit the program, because any exception in main probably indicates a programmer error (and a service manager can restart the process once it exits).

Unfortunately, main being cancelled (e.g. because something it awaits was cancelled) is not treated as an "unhandled error", even though from my perspective it's unexpected behaviour and I want the process to exit and be restarted, rather than sitting there doing nothing. So, my main looks like this:

import asyncio

async def main():
    try:
        await business_logic()
    finally:
        # Runs however business_logic ends: normal return, exception, or cancellation.
        asyncio.get_running_loop().stop()

in an effort to ensure that, no matter what, if business_logic doesn't work properly, at least the program will exit.

However, this causes a new problem – when I ctrl-C the app, it prints an ugly traceback:

Traceback (most recent call last):
  File "/home/ubuntu/app/.venv/bin/app", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/app/src/app/server/__init__.py", line 535, in main
    aiorun.run(async_main(), stop_on_unhandled_errors=True)
  File "/home/ubuntu/app/.venv/lib/python3.10/site-packages/aiorun.py", line 352, in run
    loop.run_until_complete(
  File "/usr/lib/python3.10/asyncio/base_events.py", line 647, in run_until_complete
    raise RuntimeError('Event loop stopped before Future completed.')
RuntimeError: Event loop stopped before Future completed.

I think the flow of events here is something like:

  • I hit ctrl-C
  • aiorun's _shutdown_handler runs (with a short diversion via loop.call_soon_threadsafe) and calls loop.stop()
  • pending tasks are polled by asyncio to see if they can make progress (that is, this iteration of the event loop runs to completion)
  • the loop actually stops, and aiorun's "shutdown phase" begins
  • pending tasks are cancelled by aiorun, to signal that they should tidy up and exit
  • aiorun starts up the loop again, and waits for all the pending (cancelled) tasks
  • one of those tasks is main, which hits its finally clause and calls loop.stop()!
  • this iteration of the event loop runs to completion
  • the loop actually stops, again – but there's a task that didn't complete (perhaps it had more cleanup work to do than main, and would've completed given a few more event loop iterations)
  • ugly traceback
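The sequence above can be reproduced without aiorun. The following is a minimal sketch using only asyncio. The two-phase runner in `reproduce` is a hand-written stand-in for what aiorun is believed to do (stop, cancel, then wait), not aiorun's actual code, and the names `main_like`, `slow_cleanup` and `reproduce` are hypothetical.

```python
import asyncio


async def slow_cleanup():
    """A task whose cleanup needs more loop iterations than main's does."""
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        await asyncio.sleep(0.05)  # pretend to tidy up
        raise


async def main_like():
    """Same shape as main above: always stop the loop on the way out."""
    try:
        await asyncio.sleep(3600)  # stand-in for business_logic()
    finally:
        asyncio.get_running_loop().stop()


def reproduce():
    loop = asyncio.new_event_loop()
    tasks = [loop.create_task(main_like()), loop.create_task(slow_cleanup())]

    # Phase 1: run until something stops the loop (plays the role of ctrl-C).
    loop.call_later(0.01, loop.stop)
    loop.run_forever()

    # Phase 2: cancel everything and wait for the tasks to finish.
    for task in tasks:
        task.cancel()
    waiter = asyncio.gather(*tasks, return_exceptions=True)
    try:
        loop.run_until_complete(waiter)
        return None
    except RuntimeError as exc:
        # main_like's finally clause stopped the loop while slow_cleanup
        # was still tidying up, so the waiter never completed.
        return str(exc)
    finally:
        if not waiter.done():
            loop.run_until_complete(waiter)  # drain so nothing is left pending
        loop.close()
```

Calling `reproduce()` returns the same message as in the traceback, which suggests the second `loop.stop()` from the `finally` clause is indeed what interrupts the shutdown wait.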

Is there a way I can solve this? How can main tell whether it's been cancelled because it was waiting on something that got cancelled (i.e. it must stop the loop) vs because aiorun detected a ctrl-C and is running cancelled tasks to completion (in which case aiorun will stop the loop)?

sersorrel, Oct 15 '24 09:10