bittensor/axon.py: thread and exception handling

Open mvds00 opened this issue 1 year ago • 1 comments

Bug

Several issues were encountered while trying to run and understand the e2e tests:

  • if uvicorn fails to start, an uncaught exception is emitted to stderr
  • axon keeps spinning, waiting for self.started, indefinitely
  • exceptions are not propagated from threads
  • there is no way to (simply) test from the outside whether an axon started and/or runs
  • axon creates a thread that only creates another thread, which seems redundant
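
The third point is standard Python behaviour and can be reproduced without bittensor or uvicorn. The following is a minimal sketch, not code from axon.py; `failing_worker` is a hypothetical stand-in for a server thread whose startup fails. By default the exception is only printed to stderr by `threading.excepthook`; the hook is replaced here so the demo can capture it instead.

```python
import threading

captured = []

# By default this hook prints the traceback to stderr.
# Replace it so the exception is recorded instead of printed.
threading.excepthook = lambda args: captured.append(args.exc_value)


def failing_worker():
    # Hypothetical stand-in for a server thread that fails during startup.
    raise RuntimeError("startup failed")


t = threading.Thread(target=failing_worker)
t.start()
t.join()  # returns normally: the exception is NOT re-raised in the caller

# The caller only learns about the failure if it inspects state explicitly.
thread_alive = t.is_alive()
```

A caller that merely waits for a "started" flag never sees the failure, which is why the thread's liveness and a stored exception both need to be checked.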

Description of the Change

This patch addresses some of these issues, in FastAPIThreadedServer:

  • add thread-safe set_exception() and get_exception() to record and retrieve exceptions raised in the server thread
  • run_in_thread() yields the created thread, so that the code using it can check whether the thread is alive
  • uvicorn.Server.startup() is wrapped to set a thread-safe flag using self.set_started(True) to indicate startup succeeded
  • run_in_thread() times out after one second, to prevent an infinite loop if self.get_started() never becomes True
  • run_in_thread() raises an exception if it fails to start the thread
  • _wrapper_run() tests whether the thread is still alive
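
The pattern described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `ThreadedServerSketch`, the `serve` callable and the `should_exit` event are hypothetical stand-ins for FastAPIThreadedServer and uvicorn, and only the names set_started/get_started, set_exception/get_exception and run_in_thread are taken from the description.

```python
import contextlib
import threading
import time


class ThreadedServerSketch:
    """Hypothetical stand-in for FastAPIThreadedServer (no uvicorn involved)."""

    def __init__(self, serve):
        self._serve = serve  # assumed shape: callable(on_started, should_exit)
        self._lock = threading.Lock()
        self._started = False
        self._exception = None
        self.should_exit = threading.Event()

    def set_started(self, value):
        with self._lock:
            self._started = value

    def get_started(self):
        with self._lock:
            return self._started

    def set_exception(self, exc):
        with self._lock:
            self._exception = exc

    def get_exception(self):
        with self._lock:
            return self._exception

    def _run(self):
        try:
            # The server reports successful startup through the callback.
            self._serve(lambda: self.set_started(True), self.should_exit)
        except Exception as exc:
            self.set_exception(exc)  # keep the error for the caller
        finally:
            self.set_started(False)

    @contextlib.contextmanager
    def run_in_thread(self, timeout=1.0):
        thread = threading.Thread(target=self._run, daemon=True)
        thread.start()
        deadline = time.monotonic() + timeout
        while not self.get_started():
            # Give up if the thread died or the timeout elapsed.
            if not thread.is_alive() or time.monotonic() > deadline:
                self.should_exit.set()
                raise RuntimeError("server failed to start") from self.get_exception()
            time.sleep(0.001)
        try:
            yield thread  # the caller can check thread.is_alive()
        finally:
            self.should_exit.set()
            thread.join()


def good_serve(on_started, should_exit):
    on_started()
    should_exit.wait()


def bad_serve(on_started, should_exit):
    raise OSError("address already in use")


server = ThreadedServerSketch(good_serve)
with server.run_in_thread() as thread:
    alive_while_running = thread.is_alive()

failed = ThreadedServerSketch(bad_serve)
try:
    with failed.run_in_thread():
        pass
    start_error = None
except RuntimeError as err:
    start_error = err
```

With this shape a failed startup surfaces as an exception in the calling thread instead of an endless wait, and the original error stays available through get_exception().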

and in class axon, the following are added:

  • @property axon.exception, returning any exception recorded by the server thread
  • axon.is_running(), returning True when the axon is operational
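
How these two status calls might sit on top of the server object is sketched below. `AxonSketch` and `_ServerStub` are hypothetical, and the definition of "operational" used here (thread alive, startup flag set, no stored exception) is an assumption; the actual patch may define it differently.

```python
import threading


class _ServerStub:
    """Hypothetical stand-in for axon.fast_server."""

    def __init__(self):
        self._lock = threading.Lock()
        self._started = False
        self._exception = None

    def set_started(self, value):
        with self._lock:
            self._started = value

    def get_started(self):
        with self._lock:
            return self._started

    def get_exception(self):
        with self._lock:
            return self._exception


class AxonSketch:
    """Hypothetical facade: callers never touch fast_server directly."""

    def __init__(self):
        self.fast_server = _ServerStub()
        self._thread = None

    @property
    def exception(self):
        # Any exception stored by the server thread, else None.
        return self.fast_server.get_exception()

    def is_running(self):
        # Assumed definition of "operational"; see the note above.
        thread = self._thread
        return (
            thread is not None
            and thread.is_alive()
            and self.fast_server.get_started()
            and self.exception is None
        )


stop = threading.Event()
axon = AxonSketch()
before = axon.is_running()

# Simulate a successfully started server thread.
axon._thread = threading.Thread(target=stop.wait, daemon=True)
axon._thread.start()
axon.fast_server.set_started(True)
during = axon.is_running()

# Simulate shutdown.
stop.set()
axon._thread.join()
after = axon.is_running()
```

A test can then poll `axon.is_running()` and inspect `axon.exception` without knowing how the server thread is implemented.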

The seemingly redundant thread is left in until feedback is received on the reasons for including it.

Alternate Designs

The status calls are deliberately added to class axon, so that the user instantiating an axon does not have to look into implementation details such as axon.fast_server, even though those are not explicitly marked private by the naming convention used.

Possible Drawbacks

None expected. Issues that are currently buried may suddenly surface; this is an intended and expected effect of improving error handling.

Verification Process

These changes are part of an effort to improve the e2e tests; as such, they belong to a larger set of developments and were tested along with those other changes.

Release Notes

N/A

mvds00, Aug 13 '24 15:08