
miner.py: implement checks and error handling for failing axons

Open mvds00 opened this issue 1 year ago • 0 comments

The miner template seems to assume that starting an axon never fails. In e2e tests (which use this template as their miner), the axon failed to start because nest_asyncio in bittensor was mixed with regular asyncio in uvicorn. It would be better to terminate the miner process when the axon fails to start.

This patch addresses this by:

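  • wrapping run() in a try/except (a must in any threaded Python application, since an uncaught exception otherwise ends only the worker thread while the main thread carries on unaware)
  • signalling exceptions from the worker thread to the main thread in a thread-safe manner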
  • terminating the miner if starting the axon fails
  • monitoring and reporting on whether the axon still runs
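
The four points above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: `StubAxon` and `Miner` are hypothetical stand-ins (not the bittensor classes), and the `is_running()` method and `exception` attribute on the stub only mimic the interface this issue proposes. The worker's `run()` is wrapped in a try/except, any exception is handed to the main thread through a `queue.Queue`, and `start()` re-raises it so the caller can terminate the process.

```python
import queue
import threading
import time


class StubAxon:
    """Hypothetical stand-in for an axon; NOT the bittensor API."""

    def __init__(self, fail_on_start=False):
        self.fail_on_start = fail_on_start
        self.exception = None
        self._running = threading.Event()

    def start(self):
        if self.fail_on_start:
            self.exception = RuntimeError("axon failed to start")
            raise self.exception
        self._running.set()

    def stop(self):
        self._running.clear()

    def is_running(self):
        return self._running.is_set()


class Miner:
    """Hypothetical miner skeleton showing only the error-handling path."""

    def __init__(self, axon):
        self.axon = axon
        self.errors = queue.Queue()          # thread-safe channel: worker -> main
        self.started = threading.Event()     # set once the axon is up
        self.should_exit = threading.Event()
        self.thread = threading.Thread(target=self._run_guarded, daemon=True)

    def run(self):
        self.axon.start()
        self.started.set()
        while not self.should_exit.is_set():
            # Monitor the axon and report if it has stopped.
            if not self.axon.is_running():
                raise RuntimeError("axon stopped running")
            self.should_exit.wait(0.01)

    def _run_guarded(self):
        # Catch everything so a failure is never silently lost in the thread.
        try:
            self.run()
        except BaseException as exc:
            self.errors.put(exc)

    def start(self, timeout=1.0):
        """Start the worker; raise in the caller if the axon fails to start."""
        self.thread.start()
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                exc = self.errors.get(timeout=0.01)
            except queue.Empty:
                if self.started.is_set():
                    return
                continue
            raise RuntimeError("miner failed to start") from exc
        raise TimeoutError("axon did not start in time")

    def stop(self):
        self.should_exit.set()
        self.thread.join()
        self.axon.stop()
```

In a real miner, the main entry point could catch the exception raised by `start()` and exit with a non-zero status, and poll `errors` periodically afterwards to log later axon failures without terminating.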

Whether to keep the miner running if axon issues arise later is another question. The code as-is indicates that this is intended behavior ("# In case of unforeseen errors, the miner will log the error and continue operations."), so this is not changed.

This patch depends on another patch to bittensor that adds .is_running() and .exception to class axon.

mvds00, Aug 13 '24 16:08