HotShot
HotShot copied to clipboard
General task architecture refactor
Closes #3208
The general goal of this PR is to make it possible to halt tasks, retrieve their state and restart them.
I think this would be nice for a number of reasons, but in particular it would allow us to very easily upgrade NodeType
at runtime.
This PR:
- Simplifies the
TaskState
trait: implementations must now only specify an associatedEvent
type, thehandle_event
function and acancel_subtasks
function to cleanup any spawned subtasks. - Tasks that receive their events from the network have had their
TaskState
implementations removed, and will be subsumed under a newNetworkTaskState
in a future PR. - Each task returns its state as a boxed
TaskState
object via aJoinHandle
. TheseJoinHandle
s are collected in aConsensusTaskRegistry
. - Network tasks return a
JoinHandle<()>
are collected in aNetworkTaskRegistry
, and will be dealt with in a future PR. - Test tasks no longer implement
TaskState
, because they fundamentally do not listen on a single receiver in the same way that consensus tasks do. -
TestTaskState
requires acheck()
function that returns the pass/fail result of the test.
Minor:
- The
RwLock
andArc
indirection in task registries is removed, and these now simply contain aVec
ofJoinHandle
s. As a consequence,SystemContextHandle
is not clonable; this is a mildly dangerous operation anyway, since a HotShot handle is a unique object and cloning it does not create a new one, and makes ownership harder to reason about, especially around mutable operations likeshut_down
. An application that wishes to hold multiple handles should put the handle it receives into aArc<RwLock<...>>
and synchronize calls to the handle itself. -
shut_down
is no longer aBoxSynFuture
, because shutting down the consensus registry is notSync
(it must await on theJoinHandle
s it holds) - Various tasks no longer receive a complete cloned
SystemContextHandle
; they only needed 1-2 fields, so we simply pass these directly to avoid additional the additionalConsensusApi
constraint.
This PR does not:
There should not be any external, observable changes in runtime behavior.
Key places to review:
This PR is quite large because of the scope of the changes, but the vast majority of it is rewiring. I think the most significant intentional changes are in:
-
task/src/task.rs
: the main trait changes. I believe it is easier to simply read the final file, rather than the diff. -
testing/src/test_task.rs
: the test task changes. Note thatTestTask
and theTestTaskState
trait were previously located intask/src/task.rs
, but these were essentially rewritten from scratch. But it's definitely worth looking at other files if you can. If there's anything you suspect might be an unintentional or incorrect logic change, please flag it.