[CX_HARDENING] - Evaluate Usages of `Consensus` Global Shared State

Open jparr721 opened this issue 1 year ago • 0 comments

What is this task and why do we need to work on it?

The Consensus shared state object is used in a number of locations to keep track of details relevant to the proposal and voting process within HotShot. This shared state is overburdened, however: we have hit scenarios where we must be judicious about how long we hold locks, and where we must be cautious about the order of operations because of delays in updating this state.
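To make the lock-duration concern concrete, here is a minimal sketch of the usual mitigation: copy the needed value out while holding the read lock, then release the guard before doing anything slow. `SharedConsensus`, its fields, and `payload_len_for_current_view` are hypothetical stand-ins, not HotShot's actual types, and `std::sync::RwLock` is assumed here in place of whatever lock the real code uses.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the shared consensus state (not HotShot's real type).
#[derive(Default)]
struct SharedConsensus {
    cur_view: u64,
    saved_payloads: BTreeMap<u64, Vec<u8>>,
}

/// Clone the payload for the current view under the read lock, then drop the
/// guard so that later (possibly slow) work does not block writers.
fn payload_len_for_current_view(shared: &Arc<RwLock<SharedConsensus>>) -> Option<usize> {
    let payload = {
        let guard = shared.read().expect("lock poisoned");
        guard.saved_payloads.get(&guard.cur_view).cloned()
    }; // the read guard is dropped at the end of this block

    // Slow work (hashing, network I/O, ...) would go here, with no lock held.
    payload.map(|p| p.len())
}

fn main() {
    let shared = Arc::new(RwLock::new(SharedConsensus::default()));
    {
        // Short write section: update the view and store a payload for it.
        let mut guard = shared.write().expect("lock poisoned");
        guard.cur_view = 3;
        guard.saved_payloads.insert(3, vec![1, 2, 3, 4]);
    }
    assert_eq!(payload_len_for_current_view(&shared), Some(4));
}
```

This keeps lock hold times short, but every reader and writer still contends on the same lock, which is part of the motivation for moving maps into the tasks that use them.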

It is most sensible to evaluate the usages of the storage methods within the consensus global shared state. If feasible, this task should also include removing the consensus shared state entirely and relying on the consuming tasks to maintain their own maps. This should be relatively straightforward.

What work will need to be done to complete this task?

We will evaluate based on the following criteria:

  • [ ] Evaluate all locations that the shared state is updated.
  • [ ] Make events for each of the shared state updates that do not already exist.
    • [ ] Evaluate Saved DA Certs
    • [ ] Evaluate Validated State Map
      • This one will likely remain
    • [ ] Evaluate Vid Shares
      • This one might be able to move to just the DA task
    • [ ] Evaluate cur view
      • This one is interesting. Cur view is only used in 3 locations:
        • task-impls/src/da.rs
        • types/handle.rs
        • hotshot/src/lib.rs
      • This is interesting because all of the other tasks have a cur_view field that they use to keep track of the view independently. We should evaluate whether cur_view should be completely centralized or completely decentralized.
    • [ ] Check last_decided_view, saved_leaves, and locked_view, but these will likely remain.
    • [ ] Evaluate saved_payloads
      • This might be able to be handled just in the DA task
    • [ ] high_qc and metrics will remain
  • [ ] Create per-task states for only the required events.
  • [ ] If separating fields changes garbage collection boundaries, create a new `CollectGarbage(TYPES::TIME)` event that triggers a gc for the other tasks.
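
As a rough sketch of the last two checklist items, a task could own its map and react to events, including a garbage collection event. Only the `CollectGarbage` name comes from this issue; `View` (a plain integer standing in for `TYPES::TIME`), `Event::PayloadSaved`, and `DaTaskState` are hypothetical names used for illustration, and the cutoff semantics (keep views at or above the cutoff) are an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical view number; stands in for `TYPES::TIME`.
type View = u64;

/// Hypothetical event set. Only `CollectGarbage` is named in the issue.
#[derive(Clone, Debug)]
enum Event {
    PayloadSaved(View, Vec<u8>),
    CollectGarbage(View),
}

/// A task that owns its map instead of reading it from shared state.
#[derive(Default)]
struct DaTaskState {
    saved_payloads: BTreeMap<View, Vec<u8>>,
}

impl DaTaskState {
    fn handle(&mut self, event: &Event) {
        match event {
            Event::PayloadSaved(view, payload) => {
                self.saved_payloads.insert(*view, payload.clone());
            }
            Event::CollectGarbage(cutoff) => {
                // Assumed semantics: drop every entry older than `cutoff`.
                // `split_off` returns the entries with key >= cutoff.
                self.saved_payloads = self.saved_payloads.split_off(cutoff);
            }
        }
    }
}

fn main() {
    let mut task = DaTaskState::default();
    for view in 1..=3 {
        task.handle(&Event::PayloadSaved(view, vec![view as u8]));
    }
    task.handle(&Event::CollectGarbage(2));

    let remaining: Vec<View> = task.saved_payloads.keys().copied().collect();
    assert_eq!(remaining, vec![2, 3]);
}
```

Because each task applies the same cutoff on its own map, no lock is needed for the gc itself; the cost is that every task holding per-view data has to handle the event.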

Are there any other details to include?

One potential hiccup within this process is data accessibility. While accessing and writing to the state should be faster due to the lack of lock contention for write access, there is still a chance that not all tasks have updated by the time a particular event occurs. These instances should be rare, as the order of events should be consistent.
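
The staleness scenario can be illustrated with a small single-threaded simulation. This is only a sketch under assumptions: `Task`, `Event::ViewChange`, and `broadcast` are hypothetical, and per-task inboxes are modeled as plain queues rather than HotShot's actual event channels.

```rust
use std::collections::VecDeque;

/// Hypothetical event; the real event types differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Event {
    ViewChange(u64),
}

/// A task with its own view counter and its own ordered inbox.
struct Task {
    cur_view: u64,
    inbox: VecDeque<Event>,
}

impl Task {
    fn new() -> Self {
        Task { cur_view: 0, inbox: VecDeque::new() }
    }

    /// Process at most one pending event; returns false if the inbox was empty.
    fn step(&mut self) -> bool {
        match self.inbox.pop_front() {
            Some(Event::ViewChange(view)) => {
                self.cur_view = self.cur_view.max(view);
                true
            }
            None => false,
        }
    }
}

/// Deliver the same event to every task, preserving send order per task.
fn broadcast(tasks: &mut [Task], event: Event) {
    for task in tasks.iter_mut() {
        task.inbox.push_back(event);
    }
}

fn main() {
    let mut tasks = vec![Task::new(), Task::new()];
    broadcast(&mut tasks, Event::ViewChange(1));
    broadcast(&mut tasks, Event::ViewChange(2));

    // Task 0 has caught up; task 1 has only processed the first event.
    tasks[0].step();
    tasks[0].step();
    tasks[1].step();
    assert_eq!((tasks[0].cur_view, tasks[1].cur_view), (2, 1));

    // Once task 1 drains its inbox, both tasks agree again.
    while tasks[1].step() {}
    assert_eq!(tasks[0].cur_view, tasks[1].cur_view);
}
```

The tasks disagree only for as long as one inbox is behind, and because each inbox preserves order, a lagging task never sees events out of sequence.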

What are the acceptance criteria to close this issue?

Consensus shared state is limited in scope to the extent it can be.

Branch work will be merged to (if not the default branch)

No response

jparr721, May 23 '24 16:05