node: Support the acceptance of an emergency block
Summary
In case not even the emergency mode allows the network to produce a block, the very last iteration (255) is reserved for a Dusk-signed block that is accepted without any quorum. This block will be added to the local chain as ACCEPTED, meaning that it can be replaced by a lower-iteration block, if it exists and reached a quorum. This mechanism ensures that a block is produced even if the network collapses or gets split. from #1171
Possible solution design or implementation
Implementation-wise, it should be doable to support a master key (Dusk Block Generator) in the Genesis state. Once a candidate block is signed by this specific Dusk Block Generator then the block acceptance procedure will accept a block without checking for 2/3 quorum. This approach implies some sort of centralization.
Another approach could be:
Add a master key (Dusk Block Generator) in the Genesis state. This key is allowed to propagate a vote only for a candidate of iteration 255. On collecting this particular vote, a committee member counts it as MAX_WEIGHT=43, which is enough to reach full quorum for iteration 255. This is a less centralized solution.
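A minimal sketch of how the second approach could weigh votes, assuming a hypothetical `vote_weight` helper, a placeholder `PublicKey` type, and a quorum threshold equal to the `MAX_WEIGHT=43` mentioned above. None of these names come from the rusk codebase.

```rust
/// Weight assigned to the master key's vote (value taken from the proposal).
/// Assumption: the quorum threshold is the same number.
const MAX_WEIGHT: usize = 43;
/// Iteration reserved for the emergency block in this proposal.
const EMERGENCY_ITERATION: u8 = 255;

/// Placeholder for a provisioner public key (hypothetical; real keys differ).
type PublicKey = [u8; 4];

/// Returns the weight a committee member counts for a vote.
///
/// `credits` is the weight the voter normally has in the committee
/// (0 if the voter is not a member).
fn vote_weight(voter: &PublicKey, master: &PublicKey, iteration: u8, credits: usize) -> usize {
    if voter == master {
        // The master key may only vote for a candidate of the emergency iteration.
        if iteration == EMERGENCY_ITERATION {
            MAX_WEIGHT
        } else {
            0
        }
    } else {
        credits
    }
}

/// True once the collected weight reaches the (assumed) quorum threshold.
fn quorum_reached(total_weight: usize) -> bool {
    total_weight >= MAX_WEIGHT
}
```

With this rule, a single vote from the master key at iteration 255 is enough for `quorum_reached` to return `true`, while the same key has no weight at any other iteration.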
Additional context
The feature was initially requested in #1171, but since it does not touch the dusk_consensus crate, it could be addressed separately.
EDIT
We agreed on the solution described in this comment.
FYI @fed-franz @herr-seppia What is your opinion on the second approach?
What if that generator is missing?
Another solution could be:
- Propose an empty[^1] block from DUSK with Iteration=255
- Collect votes from 50%+1 of all stakers
[^1]: In the future, we could even slash generators who didn't produce a block in past iterations, so as to mix up the committee
By stakers, do you mean committee members, or can any eligible provisioner vote for this emergency block? If so, then 50% of what? Maybe we can incorporate a minimum number of votes as a requirement in that case.
Both approaches seem similar to me: in the first approach, a Dusk-owned node creates a valid-without-a-quorum block, while in the second one, the Dusk-owned node casts a full-quorum vote. The drawback of the second approach is that we need the selected generator to be online, which at iteration 255 is too risky to assume.
The whole point of the emergency block is that, if we reached iteration 255, most provisioners are offline, or we are eclipsed, or the network is partitioned.
I think a better solution in terms of centralization could be to allow all provisioners to produce the Emergency Block, which would be empty and not signed. When a provisioner reaches iteration 255 without a winning block, it creates the Emergency Block. Being empty and not signed, all (online) provisioners will produce the same block, so no conflict would occur between Emergency Blocks created by different provisioners. When the network recovers, if a block was produced and reached a quorum, it will replace the emergency block; if instead the majority of provisioners were offline/eclipsed, and no quorum was reached, the emergency block will be kept in the chain.
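To illustrate why an empty, unsigned Emergency Block would be identical across provisioners, here is a rough sketch. The `Block` struct and `emergency_block` function are hypothetical simplifications, not the actual rusk types. The sketch also leaves out fields such as the timestamp and seed, which a real block would need to derive deterministically for the argument to hold.

```rust
/// Hypothetical, heavily simplified block (not the rusk `Block` type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Block {
    height: u64,
    iteration: u8,
    prev_block_hash: [u8; 32],
    txs: Vec<Vec<u8>>,
    /// `None` for an Emergency Block, since it is not signed.
    generator_signature: Option<Vec<u8>>,
}

/// Iteration reserved for the Emergency Block in this proposal.
const EMERGENCY_ITERATION: u8 = 255;

/// Builds the Emergency Block on top of the current tip.
///
/// The result depends only on the tip, so every provisioner that shares
/// the same tip builds exactly the same block.
fn emergency_block(tip_height: u64, tip_hash: [u8; 32]) -> Block {
    Block {
        height: tip_height + 1,
        iteration: EMERGENCY_ITERATION,
        prev_block_hash: tip_hash,
        txs: Vec::new(),
        generator_signature: None,
    }
}
```

Two provisioners calling `emergency_block` with the same tip obtain equal blocks, which is what prevents conflicts between independently created Emergency Blocks.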
Does this make sense to you?
If we allow any provisioner to produce an Emergency Block, wouldn't this create a new type of attack vector? An adversary that has staked the minimum amount may produce an emergency block at any round, impacting the network negatively. Additionally, an emergency block can be used in a DDoS attack, as it will always be considered valid for re-broadcast.
@fed-franz
> If we allow any provisioner to produce an Emergency Block, wouldn't this create a new type of attack vector? An adversary that has staked the minimum amount may produce an emergency block at any round, impacting the network negatively.
The Emergency Block should only be accepted by a provisioner when it reaches iteration 255. Nodes that are not running the consensus protocol can accept the block and wait for the next one. Since Emergency Blocks are empty, they cannot make the node believe anything (like a transaction or event) that didn't happen. Moreover, Emergency Blocks are always Accepted, so they would be replaced if a legit block is received for the same height.
> Additionally, an emergency block can be used in a DDoS attack, as it will always be considered valid for re-broadcast.
Only one Emergency Block can exist per round, and the block must still be a valid successor of the current tip, so a single block is not a problem. However, the attacker could produce a large number of consecutive Emergency Blocks, which could be problematic. This case could be easily mitigated by limiting the time between emergency blocks.
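The mitigation mentioned above (limiting the time between Emergency Blocks) could look roughly like this. The `MIN_EMERGENCY_INTERVAL` value and the `emergency_block_allowed` function are illustrative assumptions; the thread does not specify an interval.

```rust
use std::time::Duration;

/// Hypothetical minimum spacing between two Emergency Blocks.
/// The value is arbitrary and only serves the example.
const MIN_EMERGENCY_INTERVAL: Duration = Duration::from_secs(600);

/// Decides whether a new Emergency Block may be accepted.
///
/// `last_emergency_ts` is the timestamp (in seconds) of the most recent
/// Emergency Block in the local chain, if any; `now_ts` is the current time.
fn emergency_block_allowed(last_emergency_ts: Option<u64>, now_ts: u64) -> bool {
    match last_emergency_ts {
        // No Emergency Block seen yet: nothing to rate-limit.
        None => true,
        // Otherwise require the minimum interval to have elapsed.
        // `saturating_sub` guards against clocks that went backwards.
        Some(last) => now_ts.saturating_sub(last) >= MIN_EMERGENCY_INTERVAL.as_secs(),
    }
}
```

A node applying this check would reject a burst of consecutive Emergency Blocks while still accepting an isolated one.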
Anyway, if we want to be on the safe side, I would go for Dusk-owned nodes producing Dusk-signed emergency blocks. It's a somewhat centralized solution, but it ensures no attack vector is introduced.
BTW, if the emergency block is signed by Dusk, we need some way to put the signature somewhere in the block. At the moment, since we don't include the Generator's signature in the block, nor in the Block message, there is no way to verify that the emergency block came from a Dusk node...
I can think of three solutions:
- we include the generator signature somewhere in the block (e.g. in the Block structure, or in the Certificate)
- we include the generator signature in the Block message (like we have in the Candidate message)
- we use the Certificate and put only the Dusk signature (in the Validation or Ratification StepVotes)
I guess we could also use the Seed field, since it's signed by the block generator.
After discussion, we agreed on the following solution:
- The last iteration (50) is reserved for an Emergency Block created by a Dusk-owned node
- When reaching this iteration, each provisioner broadcasts a specific (signed) message to indicate that it reached the last iteration. This message is essentially a request to Dusk for the Emergency Block (we can call this the Emergency Block Request (EBR) message)
- Dusk nodes collect EBR messages until the cumulative stake of the senders surpasses 50% of the total stake. When this occurs, a Dusk node produces a block that includes:
  - no transactions
  - attestations of failed iterations (which can be used to slash missed block producers)
  - the aggregated EBR signatures (as a proof that a majority of stakers requested the emergency block)
- No Quorum is required to accept this block
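The EBR collection step described in the list above can be sketched as follows. `EbrCollector`, its methods, and the integer `PublicKey` placeholder are hypothetical names for illustration. Signature verification and aggregation are omitted, and each sender is counted once, which is an assumption not spelled out in the thread.

```rust
use std::collections::{HashMap, HashSet};

/// Placeholder for a provisioner public key (hypothetical).
type PublicKey = u32;

/// Collects Emergency Block Request (EBR) messages for a single round.
struct EbrCollector {
    /// Stake of every known provisioner.
    stakes: HashMap<PublicKey, u64>,
    total_stake: u64,
    /// Senders already counted, so duplicates do not inflate the total.
    seen: HashSet<PublicKey>,
    collected_stake: u64,
}

impl EbrCollector {
    fn new(stakes: HashMap<PublicKey, u64>) -> Self {
        let total_stake: u64 = stakes.values().sum();
        Self {
            stakes,
            total_stake,
            seen: HashSet::new(),
            collected_stake: 0,
        }
    }

    /// Records an EBR from `sender` (signature checks omitted in this sketch).
    /// Returns `true` once the senders' cumulative stake surpasses 50% of
    /// the total stake, i.e. when the Emergency Block may be produced.
    fn add_request(&mut self, sender: PublicKey) -> bool {
        if let Some(&stake) = self.stakes.get(&sender) {
            // `insert` returns false for a sender we already counted.
            if self.seen.insert(sender) {
                self.collected_stake += stake;
            }
        }
        self.majority_reached()
    }

    fn majority_reached(&self) -> bool {
        // Strictly more than half; u128 avoids overflow when doubling.
        (self.collected_stake as u128) * 2 > self.total_stake as u128
    }
}
```

Requests from unknown senders and repeated requests from the same sender leave the collected stake unchanged, so only distinct stakers can push the total past the threshold.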
NOTES
- The Emergency Block always has an Accepted label, which means it can be replaced by a lower-iteration block that reached a Valid Quorum
- One of the reasons for having a Dusk-signed block instead of an unsigned empty block is the ability to include NoCandidate and Invalid attestations which are necessary to slash provisioners missing their block. (Hard-)Slashing these provisioners might be essential to prevent the same situation (e.g. many top-stakers are offline) from occurring again.
- Waiting for EBR messages not only conditions the emergency block on an explicit "delegation" process by a majority of provisioners (effectively eliminating the centralization issue that a Dusk-signed block has), but also allows provisioners to naturally synchronize on the last iteration: since the EBR message is sent when reaching the last iteration, when the Emergency Block is produced, a majority of provisioners will be (stuck) at the same round/iteration, which in turn will allow all such provisioners to accept the Emergency Block at the same time (and start the next round at the same time).
- One of the motivations for requiring an EBR from a majority of provisioners is that we only "move forward" if a large portion of provisioners are indeed online. Conversely, if most provisioners are offline for some reason, it makes more sense to keep waiting until they are back online.
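The replacement rule from the first note can be expressed as a small predicate. This is only a sketch: `BlockInfo`, the two-variant `Label` enum, and `should_replace` are hypothetical simplifications rather than the node's actual types.

```rust
/// Simplified block labels (assumption: only the distinction between
/// replaceable and non-replaceable blocks matters here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Label {
    Accepted,
    Final,
}

/// Hypothetical summary of a block as seen by the acceptance logic.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockInfo {
    height: u64,
    iteration: u8,
    label: Label,
    /// Whether the block carries a valid quorum.
    has_quorum: bool,
}

/// Returns `true` if `incoming` should replace `current` in the local chain:
/// same height, `current` still only Accepted, and `incoming` is a
/// lower-iteration block that reached a quorum.
fn should_replace(current: &BlockInfo, incoming: &BlockInfo) -> bool {
    current.label == Label::Accepted
        && incoming.height == current.height
        && incoming.has_quorum
        && incoming.iteration < current.iteration
}
```

Under this rule an Emergency Block at the last iteration gives way to any earlier-iteration block with a quorum at the same height, but never the other way around.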
To take into account: https://github.com/dusk-network/rusk/pull/1873#discussion_r1658842420