PBFT: Leader eviction is too harsh
Introduction
A node will choose a new leader once the following happens:
- A timeout is reached after no new blocks were committed, and
- Right after the timeout is reached, the pool contains one or more transaction(s)
In pseudo-code:
for {
select {
case <- time.After(timeout):
if pool.Len() != 0 {
// trigger a view change
}
case <- s.newBlocks:
// ok, continue
}
}
Choosing a new leader is a heavy operation that should be triggered only when the situation truly requires it.
The current process can be a problem for the following reason:
- In the unfortunate situation where a transaction is added to the pool right after the timeout: even if the leader is honest, it might just process the transaction too late.
- This check only ensures that some transactions are included periodically, but doesn't prevent the leader from censoring specific transactions.
I won't try to solve 2) for the moment. Let's focus on 1), which can be stated as follow: "a view change should be triggered only once a node finds out that the the pool of transaction, if not empty, hasn't be updated after some timeout".
Solution 1 - Add a second timeout
Once the timeout is reached, wait on another timeout and check if some transaction were processed in the meantime:
for {
select {
case <- time.After(timeout):
n := pool.Len()
if n != 0 {
// gives some time to the leader to process txs
time.Sleep(timeout2)
if pool.Len() < n {
// ok
return nil
}
// view change
}
case <- s.newBlocks:
// ok, continue
}
}
Pros:
- simple to implement
Cons:
- there can still be a false negative, in case the leader processed one transaction, but another one got included
Solution 2 - Update the pool implementation
Adds a functionality to the pool that can notify for "rotten transaction".
for {
select {
case <- pool.RottenTx
// view change
}
case <- s.newBlocks:
// ok, continue
}
}
Pros:
- solves 1) and 2)
- no false positives
- simplifies the processor
Cons:
- requires a significant update on the pool implementation
Meeting notes , 12/12/2022
Participants:
@jbsv , @nkcr , @pierluca
File involved:
https://github.com/dedis/dela/blob/master/core/ordering/cosipbft/mod.go#L485 https://github.com/dedis/dela/blob/master/core/ordering/cosipbft/mod_test.go#L97
Possible solution
- It might make sense to keep the view change criteria outside of the pool implementation and as part of the ordering implementation
- Pool implementation could return statistics
- Ordering implementation acts on those statistics, e.g. applies some condition based on oldest transaction timestamp / number of transactions / etc.
Other considerations: Pool interface
- It might make sense to apply back-pressure as part of the Pool implementation, so that not too many transactions can be submitted at once
- Error types / codes could be specified so that a client can distinguish between transient and permanent failures