dela icon indicating copy to clipboard operation
dela copied to clipboard

PBFT: Leader eviction is too harsh

Open nkcr opened this issue 3 years ago • 1 comments

Introduction

A node will choose a new leader once the following happens:

  1. A timeout is reached after no new blocks were committed, and
  2. Right after the timeout is reached, the pool contains one or more transaction(s)

In pseudo-code:

for {
	select {
		case <- time.After(timeout):
		if pool.Len() != 0 {
		  // trigger a view change
		}
		case <- s.newBlocks:
		// ok, continue
	}
}

Choosing a new leader is a heavy operation that should be triggered only when the situation truly requires it.

The current process can be a problem for the following reason:

  1. In the unfortunate situation where a transaction is added to the pool right after the timeout: even if the leader is honest, it might just process the transaction too late.
  2. This check only ensures that some transactions are included periodically, but doesn't prevent the leader from censoring specific transactions.

I won't try to solve 2) for the moment. Let's focus on 1), which can be stated as follow: "a view change should be triggered only once a node finds out that the the pool of transaction, if not empty, hasn't be updated after some timeout".

Solution 1 - Add a second timeout

Once the timeout is reached, wait on another timeout and check if some transaction were processed in the meantime:

for {
	select {
		case <- time.After(timeout):
		n := pool.Len()
		if n != 0 {
			// gives some time to the leader to process txs
			time.Sleep(timeout2)

			if pool.Len() < n {
				// ok
				return nil
			}

			// view change
		}
		case <- s.newBlocks:
		// ok, continue
	}
}

Pros:

  • simple to implement

Cons:

  • there can still be a false negative, in case the leader processed one transaction, but another one got included

Solution 2 - Update the pool implementation

Adds a functionality to the pool that can notify for "rotten transaction".

for {
	select {
		case <- pool.RottenTx
			// view change
		}
		case <- s.newBlocks:
		// ok, continue
	}
}

Pros:

  • solves 1) and 2)
  • no false positives
  • simplifies the processor

Cons:

  • requires a significant update on the pool implementation

nkcr avatar Jun 30 '22 14:06 nkcr

Meeting notes , 12/12/2022

Participants:

@jbsv , @nkcr , @pierluca

File involved:

https://github.com/dedis/dela/blob/master/core/ordering/cosipbft/mod.go#L485 https://github.com/dedis/dela/blob/master/core/ordering/cosipbft/mod_test.go#L97

Possible solution

  • It might make sense to keep the view change criteria outside of the pool implementation and as part of the ordering implementation
  • Pool implementation could return statistics
  • Ordering implementation acts on those statistics, e.g. applies some condition based on oldest transaction timestamp / number of transactions / etc.

Other considerations: Pool interface

  • It might make sense to apply back-pressure as part of the Pool implementation, so that not too many transactions can be submitted at once
  • Error types / codes could be specified so that a client can distinguish between transient and permanent failures

pierluca avatar Dec 12 '22 10:12 pierluca