
Examine `Broadcast` for optimization potential.

mbr opened this issue 5 years ago • 31 comments

While debugging an issue with failing tests, I encountered some inefficiency in the reliable broadcast algorithm, at least in the edge case of a two-node network. Consider the following partial log:

B: [0] -> [1]: HoneyBadger(0, Message { epoch: 0, content: Subset(Broadcast(0, Value(Proof { #1, root_hash: c002..beaa, value: b808..6d14, .. }))) })
D: [0] -> [1]: HoneyBadger(0, Message { epoch: 0, content: Subset(Broadcast(0, Echo(Proof { #0, root_hash: c002..beaa, value: 0000..3a39, .. }))) })
E: [1] -> [0]: HoneyBadger(0, Message { epoch: 0, content: Subset(Broadcast(0, Echo(Proof { #1, root_hash: c002..beaa, value: b808..6d14, .. }))) })
G: [1] -> [0]: HoneyBadger(0, Message { epoch: 0, content: Subset(Broadcast(0, Ready(c002..beaa))) })
J: [0] -> [1]: HoneyBadger(0, Message { epoch: 0, content: Subset(Broadcast(0, Ready(c002..beaa))) })

This is almost in line with the algorithm description on page 7: the proposer sends VAL(h, b_i, s_i) to every node i in the network and expects an ECHO(h, b_i, s_i) back from each, and finally a READY(h). The full text-book message log would look like this:

A: [0->0] VAL(h, b_0, s_0)
B: [0->1] VAL(h, b_1, s_1)
C: [0->0] ECHO(h, b_0, s_0)
D: [0->1] ECHO(h, b_0, s_0)
E: [1->0] ECHO(h, b_1, s_1)
F: [1->1] ECHO(h, b_1, s_1)
G: [1->0] READY(h)
H: [1->1] READY(h)
I: [0->0] READY(h)
J: [0->1] READY(h)

This is twice as many messages, because A, C, F, H and I are messages that never need to leave the node and are already short-circuited inside the code (instead of being returned in a Step output, they are handled directly by calling the respective handle_ function).

Possible optimizations

There might be some more room for optimization here; consider message E. In this case, node 1 is echoing back what node 0 sent in its VAL message. This message contains no new information for node 0; it merely functions as an acknowledgement. The resilience gained here is minimal: a smart attacker that controls a faulty node can always reply with a correct ECHO message to avoid suspicion.

In general, it might be possible to omit the ECHO back to the sender of the VAL that triggered it, resulting in the following sequence:

B: [0->1] VAL(h, b_1, s_1)
D: [0->1] ECHO(h, b_0, s_0)
G: [1->0] READY(h)
J: [0->1] READY(h)

Some other messages back to the original node could potentially be dropped, but this is a bigger trade-off: we would be trading efficiency (fewer messages) for weaker detection of non-malicious errors. These messages can be used to check that well-meaning nodes are up, but they will not deter a corrupted node that is actively working to deceive the network.

Gains

The gains per removed message in a network of N nodes are modest for large networks: instead of sending N-1 + N(N-1) + N(N-1) messages (VAL, ECHO, READY), we would be sending N-1 + (N-1)(N-1) + N(N-1) messages; at 25 nodes this is a 2% reduction per broadcast, and smaller networks benefit much more (e.g. around 9% with 5 nodes). Broadcast messages account for about two thirds of all messages (correct me if I am wrong), and the ECHO message carries a substantial payload (as opposed to READY).
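To make the arithmetic concrete, here is a small standalone sketch of the counting above (an illustration only, not hbbft code; the function names are made up):

```rust
/// Total messages per broadcast in the current algorithm, excluding
/// messages a node sends to itself: VAL + ECHO + READY.
fn original(n: u64) -> u64 {
    (n - 1) + n * (n - 1) + n * (n - 1)
}

/// The same count when nobody echoes back to the proposer.
fn without_echo_to_proposer(n: u64) -> u64 {
    (n - 1) + (n - 1) * (n - 1) + n * (n - 1)
}

/// Percentage of messages saved by dropping the echo to the proposer.
fn saving_percent(n: u64) -> f64 {
    let (a, b) = (original(n) as f64, without_echo_to_proposer(n) as f64);
    100.0 * (a - b) / a
}

fn main() {
    println!("N = 25: {:.1}% saved", saving_percent(25)); // ~2.0%
    println!("N = 5:  {:.1}% saved", saving_percent(5)); // ~9.1%
}
```

The saving is exactly (N-1)/((N-1)(2N+1)) = 1/(2N+1) of all messages, which matches the 2% and 9% figures above.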

I'd like some feedback on these observations. I don't know if implementing this potential optimization is worthwhile right now, but it is something to keep in mind for the future. More important is verifying that no cryptographic properties are altered by this modification; if they are, we can keep this issue around as a warning ;).

mbr avatar Oct 29 '18 18:10 mbr

Good point! I think that all messages back to the proposer are redundant: We might as well have the proposer output right away. (I'm not convinced there's much to gain in terms of error detection.) In general, there is a lot of redundancy, some of which could potentially be optimized away in more optimistic scenarios. And maybe the above is also just a special case of that:

The problem is that Echo has two purposes:

  • Give the recipient their chunk, so they can decode the value.
  • Let the recipient know that we have that chunk.

The algorithm waits for N - f Echos, f of which only serve the second purpose: we only need N - 2 f to decode. Moreover, in the optimistic case, every node receives N chunks instead of just the N - 2 f it needs. (~66% overhead!)

We could introduce two additional kinds of messages, meaning:

  • EchoHash(h): "I have my chunk with root hash h and can provide it on request."
  • CanDecode(h): "I have received N - 2 f chunks with root hash h, so I'll be able to decode."

Every node could multicast CanDecode(h) as soon as it has received N - 2 f Echos. If we have received CanDecode(h) from a peer, we would always send them EchoHash instead, where the current algorithm sends the full Echo. And if we have received both CanDecode and Ready from someone, we know that they have already terminated, and we don't need to send anything to them anymore.
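As a rough sketch, the message set with the two extra kinds could look as follows (the names, payload types and the `Proof` stand-in here are illustrative assumptions, not hbbft's actual API):

```rust
type Digest = [u8; 32]; // Merkle tree root hash

/// Stand-in for hbbft's `Proof`: one chunk plus its Merkle branch.
#[allow(dead_code)]
struct Proof {
    root_hash: Digest,
    value: Vec<u8>,
}

/// Broadcast message kinds, extended with the two proposed variants.
#[allow(dead_code)]
enum Message {
    Value(Proof),      // proposer -> node i: "here is your chunk"
    Echo(Proof),       // full chunk, relayed between peers
    EchoHash(Digest),  // "I have my chunk for root hash h; ask if you need it"
    CanDecode(Digest), // "I have N - 2f chunks for h, so I'll be able to decode"
    Ready(Digest),     // "enough of the network has echoed h"
}

fn main() {
    // A peer that already sent us `CanDecode(h)` gets the cheap variant.
    let peer_can_decode = true;
    let msg = if peer_can_decode {
        Message::EchoHash([0u8; 32])
    } else {
        Message::Echo(Proof { root_hash: [0u8; 32], value: vec![] })
    };
    assert!(matches!(msg, Message::EchoHash(_)));
}
```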

Depending on how synchronous the nodes are, and how much their bandwidth differs, this could:

  • Speed things up a lot, because after the first N - 2 f nodes have sent their Echos, everyone would multicast CanDecode and no further full Echos would be sent.
  • Make no difference because everyone starts by sending all their Echos and only receives CanDecode afterwards.

But at least the proposer could send CanDecode and Ready right away (i.e. the Value message could be interpreted as those two, too), so that nobody would send messages back to the proposer.

Finally, we could optimize even more for the optimistic scenario. Given a "pessimism factor" g (configurable option—how many faulty nodes we want to optimize for), only the N - 2 f + g nodes "to my left" send me the full Echo, the others only send EchoHash right away.

  • Upon receiving N - f Echos plus EchoHashes in total, I multicast Ready.
  • Upon receiving N - 2 f full Echos, I multicast CanDecode.
  • Upon receiving 2 f + 1 Readys, I send full Echos to all those nodes from whom I haven't received CanDecode yet.

Edit: This would also require special handling of observers, I guess… But instead of keeping track of every observer from inside the algorithm, we could just introduce a special Target::Observers. Then the N - f nodes "to the left" of the proposer could, before terminating, recompute their own chunk (if necessary) and send an Echo and Ready to them. That would save at least ~33% in message complexity, since at least f nodes would not have to send their redundant chunks to them.

afck avatar Oct 30 '18 09:10 afck

@afck, this is a great optimisation. Those N - 2 f full Echos should be valid leaf values in order for decode_from_shards to compute a value. We won't have any Reed-Solomon redundancy but as you wrote, we may not need it?

vkomenda avatar Oct 30 '18 10:10 vkomenda

Exactly: In the optimistic case, we want to avoid the unnecessary redundancy. Of course we never want to compromise on security and asynchronicity! But I think my proposal wouldn't: It would be asynchronous and still work for up to f faulty nodes; but if there are more than g faulty nodes, it would require one additional message round (sending the missing full Echos).

afck avatar Oct 30 '18 14:10 afck

Do you mean if there are more than f rather than g faulty nodes?

vkomenda avatar Oct 30 '18 14:10 vkomenda

Yes, exactly: If there are more than g faulty nodes among the N - 2 f + g ones to your left, you will receive fewer than N - 2 f full Echos, so you will have to wait for another correct node to ~terminate~ (collect 2 f + 1 Readys and) send you their Echo.

afck avatar Oct 30 '18 15:10 afck

Note that even with g = f, this would use less bandwidth than the current approach, and you'd never have the added waiting period unless the sender is faulty, in which case we don't care about latency anyway.

afck avatar Oct 30 '18 15:10 afck

In the discussions with the team, I realized that there are several (partly independent) optimization ideas floating around. I'll try to separate them as much as possible. First of all, with the module documentation's notation, where the proof p[i] is the triple (h, b[i], s[i]) of the Merkle tree root h, i-th branch b[i] and i-th leaf/value s[i], the current algorithm works like this:

  • The proposer sends Value(p[i]) to validator i.
  • Upon receiving Value(p[i]) from the proposer, multicast Echo(p[i]).
  • When we have f + 1 Readys or N - f Echos with root hash h, multicast Ready(h).
  • When we have 2 f + 1 Readys and N - 2 f Echos with root hash h, decode and output.

Note that our implementation already optimizes away all messages that nodes conceptually send to themselves. I'll assume implicitly that we always do that, to keep things simple. So any message sent to ourselves is zero-cost.

Also, every correct node sends at most one message of every kind to every other node. Conversely, all but the first message of every kind we receive from a peer are ignored.

Optimization 1: We don't need that many Echos.

To decode, we only need N - 2 f Echos. To know that everyone will be able to decode, we thus need N - f peers to tell us that they'll provide their Echo to everyone who needs it. Let g be a "pessimism factor". Imagine the nodes sitting in a circle, ordered by their IDs. Start counting at yourself at 1 and count the nodes to your left until you reach N - 2 f + g: Those are the nodes that I'll call "to your left". This optimization introduces a CanDecode(h) message ("I have enough information to decode the value.") and an EchoHash(h) ("I can send you Echo(h, _, _) if needed.") message:

  • The proposer sends Value(p[i]) to validator i.
  • Upon receiving Value(p[i]) from the proposer, send Echo(p[i]) to all nodes to your left, and EchoHash(h) to everyone else.
  • When we have f + 1 Readys, or N - f Echos plus EchoHashes in total with root hash h, multicast Ready(h).
  • When we have N - 2 f Echos with root hash h, send CanDecode(h) to everyone who hasn't sent us their full Echo yet.
  • When we have 2 f + 1 Readys, send our full Echo to every peer who's not to our left and hasn't sent us CanDecode(h).
  • When we have 2 f + 1 Readys and N - 2 f Echos with root hash h, decode and output.
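The ring-counting definition of "to your left" above can be written as a tiny helper (a sketch; the index arithmetic and the choice of direction for "left" are conventions assumed here, not taken from hbbft):

```rust
/// The `n - 2*f + g` nodes "to the left" of node `i`: start counting at
/// `i` itself and walk around the ring of `n` node IDs (descending IDs
/// here; which direction counts as "left" is just a convention).
fn left_nodes(i: usize, n: usize, f: usize, g: usize) -> Vec<usize> {
    let count = n - 2 * f + g; // assumes g <= 2*f, so count <= n
    (0..count).map(|k| (i + n - k) % n).collect()
}

fn main() {
    // N = 7, f = 2, g = 0: each node has N - 2f = 3 nodes "to its left",
    // itself included, so it sends full `Echo`s to only two peers.
    assert_eq!(left_nodes(0, 7, 2, 0), vec![0, 6, 5]);
    assert_eq!(left_nodes(3, 7, 2, 1), vec![3, 2, 1, 0]);
}
```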

With no faulty nodes and g = 0, this should use almost 66% less bandwidth and output after no more message rounds than the original algorithm. With g = f and f faulty nodes, it should use almost 33% less bandwidth and also use no more message rounds. With g = 2 f, it is essentially the original algorithm, plus the CanDecode(h) messages.
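The 66%/33% figures can be double-checked by counting full Echo payloads per node (a sketch assuming N = 3f + 1 and ignoring the small hash-only messages; all names are made up):

```rust
/// Full `Echo` chunks each node sends in the original algorithm:
/// a multicast to everyone but itself.
fn full_echoes_original(n: usize) -> usize {
    n - 1
}

/// With optimization 1 in the optimistic case: full `Echo`s go only to
/// the `n - 2f + g` nodes "to the left", minus the node itself.
fn full_echoes_optimized(n: usize, f: usize, g: usize) -> usize {
    n - 2 * f + g - 1
}

/// Percentage of full-`Echo` bandwidth saved.
fn saved_percent(n: usize, f: usize, g: usize) -> f64 {
    let a = full_echoes_original(n) as f64;
    let b = full_echoes_optimized(n, f, g) as f64;
    100.0 * (a - b) / a
}

fn main() {
    let (n, f) = (100, 33); // n = 3f + 1
    println!("g = 0: {:.0}% saved", saved_percent(n, f, 0)); // ~67%
    println!("g = f: {:.0}% saved", saved_percent(n, f, f)); // ~33%
}
```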

Correctness Proof

We have to prove three things: :one: If a correct node outputs a value, no correct node outputs a different value. :two: If a correct node outputs a value, every other correct node does so, too. :three: If the proposer is correct, every correct node outputs the value they proposed.

Let's use "output h" as an abbreviation for "output the value with root hash h and terminate". Let's call the number of nodes from which we have received either Echo(h, ..) or EchoHash(h) the "h-echo count".

First, assume that some correct node sends Ready(h). Then there was a first correct node who sent Ready(h). At that point, it can have received at most f Ready(h) itself (from the faulty nodes), so it must have sent it due to the other criterion: its h-echo count was ≥ N - f. Therefore:

:star: If a correct node sends Ready(h), then at least one correct node must have had an h-echo count ≥ N - f.

If a correct node has an h0-echo count ≥ N - f and a correct node (possibly the same) has an h1-echo count ≥ N - f, then the intersection of the senders of those Echos and EchoHashes must have size ≥ f + 1. So at least one correct node sent an Echo/EchoHash with h0 and h1. Therefore, h0 == h1. It follows from :star: that:

:spades: All Ready messages sent by correct nodes carry the same hash.

Conversely, assume f + 1 correct nodes send Ready(h). Then every correct node will send Ready(h) itself. Therefore, every node will receive at least N - f ≥ 2 f + 1 of them:

:heart: If f + 1 correct nodes send Ready(h), every node will receive 2 f + 1 Ready(h).

Also, if f + 1 correct nodes send Ready(h), :star: implies that at least N - 2 f correct nodes sent an Echo/EchoHash with h. Due to :heart:, those nodes will eventually receive 2 f + 1 Ready(h) messages, so they will send the full Echo(h, ..) to everyone who hasn't sent them a CanDecode(h). Since correct nodes only send CanDecode(h) upon receiving N - 2 f Echo(h, ..), that means every correct node will receive N - 2 f Echo(h, ..). Together with :heart:, this implies:

:large_blue_diamond: If f + 1 correct nodes send Ready(h), every correct node will output h.

If a correct node outputs h, it has received 2 f + 1 Ready(h), so at least f + 1 correct nodes have sent Ready(h). With :spades:, this implies :one:. With :large_blue_diamond: it implies :two:.

Finally, if the proposer is correct and proposes a value with root hash h, all nodes receive Value(h, ..), so at least N - f nodes send Echo(h, ..) or EchoHash(h), so everyone will have an h-echo count of at least N - f ≥ 2 f + 1. So every correct node will send Ready(h). Due to :large_blue_diamond:, that implies that every correct node will output h, which is point :three:.
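The intersection argument behind :spades: can be sanity-checked numerically (a standalone illustration of the counting step, not hbbft code):

```rust
/// Minimum overlap of two quorums of size `q` drawn from `n` nodes, by
/// inclusion-exclusion: |A ∩ B| >= |A| + |B| - n.
fn min_overlap(n: usize, q: usize) -> usize {
    (2 * q).saturating_sub(n)
}

fn main() {
    // For every n = 3f + 1, two echo-quorums of size n - f overlap in at
    // least f + 1 nodes, so at least one *correct* node is in both --
    // the step that forces h0 == h1.
    for f in 0..100 {
        let n = 3 * f + 1;
        assert!(min_overlap(n, n - f) >= f + 1);
    }
}
```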

Optimization 2: The proposer doesn't need your Echos!

The proposer already has the value! Nobody should send them any Echos. With optimization 1, however, they only sent their own Echo to some of the peers, so they need to wait a bit (the natural choice is to also wait for 2 f + 1 Readys, like everyone else) and then send full Echos to the peers who can't decode yet. So:

  • The proposer in addition sends CanDecode and Ready before sending Value. (Or rather: Value counts as CanDecode+Ready+Value.)
  • The proposer pretends to have received the Echo for every Value it sent (i.e. handles the Echo as if it had received it).

Optimization 3: Terminated nodes don't need anything.

To prevent sending unnecessary messages to nodes that have already output and terminated, we could send a Term message before terminating. Peers would then stop sending any messages to them at all. (Not sure whether that's worth the added complexity: A terminated peer has already sent their CanDecode, so all messages that will still be sent to them are tiny.)

afck avatar Nov 06 '18 10:11 afck

Looks good to me, so far. I'll scrap the V1 write-up in favor of this; I don't think we need to write it up any more formally?

mbr avatar Nov 13 '18 14:11 mbr

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 250.0 DAI (250.0 USD @ $1.0/DAI) attached to it as part of the poanetwork fund.

gitcoinbot avatar Apr 02 '19 15:04 gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 9 months, 2 weeks from now. Please review their action plans below:

1) pawanjay176 has been approved to start work.

Aim to tackle mainly optimizations 1 and 2 in the github discussion comment by afck. Plan:

  1. Add the extra CanDecode and EchoHash messages in the Messages enum.
  2. Have pessimism factor g as a configurable parameter in the Broadcast struct.
  3. Change handle_value to prevent echoing a message back to proposer. Would save (N-1) messages regardless of g (original idea of the issue).
  4. Write handle_echo_hash and handle_can_decode functions and modify other handle functions accordingly.
  5. Test for expected reduction in number of messages with different values of g.
  6. Not sure if additional testing is required after implementation of proposed optimizations. Want to discuss this point further at some point.

Let me know if the plan makes sense. Cheers!

Learn more on the Gitcoin Issue Details page.

gitcoinbot avatar Apr 15 '19 06:04 gitcoinbot

Sounds good to me! :+1: The challenge will be to add all those optimizations without making the code unreasonably complex. Please also double-check the proof and update the module's documentation. It's important that the code and algorithm are easy to read, understand and audit.

afck avatar Apr 16 '19 07:04 afck

@pawanjay176 Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • [x] reminder (3 days)
  • [ ] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar Apr 21 '19 16:04 gitcoinbot


@afck What is the point of the second check here in the handle_echo function?

// Either `Ready` was already sent, or we have fewer than `N - f` `Echo`s
// for this hash: don't multicast `Ready`, but still call `compute_output`,
// since this `Echo` may have completed the output criterion.
if self.ready_sent || self.count_echos(&hash) < self.netinfo.num_correct() {
    return self.compute_output(&hash);
}

Is it just to have a Step::default() as the output of the invocation of the handle_echo till we send a Ready?

pawanjay176 avatar Apr 24 '19 10:04 pawanjay176

You mean the self.count_echos(&hash) < self.netinfo.num_correct() part? We mustn't send a Ready unless we have collected N - f Echos, so we need to return a Step without a Ready message here, yes. I think the reason why we don't just return Step::default() here instead of self.compute_output(&hash) is that we might already have collected N - f Readys and the current one could be the (f + 1)-st Echo, so we would be ready to output.

afck avatar Apr 24 '19 19:04 afck

@pawanjay176 Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • [x] reminder (3 days)
  • [ ] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar Apr 27 '19 16:04 gitcoinbot


I fixed what I think was an error in my comment above: We should only call N - 2 f + g nodes "to your left" instead of N - f + g, since we only need N - 2 f Echos. (Correct me if I'm wrong!)

afck avatar Apr 30 '19 07:04 afck

Note that by "N - f Echos plus EchoHashes in total" above I mean that N - f nodes have sent us an Echo or EchoHash or both—sorry for the confusion! (See https://github.com/poanetwork/hbbft/pull/405#discussion_r279660943.)

Also, I think a node needs to be allowed (and required, in some cases with a faulty proposer) to send multiple CanDecode messages with different hashes. (See https://github.com/poanetwork/hbbft/pull/405#discussion_r279653811.)

afck avatar Apr 30 '19 08:04 afck

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@pawanjay176 due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

  • [x] reminder (3 days)
  • [x] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar Apr 30 '19 16:04 gitcoinbot


@pawanjay176 Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • [x] reminder (3 days)
  • [ ] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar May 03 '19 16:05 gitcoinbot


Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@pawanjay176 due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

  • [x] reminder (3 days)
  • [x] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar May 06 '19 16:05 gitcoinbot


@pawanjay176 Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • [x] reminder (3 days)
  • [ ] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar May 30 '19 16:05 gitcoinbot

@pawanjay176: Please claim your reward from gitcoinbot.

afck avatar Jun 06 '19 09:06 afck

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work for 250.0 DAI (250.0 USD @ $1.0/DAI) has been submitted by:

  1. @pawanjay176

@igorbarinov please take a look at the submitted work:

  • PR by @pawanjay176

gitcoinbot avatar Jun 13 '19 04:06 gitcoinbot

⚡️ A tip worth 50.00000 DAI (50.0 USD @ $1.0/DAI) has been granted to @pawanjay176 for this issue from @. ⚡️

Nice work @pawanjay176! Your tip has automatically been deposited in the ETH address we have on file.

  • $86159.73 in Funded OSS Work Available at: https://gitcoin.co/explorer
  • Incentivize contributions to your repo: Send a Tip or Fund a PR
  • No Email? Get help on the Gitcoin Slack

gitcoinbot avatar Jun 15 '19 22:06 gitcoinbot
