cbc-casper
cbc-casper copied to clipboard
Optimize forkchoice to make minimal number of decisions
Issue
NOTE: This just a thought I'm putting here so I don't forget it. It probably makes no sense
Currently, messages do not arrive in order; because messages include all the messages in their justification and these messages can be pulled from the justification as the message is received (and then these are shoved into a set), a later message may be received before an earlier message (a message may be received before it's estimate).
Because the messages form a tree for a blockchain, I believe it should be possible to maintain an topological sort of the messages. In this case, I believe it is possible to optimize the forkchoice in a variety of ways (can only store the "branch" points in the tree).
Spent some time yesterday fleshing out an algorithm with Vlad. The goal here is to have the forkchoice run on the "minimal tree" necessary. For our purposes, we will say this minimal tree contains three types of nodes: the root node (can be last_finalized_block), the latest messages, and any nodes where we have a choice (there is more than one child that has a finalized message as one of its descendants).
Here is a "tree reduction" algorithm. It takes a root, a dictionary of last_messages, and the children dictionary. It modifies the children dictionary in place. NOTE: This is just for testing, and absolutely is about the least efficient thing ever. There are tons of (extremely) low hanging fruit (as there is an absurd amount of "recalculation"), but this is the easiest to reason about that I could create for now
def reduce_tree(self, root, latest_messages, children):
new_children = dict()
to_reduce = Q.Queue()
to_reduce.put(root)
while not to_reduce.empty():
current_message = to_reduce.get()
assert current_message not in new_children
new_children[current_message] = set()
if current_message not in children:
continue
for child in children[current_message]:
critical_descendant = self.get_first_critical_descendant(
child,
latest_messages,
children
)
if critical_descendant:
new_children[current_message].add(critical_descendant)
to_reduce.put(critical_descendant)
self.update_children(root, children, new_children)
def update_children(self, root, children, new_children):
assert root in new_children
self.delete_subtree(root, children)
for message in new_children:
assert message not in children
if not any(new_children[message]):
continue
children[message] = new_children[message]
def delete_subtree(self, root, children):
if root not in children:
return
for child in children[root]:
self.delete_subtree(child, children)
del children[root]
def get_first_critical_descendant(self, message, latest_messages, children):
if message in latest_messages.values():
return message
if self.is_stressed_ancestor(message, latest_messages, children):
return message
if message not in children:
return None
critical_children = set()
for child in children[message]:
critical_children.add(
self.get_first_critical_descendant(
child, latest_messages, children
)
)
if not any(critical_children):
return None
if len(critical_children) == 1:
return critical_children.pop()
if len(critical_children) == 2:
assert None in critical_children # otherwise, should be considered a stressed ancestor
critical_children.remove(None)
return critical_children.pop()
raise Exception(
"Something is wrong! To many critical children: {}".format(critical_children)
)
def is_stressed_ancestor(self, message, latest_messages, children):
if message not in children or len(children[message]) <= 1:
return False
num_children_with_lm = 0
for child in children[message]:
if self.get_num_latest_message_descendants(child, latest_messages) > 0:
num_children_with_lm += 1
if num_children_with_lm >= 2:
return True
return False
def get_num_latest_message_descendants(self, root, latest_messages):
num_lm_descendants = 0
for message in latest_messages.values():
if root.is_in_blockchain(message):
num_lm_descendants += 1
return num_lm_descendants
We argue that this tree at the ends contains either a) the last finalized block, b) latest_messages, and c) any node where the is a "choice" (aka, a "stressed parent," as it has multiple latest children/descendants). NOTE: Currently, it does not delete anything "before the root." But this is an easy fix!
Now, we can extend this algorithm so it is sequential. That is, we can maintain this minimal tree while receiving new messages to add to the tree. Let's call this function update_minimal_tree
, where it's arguments are the new_message, latest_messages, and minimal_tree.
Phases:
- Find the "most recent" parent of the new_message that is in the tree. We will call this the
root
. - For each of the children of the
root,
get the youngest common ancestor (YCA) of the new_message and that child. - If the YCA is the root, then continue. If it's the YCA is some other intermediate node, delete the child of the root, add the intermediate node as a child of the root, and add the child and new_message as children of the intermediate node.
- If this intermediate node was not added for any child, then add this new_message as a child of the root.
- Run the get_minimal_tree function with the root being the common ancestor between the validators last_message new_message and new new_message.
NOTE: This code does not work currently!
def update_minimal_tree(minimal_tree, new_message, latest_messages):
root = new_message
while current_node not in minimal_tree:
root = root.estimate
added = False
for child in minimal_tree[root]:
common_ancestor = get_common_ancestor(child, new_message)
if common_ancestor != root:
minimal_tree[root].remove(child)
minimal_tree[root].add(common_ancestor)
minimal_tree[common_ancestor] = set()
minimal_tree[common_ancestor].add(new_message)
minimal_tree[common_ancestor].add(child)
break
if not added:
minimal_tree[root].add(new_message)
last_message_from_val = new_message.justification[new_message.sender]
common_ancestor = get_common_ancestor(new_message, last_message_from_val)
common_ancestor = common_ancestor.estimate
# old message should be removed from tree during this function call, sometimes (there are cases where it is not)!
reduce_tree(common_ancestor, latest_messages, children)
def common_ancestor(message_one, message_two):
if message_one == message_two:
return message_one
min_height = min(message_one.height, message_two.height)
if message_one.height < message_two.height:
message_two = estimate_at_height(message_two, message_one.height)
else:
message_one = estimate_at_height(message_one, message_two.height)
while True:
message_one = message_one.estimate
message_two = message_two.estimate
if message_one == message_two:
return message_one
def estimate_at_height(message, height):
assert height <= message.height and height >= 0
while message.height != height:
message = message.estimate
return message
This algorithm requires that messages arrive "in-order." Meaning, a message should always arrive after its estimate. This is necessary as the way messages are added will break otherwise (think, common ancestors get funky).