firewood icon indicating copy to clipboard operation
firewood copied to clipboard

Look into the best way to run a callback on the parent node when traversing the trie

Open richardpringle opened this issue 2 years ago • 3 comments

from #380

Even better than that comment, I can't imagine the case where we are using both callbacks, so this is probably an enum:

enum Callback {
    None,
    Infix(FnMut(ObjRef<'a>, u8),
    PostFix(FnMut(ObjRef<'a>, u8)
};
fn get_node_by_key_with_callback<'a, K: AsRef<[u8]>>(
        &'a self,
        mut node_ref: ObjRef<'a>,
        key: K,
        callback: Callback,

Originally posted by @rkuris in https://github.com/ava-labs/firewood/pull/380#discussion_r1409949526

The above isn't actually possible since we can't take ownership of the node when traversing the trie. After looking into this a little further, it seems like the only difference between the two cases is that the traversed-node always goes through the callback when called at the start, and won't get passed to the callback when early returning.

If we renamed this method to traverse, and return bool that's true when all nibbles matched on nodes and false otherwise, we could get all the behaviour that we want at the call-site. That might require popping off the back of a Vec of traversed nodes, but that logic will be hidden by wrapper functions.

richardpringle avatar Nov 30 '23 22:11 richardpringle

On second thought, it would be way better to just implement an iterator (stream?)

// need two lifetimes otherwise the key has to live as long as the trie itself
fn get_keyed_iter<'a, 'b, K: AsRef<[u8]> + 'b>(&'a self, root: DiskAddress, key: K) -> KeyedIter<'a, 'b> { .. }

Then getting all the parents and the final node would look like this:

let root_node = self.get_node(root_addr)?;

let (parents, found_node) = merkle
    .get_keyed_iter(root_addr, &key)
    // `current` would be Option<ObjRef<Node>> 
    // depending on whether or not there is a node at `nib`
    .fold((Vec::new(), Some(root_node)), |(mut parents, parent), (current, nib)| {
        if let Some(node) = parent {
            parents.push((node, nib))
        }
    
        (parents, current)
    });

again, I want to emphasize that this will be surrounded by a wrapper function.

The above has the same behaviour as pushing all the parents and returning Option<NodeRef> for the found-node.

For proofs, we skip the root node anyway

let nodes = merkle.get_keyed_iter(root_addr, &key).filter_map(|(node, _)| node).collect();

We could actually go further with the latter case such that and create the hash-map directly which saves us a potentially large allocation (depending on the size of the key).

let proofs = merkle
    .get_keyed_iter(root_addr, &key)
    .filter_map(|(node, _)| node)
    .map(|node| node.get_encoded(self.store.as_ref()).to_vec())
    .map(|encoded| (<[u8; TRIE_HASH_LEN]>::from(sha3::Keccak256::digest(&endoded)), encoded))
    .collect();
    
Ok(proofs)

Further FYI, Iteraotr::Item: (Option<ObjRef<'a, Node>, u8), and it would always return Some on the first call, as long as there is a root node. If the root-node has no children, it would return Some((None, 0))

richardpringle avatar Nov 30 '23 23:11 richardpringle

Yes, I also think a node iterator would really solve this problem! We could even refactor the streaming iterator to do exactly this. IMO the priority of this isn't all that high right now though.

Callbacks, while supported in rust, aren't as common as iterators.

rkuris avatar Dec 01 '23 00:12 rkuris

Although I doubt this is still an issue, it should be checked in the new design.

rkuris avatar Jun 10 '24 20:06 rkuris

Fixed in new design

rkuris avatar Sep 29 '24 18:09 rkuris