
Finding the hash of a program in Fuel VM: hash the bytecode or use a MAST root?

Open • partylikeits1983 opened this issue 7 months ago • 1 comment

I am curious how the hash of a Fuel predicate is computed. As I understand it, the predicate hash is computed here: https://github.com/FuelLabs/fuel-vm/blob/2604237c9ff4a755e48b40b2c006711d22cff19f/fuel-tx/src/contract.rs#L72

using the root_from_code function:

    pub fn root_from_code<B>(bytes: B) -> Bytes32
    where
        B: AsRef<[u8]>,
    {
        let mut tree = BinaryMerkleTree::new();
        bytes.as_ref().chunks(LEAF_SIZE).for_each(|leaf| {
            // If the bytecode is not a multiple of LEAF_SIZE, the final leaf
            // should be zero-padded rounding up to the nearest multiple of 8
            // bytes.
            let len = leaf.len();
            if len == LEAF_SIZE || len % MULTIPLE == 0 {
                tree.push(leaf);
            } else {
                let padding_size = len.next_multiple_of(MULTIPLE);
                let mut padded_leaf = [PADDING_BYTE; LEAF_SIZE];
                padded_leaf[0..len].clone_from_slice(leaf);
                tree.push(padded_leaf[..padding_size].as_ref());
            }
        });

        tree.root().into()
    }

Looking at this implementation, I wonder why a Merkle tree is used to compute the hash of the predicate. The compiled bytecode of the program is passed in, split into chunks of at most LEAF_SIZE bytes, each chunk is pushed into the Merkle tree as a leaf (the last one zero-padded up to a multiple of MULTIPLE bytes), and the root of the tree is returned.
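To make the chunking step concrete, here is a minimal sketch of just the leaf construction, with the tree itself left out. The helper name `leaves_from_code` is hypothetical, and the constant values (16 KiB leaves, padding to 8 bytes with zero bytes) are my assumptions about what fuel-tx defines, so please check them against the source.

```rust
// Assumed values for illustration only; the real constants live in fuel-tx.
const LEAF_SIZE: usize = 16 * 1024;
const MULTIPLE: usize = 8;
const PADDING_BYTE: u8 = 0;

/// Hypothetical helper: returns the leaves that root_from_code would push
/// into the Merkle tree, without building the tree.
fn leaves_from_code(bytes: &[u8]) -> Vec<Vec<u8>> {
    bytes
        .chunks(LEAF_SIZE)
        .map(|leaf| {
            // Round the leaf length up to the next multiple of MULTIPLE.
            // Full leaves are unchanged because LEAF_SIZE is itself a
            // multiple of MULTIPLE under the assumed values.
            let padded_len = (leaf.len() + MULTIPLE - 1) / MULTIPLE * MULTIPLE;
            let mut padded = leaf.to_vec();
            padded.resize(padded_len, PADDING_BYTE);
            padded
        })
        .collect()
}
```

Under these assumptions, bytecode of `LEAF_SIZE + 5` bytes yields two leaves: one full leaf and one 8-byte leaf holding 5 bytes of code followed by 3 zero bytes. The leaf boundaries depend only on byte offsets, not on the structure of the program.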

Using a Merkle tree to hash a program is superficially similar to a Merkelized abstract syntax tree (MAST). A MAST root, however, is the hash of an entire program in which the leaves of the tree are the "subprograms" of that program. MAST was proposed for Bitcoin in BIP 114: https://github.com/bitcoin/bips/blob/master/bip-0114.mediawiki

My question is: why is a Merkle tree used to compute the hash of a Fuel program? It seems that root_from_code does not need a Merkle tree at all, since all it does is divide the bytecode into chunks, insert them into a Merkle tree, and return the root. In this implementation, the bytecode in a single leaf could come from different logic flows in the program, so the leaves do not correspond to subprograms as they would in a MAST.

Why not just apply a standard hash function such as keccak256 to the whole bytecode? What does the Merkle tree add in this case?
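For reference, this is a toy sketch of the two options I am comparing. It is not the fuel-merkle algorithm: `toy_hash` uses the standard library's `DefaultHasher` as a non-cryptographic stand-in for a real hash such as SHA-256 or keccak256, and all function names and the pairing rule are hypothetical choices made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Toy stand-in for a real hash function (NOT cryptographic).
fn toy_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(data);
    h.finish()
}

// Option A: hash the whole bytecode in one pass.
fn flat_hash(code: &[u8]) -> u64 {
    toy_hash(code)
}

// Option B: hash fixed-size chunks, then combine pairwise up to one root.
// Assumes chunk_size > 0.
fn toy_merkle_root(code: &[u8], chunk_size: usize) -> u64 {
    let mut level: Vec<u64> = code.chunks(chunk_size).map(toy_hash).collect();
    if level.is_empty() {
        return toy_hash(&[]);
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [l, r] => {
                    // Parent = hash(left || right).
                    let mut buf = [0u8; 16];
                    buf[..8].copy_from_slice(&l.to_be_bytes());
                    buf[8..].copy_from_slice(&r.to_be_bytes());
                    toy_hash(&buf)
                }
                // An unpaired node is carried up unchanged in this toy version.
                [single] => *single,
                _ => unreachable!(),
            })
            .collect();
    }
    level[0]
}
```

Both functions commit to every byte of the input, so flipping any byte changes either result. The only structural difference in this sketch is that option B goes through per-chunk hashes on the way to the root.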

partylikeits1983 · Jul 24 '24 12:07