Block Header and Merkle Proofs
For this pr, I define:
- Client - A normal node peer storing, if wished, the entire chain of concern. For example, a desktop client node.
- Light Client - Node peers with limited and constrained resources which are only required to store the Block-Headers. For example, a web or mobile wallet.
- Stateless Client - For example, a smart contract in a chain in the outside world. They only need a single good known trusted hash to verify the zero knowledge proofs till that point in time.
A proper Block-Header is a single point of consensus, among all peers, which acts as a snapshot of the entire history of the respective blockchain state, up until the corresponding block height, in a compressed form.
It should be:
- A unique identity for a block.
- Compact, Precise and Elegant as it will be the face of the blockchain for the outside world.
- Simple and easy to develop around protocols as it will act as a bridge between the outside world.
- Able to verify the entire history of the chain state with minimal amount of resources possible in at most sub-linear complexity for both the time and the space.
- Able to allow the light clients to prune the chain state.
Other than these, the surrounding blockchain should support quick, generation of validity proofs, adding a new block with its header to the chainstate in sub-linear time, small proof sizes, etc.
Fortunately, unlike other blockchains, the way linera is, we don't have to send the blockheader to other peers. It can be derived at the either ends, saving the block space, as its not part of the consensus(yet, there is a small space where it can be, for the better).
And also improving on the recent implementation, we don't have to send the block execution outcomes, as they are deterministic and dependent on the inputs, greatly saving the precious block space.
But, what's the incentive for the clients to cryptographically commit to the block-headers?
The state of the chain can be check-pointed, a snapshot, and the history of the chain before that point can be safely pruned.
Further, to maintain those, likely to be millions of, transaction hashes, a index data structure is needed. After careful considerations, a merkle trie is a most likely candidate. A trie like data structure is used by every major blockchain, Ethereum uses a modified merkle patricia tree, Solana uses a form of concurrent tree, etc. And we require a trie data structure at many places anyway, like in linera-views, and in future to manage Accounts and chain ids.
For our purposes, a best fit would an Adaptive Radix Merkle Trie (ART). Details about it can be found in the original paper[1] below.
Adding a new block to the index require minimal samples
Reads are O(1)
Writes are O(1)
Adding a new block is O(logn)
Generating a proof is O(logn)
Verifying a proof is O(logn)
Proof size is O(logn)
Lastly, a merkle skip list is used to calculate the canonical block hash, which is the concatenation of the current root hash with the consensus hashes N - 1, N - 2, N - 4, N - 8 ... and so on. The benefits to this can be many due to the magical nature of numbers and cryptrography, but two main reasons are
- Early detection of forks and malicious peers at the point of fork.
- Possibility of Existence of stateless clients.
Links
- [1] https://db.in.tum.de/~leis/papers/ART.pdf