Block body serialization strategy (separate vecs of outputs and rangeproofs)
A discussion on the PIBD PR got me thinking more about full blocks and how we serialize them - https://github.com/mimblewimble/grin/pull/3453#pullrequestreview-507251519
We store outputs, rangeproofs and kernels in 3 separate PMMR data structures, i.e. outputs and their associated rangeproofs are maintained in separate PMMR structures.
Ignoring the PMMR hashes and just focusing on the (leaf) data itself, we store them in the PMMR data files as follows -
[O1, O2, O3, ..., On]
[RP1, RP2, RP3, ..., RPn]
[K1, K2, K3, ..., Kn]
Ignoring pruning, these are strictly append-only, ordered lists of sequential data.
Blocks themselves consist of a block header and a block body containing vecs of inputs, outputs and kernels. In blocks we maintain the output (identifier) and the output rangeproof together as a tuple (O, RP).
The block body for the next block would then consist of the following data -
[(On+1, RPn+1), (On+2, RPn+2), ...]
[Kn+1, Kn+2, ...]
There is an awkward mismatch here due to the fact that we represent (and serialize/deserialize) outputs and their associated rangeproofs as tuples.
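To make the mismatch concrete, here is a minimal sketch of the intermediate step it implies. All of the types and names below (`OutputId`, `RangeProof`, `Kernel`, `TupleBody`, `PmmrData`, `apply_tuple_body`) are hypothetical stand-ins for illustration, not the actual grin types, and plain `Vec`s stand in for the PMMR data files.

```rust
// Hypothetical, simplified stand-ins: opaque byte vectors replace the real
// output identifier, rangeproof and kernel types.
#[derive(Clone, Debug, PartialEq)]
struct OutputId(Vec<u8>);
#[derive(Clone, Debug, PartialEq)]
struct RangeProof(Vec<u8>);
#[derive(Clone, Debug, PartialEq)]
struct Kernel(Vec<u8>);

/// Block body in the tuple layout: each output travels with its rangeproof.
struct TupleBody {
    outputs: Vec<(OutputId, RangeProof)>,
    kernels: Vec<Kernel>,
}

/// Stand-in for the three append-only PMMR data files (leaf data only,
/// ignoring hashes and pruning).
#[derive(Default)]
struct PmmrData {
    outputs: Vec<OutputId>,
    rangeproofs: Vec<RangeProof>,
    kernels: Vec<Kernel>,
}

impl PmmrData {
    /// Appending a tuple-based body needs an intermediate step that splits
    /// every (O, RP) pair into two separate lists.
    fn apply_tuple_body(&mut self, body: TupleBody) {
        let (outs, proofs): (Vec<OutputId>, Vec<RangeProof>) =
            body.outputs.into_iter().unzip();
        self.outputs.extend(outs);
        self.rangeproofs.extend(proofs);
        self.kernels.extend(body.kernels);
    }
}
```

The `unzip` call is the "separate the outputs and rangeproofs" step; it is cheap, but it is work that exists only because the block layout and the storage layout disagree.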
If we instead serialized block bodies such that outputs and rangeproofs were separate lists, we could store these closer to how we store them in the PMMR structures.
[On+1, On+2, ...]
[RPn+1, RPn+2, ...]
[Kn+1, Kn+2, ...]
Writing outputs/rangeproofs/kernels from a new block into the PMMR data files could then be done directly without any intermediate steps to separate the outputs and rangeproofs.
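A rough sketch of what that direct write could look like, again using hypothetical stand-in types (`OutputId`, `RangeProof`, `Kernel`, `SplitBody`, `PmmrData`) rather than the real grin ones. One assumption worth calling out: once outputs and rangeproofs are separate lists, the pairing is no longer enforced by the type, so the sketch checks that the two lists have the same length before appending.

```rust
// Hypothetical, simplified stand-ins: opaque byte vectors replace the real
// output identifier, rangeproof and kernel types.
#[derive(Clone, Debug, PartialEq)]
struct OutputId(Vec<u8>);
#[derive(Clone, Debug, PartialEq)]
struct RangeProof(Vec<u8>);
#[derive(Clone, Debug, PartialEq)]
struct Kernel(Vec<u8>);

/// Block body in the proposed layout: three lists mirroring the PMMR files.
struct SplitBody {
    outputs: Vec<OutputId>,
    rangeproofs: Vec<RangeProof>,
    kernels: Vec<Kernel>,
}

/// Stand-in for the three append-only PMMR data files (leaf data only).
#[derive(Default)]
struct PmmrData {
    outputs: Vec<OutputId>,
    rangeproofs: Vec<RangeProof>,
    kernels: Vec<Kernel>,
}

impl PmmrData {
    /// Each list is appended as-is, with no per-element splitting.
    /// The length check replaces the pairing the tuple layout gave for free.
    fn apply_split_body(&mut self, body: SplitBody) -> Result<(), String> {
        if body.outputs.len() != body.rangeproofs.len() {
            return Err(format!(
                "{} outputs but {} rangeproofs",
                body.outputs.len(),
                body.rangeproofs.len()
            ));
        }
        self.outputs.extend(body.outputs);
        self.rangeproofs.extend(body.rangeproofs);
        self.kernels.extend(body.kernels);
        Ok(())
    }
}
```

The trade-off in this sketch is that the (O, RP) association becomes positional: the i-th rangeproof belongs to the i-th output, exactly as in the PMMR data files.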
This is not a high priority issue and it's not even clear whether this is a problem. But PIBD got me thinking about this and whether it makes sense to maintain blocks in a way that is closer to the underlying PMMR data structure itself.