Make the genesis block building more flexible

Open liuchengxu opened this issue 2 years ago • 15 comments

The background is that we want to inject a DigestItem into the header of the genesis block. I thought calling frame_system::deposit_log() in the genesis build would work, but it does not, because right now only the genesis storage/state is taken care of and the genesis Digest is hardcoded as the default value.

https://github.com/paritytech/substrate/blob/1d8f7bf6de1f447b706a121c83e759da807d3a01/client/service/src/client/genesis.rs#L34

Since we have already built the genesis storage, we could extract the Digest value from the state and pass it to an extended construct_genesis_block(state_root, digest), so that people can use frame_system::deposit_log() during the genesis build as normal.

https://github.com/paritytech/substrate/blob/1d8f7bf6de1f447b706a121c83e759da807d3a01/client/service/src/client/client.rs#L335-L339

By adding this feature, people can have full control of building the genesis block.
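To make the proposal concrete, here is a minimal sketch of what an extended construct_genesis_block(state_root, digest) could look like. The `Digest`, `Header` and `Block` types below are simplified stand-ins invented for illustration, not the real sp_runtime definitions, and the zeroed extrinsics root is a placeholder rather than what the linked code computes.

```rust
// Simplified stand-ins for the Substrate types (assumptions for illustration only).
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Digest {
    /// Each entry stands in for one encoded DigestItem.
    pub logs: Vec<Vec<u8>>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Header {
    pub number: u64,
    pub parent_hash: [u8; 32],
    pub state_root: [u8; 32],
    pub extrinsics_root: [u8; 32],
    pub digest: Digest,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Block {
    pub header: Header,
    pub extrinsics: Vec<Vec<u8>>,
}

/// Hypothetical extended variant: the caller supplies the digest instead of
/// the function always falling back to `Digest::default()`.
pub fn construct_genesis_block(state_root: [u8; 32], digest: Digest) -> Block {
    Block {
        header: Header {
            number: 0,
            parent_hash: [0u8; 32],
            state_root,
            // Placeholder value; a real implementation would use the root of an empty trie.
            extrinsics_root: [0u8; 32],
            digest,
        },
        extrinsics: Vec::new(),
    }
}
```

With a signature like this, whatever was deposited during the genesis build could be read back from the state and handed in as `digest`, while callers that don't care can keep passing `Digest::default()`.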

liuchengxu avatar Nov 24 '21 07:11 liuchengxu

Why do you need to add a digest in genesis?

bkchr avatar Nov 24 '21 08:11 bkchr

@nazar-pc Since you have a better big picture than me, perhaps you could help explain why we need this feature? :P

liuchengxu avatar Nov 24 '21 08:11 liuchengxu

Subspace blockchain has proof-of-archival-storage consensus. That means that in order to participate in block production, a farmer needs to prove that it stores a unique [partial] replica of the blockchain history, i.e. the blockchain itself. Pieces of the blockchain are plotted by farmers as blocks are produced by filling a special buffer. There is a threshold we call recorded_history_segment_size (RHSS): once there are at least RHSS bytes worth of blockchain history, the segment is erasure coded, prepared and plotted by farmers.

Here is the problem: before we have the very first segment, farmers have nothing to plot and thus can't produce blocks. Chicken and egg kind of problem.

We solved that previously with a workaround where we had "pre-genesis" seed data prepended to the history of the blockchain that allowed us to bootstrap the network, but it was ugly and awkward to maintain.

In the latest iteration we removed the "pre-genesis" objects and decided to increase the size of the genesis block instead, such that it alone exceeds RHSS. Since the genesis block is created unconditionally and will never be reverted, farmers will be able to plot it right away.
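As a rough illustration of why the genesis block size matters, here is a toy model of the threshold rule. `HistoryBuffer` and its methods are hypothetical names, not the actual Subspace archiver, and cutting segments at exactly RHSS bytes is an assumption of this sketch; erasure coding and plotting are left out entirely.

```rust
/// Hypothetical buffer that collects encoded blocks until a full segment is available.
pub struct HistoryBuffer {
    /// recorded_history_segment_size (RHSS) in bytes.
    rhss: usize,
    /// Bytes of blockchain history that do not yet fill a segment.
    pending: Vec<u8>,
}

impl HistoryBuffer {
    pub fn new(rhss: usize) -> Self {
        assert!(rhss > 0, "segment size must be non-zero");
        Self { rhss, pending: Vec::new() }
    }

    /// Appends one encoded block and returns every segment that became complete.
    pub fn add_block(&mut self, encoded_block: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(encoded_block);
        let mut segments = Vec::new();
        while self.pending.len() >= self.rhss {
            // Keep the first `rhss` bytes as a segment, carry the rest over.
            let rest = self.pending.split_off(self.rhss);
            segments.push(std::mem::replace(&mut self.pending, rest));
        }
        segments
    }
}
```

In this model a small genesis block yields no segment, so there is nothing to plot, while a genesis block of at least RHSS bytes yields a segment on the very first call.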

The way to achieve that was to create a custom Block struct not based on generic one and customize its fn new with addition of the digest item: https://github.com/subspace/subspace/blob/5cac7d82e049a8f6ff6b0d47d60b462ff79f977d/crates/subspace-runtime-primitives/src/lib.rs#L119-L132

It would be nice if there was another way to customize genesis block (including digest items), but the lines mentioned in the first comment are buried deep in dependencies and the only way I found to achieve the goal was to customize the Block itself.

nazar-pc avatar Nov 24 '21 13:11 nazar-pc

Can you not just have a special case that ignores genesis? Aka doesn't require that the genesis block has this data?

bkchr avatar Nov 24 '21 14:11 bkchr

The issue is the opposite. It doesn't technically matter which data, but the genesis block should contain some data that is at least RHSS in size, or else farmers have nothing to plot and the network can't bootstrap itself due to the lack of blockchain history.

nazar-pc avatar Nov 24 '21 14:11 nazar-pc

Not sure what "farmers" are, but why can they not do the following:

if header.number() == 0 {
    // use constant value
} else {
    // check digest
}

bkchr avatar Nov 24 '21 19:11 bkchr

A farmer is a separate application, similar to a miner in other protocols. It plots SCALE-encoded blocks from the Substrate-based node to disk; the plot is then used to solve challenges in order to participate in block production. It should be possible to retrieve blocks from the farmer's plot as is and SCALE-decode them back into the correct block struct, or else we are not archiving the blockchain itself.

Blocks to farmers are opaque blobs, they don't look inside and don't have a direct way to interpret them.
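A toy sketch of that round trip, to show what "opaque" means here. The `Block` and `Plot` types are hypothetical, and the encoding is a made-up fixed layout used only so the example is self-contained; it is NOT SCALE.

```rust
/// Toy block, not a real Substrate block.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Block {
    pub number: u32,
    pub payload: Vec<u8>,
}

impl Block {
    /// Toy encoding (NOT SCALE): 4-byte little-endian number followed by the payload.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = self.number.to_le_bytes().to_vec();
        out.extend_from_slice(&self.payload);
        out
    }

    /// Inverse of `encode`; returns `None` if the input is too short.
    pub fn decode(bytes: &[u8]) -> Option<Block> {
        if bytes.len() < 4 {
            return None;
        }
        let number = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        Some(Block { number, payload: bytes[4..].to_vec() })
    }
}

/// Hypothetical farmer plot: it stores blobs verbatim and never looks inside them.
#[derive(Default)]
pub struct Plot {
    blobs: Vec<Vec<u8>>,
}

impl Plot {
    /// Stores a blob and returns its index in the plot.
    pub fn store(&mut self, blob: Vec<u8>) -> usize {
        self.blobs.push(blob);
        self.blobs.len() - 1
    }

    /// Returns the stored bytes unchanged, if the index exists.
    pub fn retrieve(&self, index: usize) -> Option<&[u8]> {
        self.blobs.get(index).map(|blob| blob.as_slice())
    }
}
```

Only the node side knows how to encode and decode; the plot just hands back the same bytes, so whatever the node encodes (genesis block included) must decode back to an identical block.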

nazar-pc avatar Nov 24 '21 19:11 nazar-pc

Okay, but they could use the logic I drafted above?

bkchr avatar Nov 24 '21 21:11 bkchr

We can probably hack something like that, but then the block you retrieve from the farmer network wouldn't be the same as from Substrate-based node. To fix that we'd have to pull Substrate types (or their approximation) into the farmer to parse and fix genesis block after the fact, which is a really ugly approach. Also there are potentially different pieces of software that can create or read that archival history and all of them will have to be aware of that weird exception that genesis block is.

We already achieved what we wanted with custom Block struct, the request here is primarily to make genesis block customizable in general.

nazar-pc avatar Nov 24 '21 21:11 nazar-pc

> into the farmer to parse and fix genesis block after the fact

What? Why?

I really don't get your flow. Why do you want to modify the block? I'm speaking about a special casing in the one function that does the processing. Nothing more.

bkchr avatar Nov 25 '21 09:11 bkchr

Farmers don't archive blocks for the sake of doing so; they do it so that the history of the blockchain can be recovered from plots, which is the point of proof-of-archival-storage.

Hence the blocks being plotted, including genesis blocks, should be identical in the Substrate-based node and in the plot. Since we need the genesis block to be at least RHSS in size in the plot, it must be the same size in the node.

Node produces block -> block is archived and plotted by farmer -> block can be recovered by another node for sync process. The farmer network will replace archival nodes in our protocol. And hopefully not only in our network: we already archive Kusama and all parachains on our testnet too, so with an adapter it will be possible to sync those from the Subspace farmer network as well.

nazar-pc avatar Nov 25 '21 17:11 nazar-pc

@bkchr Happy to help if you have some ideas on implementing this generally nice-to-have feature. I can only come up with a kind of dirty way which is to extract the Digest value from the state using its storage key :(

liuchengxu avatar Dec 01 '21 13:12 liuchengxu

Sorry for the late answer. I'm going to hijack this issue now ;)

We should add a new trait:

trait BuildGenesisBlock<Block: BlockT> {
    fn build_genesis_block(self) -> Result<Block>;
}

The client should then take this trait for building the genesis block. This also means that the following code should be moved into the implementation of this trait: https://github.com/paritytech/substrate/blob/1d8f7bf6de1f447b706a121c83e759da807d3a01/client/service/src/client/client.rs#L335-L339

So we will create some type like:

struct BuildGenesisBlockWithStorage<S>(S);

// The type parameter is named `S` so that it does not shadow the `BuildStorage` trait.
impl<Block: BlockT, S: BuildStorage> BuildGenesisBlock<Block> for BuildGenesisBlockWithStorage<S> {
    fn build_genesis_block(self) -> Result<Block> {
        // build the genesis storage and block as in the code linked above
    }
}

@nazar-pc @liuchengxu you will then be able to write your own implementation of BuildGenesisBlock that just wraps the one I sketched above.
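A minimal sketch of what such a wrapper could look like. Everything here is simplified for illustration: the trait is not generic over Block: BlockT as in the proposal above, `Digest` and `Block` are stand-in types, the error type is a plain `String`, and `DefaultGenesisBuilder` and `WithExtraDigest` are hypothetical names.

```rust
// Simplified stand-ins for the Substrate types (assumptions for illustration only).
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Digest {
    /// Each entry stands in for one encoded DigestItem.
    pub logs: Vec<Vec<u8>>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Block {
    pub state_root: [u8; 32],
    pub digest: Digest,
}

/// Non-generic version of the trait proposed in the thread.
pub trait BuildGenesisBlock {
    fn build_genesis_block(self) -> Result<Block, String>;
}

/// Stand-in for the default builder: produces a genesis block with an empty digest.
pub struct DefaultGenesisBuilder {
    pub state_root: [u8; 32],
}

impl BuildGenesisBlock for DefaultGenesisBuilder {
    fn build_genesis_block(self) -> Result<Block, String> {
        Ok(Block { state_root: self.state_root, digest: Digest::default() })
    }
}

/// Hypothetical wrapper: delegates to an inner builder, then appends extra digest items.
pub struct WithExtraDigest<B> {
    pub inner: B,
    pub extra_logs: Vec<Vec<u8>>,
}

impl<B: BuildGenesisBlock> BuildGenesisBlock for WithExtraDigest<B> {
    fn build_genesis_block(self) -> Result<Block, String> {
        let mut block = self.inner.build_genesis_block()?;
        block.digest.logs.extend(self.extra_logs);
        Ok(block)
    }
}
```

The wrapper only touches the digest, so the storage-building logic stays in the inner builder and downstream projects don't need a custom Block struct just to change the genesis header.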

I will assign this issue to someone at Parity and it will be worked on in the next few days/weeks.

bkchr avatar Jan 05 '22 13:01 bkchr

@bkchr Any update on this issue? I can help here if you want.

liuchengxu avatar Sep 17 '22 02:09 liuchengxu

Yeah, you can take this @liuchengxu. Sorry for not yet having someone working on this. It slipped under the radar.

bkchr avatar Sep 17 '22 05:09 bkchr