celestia-node
[Feature Request]: Light Node Header Trimming
Implementation ideas
Motivation
The header stored by a light node can get quite large for large blocks. Take k = 128 (~8 MB blocks): we then have 256 row roots and 256 column roots, which already amounts to 96 bytes * 512 = 49152 bytes per header. At 4 blocks per minute, one week's worth of headers is 49152 * 4 * 60 * 24 * 7 = 1 981 808 640 bytes ~ 2 GB. Each square size increase doubles the storage requirements. Can we have constant storage requirements that do not depend on the square size?
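The arithmetic above can be reproduced with a small sketch. The 96-byte root size and the rate of 4 blocks per minute are taken from the numbers in this issue; the function names are made up for illustration.

```python
# Assumptions from the text above: 96-byte roots, 4 blocks per minute.
ROOT_SIZE_BYTES = 96
BLOCKS_PER_MINUTE = 4


def roots_bytes_per_header(k: int) -> int:
    """Bytes of row and column roots for an original square of width k."""
    extended_width = 2 * k  # the extended square has 2k rows and 2k columns
    return ROOT_SIZE_BYTES * 2 * extended_width  # row roots + column roots


def weekly_bytes(k: int) -> int:
    """Bytes of roots accumulated over one week of headers."""
    blocks_per_week = BLOCKS_PER_MINUTE * 60 * 24 * 7
    return roots_bytes_per_header(k) * blocks_per_week


if __name__ == "__main__":
    for k in (64, 128, 256):
        print(k, roots_bytes_per_header(k), weekly_bytes(k))
```

Doubling k doubles both numbers, which is the growth this proposal tries to avoid.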
Validator set optimization
Another big part of the header size is the validator keys. Currently, all validators are sent over the wire at each height. This can be reduced by sending only the diffs of the validator set. If the set does not change from one height to the next, nothing has to be sent. This optimization has already been discussed several times. It could help with both bandwidth and storage.
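A minimal sketch of the diff idea, assuming a validator set can be modelled as a mapping from public key to voting power. This representation and the function names are hypothetical and are not the celestia-node or consensus-layer types.

```python
def diff_validator_sets(old: dict, new: dict) -> dict:
    """Return only what changed between two validator sets."""
    removed = sorted(key for key in old if key not in new)
    # Entries that are new or whose voting power changed.
    updated = {key: power for key, power in new.items() if old.get(key) != power}
    return {"removed": removed, "updated": updated}


def apply_diff(old: dict, diff: dict) -> dict:
    """Rebuild the new validator set from the old set and a diff."""
    result = {key: power for key, power in old.items() if key not in diff["removed"]}
    result.update(diff["updated"])
    return result


if __name__ == "__main__":
    old = {"val-a": 10, "val-b": 20, "val-c": 30}
    new = {"val-a": 10, "val-b": 25, "val-d": 5}
    diff = diff_validator_sets(old, new)
    print(diff)
    print(apply_diff(old, diff) == new)
```

When the set is unchanged, the diff is empty, so nothing has to be sent or stored for that height.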
When does a light node need access to the Header?
- We need the header to be able to DAS.
- We need the header to be able to verify fraud proofs (FPs).
- We need the header to be able to verify inclusion proofs.
Let's keep the header until we have successfully sampled its height, and prune it immediately afterwards. We keep the header hash of each pruned header, so that we have a chain of header hashes that we can reference later. That takes care of the first case, but how do we take care of the second and third? We can take care of the second by adding the header to the FP. This way, an honest full node can send the header together with the FP, and when the light node receives it, it first verifies the header against the stored hash and then verifies the FP.
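The prune-after-sampling flow could look roughly like the sketch below. It assumes a header can be treated as opaque bytes and uses SHA-256 as a stand-in for the real header hash; the class and method names are hypothetical.

```python
import hashlib


def header_hash(header_bytes: bytes) -> bytes:
    # Stand-in for the real header hash; SHA-256 is an assumption here.
    return hashlib.sha256(header_bytes).digest()


class PrunedHeaderStore:
    """Keeps full headers only until sampled, then only their 32-byte hashes."""

    def __init__(self) -> None:
        self.hashes = {}   # height -> header hash, kept after pruning
        self.headers = {}  # height -> full header, kept until sampled

    def add(self, height: int, header_bytes: bytes) -> None:
        self.hashes[height] = header_hash(header_bytes)
        self.headers[height] = header_bytes

    def mark_sampled(self, height: int) -> None:
        # Sampling succeeded, so the full header can be dropped.
        self.headers.pop(height, None)

    def verify_attached_header(self, height: int, header_bytes: bytes) -> bool:
        """Check a header that arrived with an FP against the stored hash."""
        expected = self.hashes.get(height)
        return expected is not None and header_hash(header_bytes) == expected


if __name__ == "__main__":
    store = PrunedHeaderStore()
    store.add(10, b"header-at-height-10")
    store.mark_sampled(10)
    print(store.verify_attached_header(10, b"header-at-height-10"))
    print(store.verify_attached_header(10, b"forged-header"))
```

Only after `verify_attached_header` succeeds would the light node go on to verify the FP itself against the attached header.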
For inclusion proofs, there are three possible solutions:
- We send the header with the inclusion proof.
- We save the data root per header, which gives us a reference to extend the inclusion proof from the row root to the data root. Maybe we can also change the header hash so that the data root does not have to be saved separately, for example by changing hash(Header) into hash(hash(dataroot) + hash(rest of Header)), so that we have a mini inclusion proof of the data root to the header hash.
- We have a timer for when we expect inclusion proofs, and we prune the header once we have received them. This would be specific to the logic that is built on top of Celestia.
This means we can reduce the 49152 bytes to 32 bytes per height, a reduction of roughly 1500x (49152 / 32 = 1536). The 32 bytes stay constant regardless of the square size. We can prune the header hashes after the fraud proof window (FPW).
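The restructured header hash from the second option, hash(hash(dataroot) + hash(rest of Header)), can be sketched as follows. SHA-256 and the byte encodings are assumptions for illustration; this is not the current header hashing scheme.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_header_hash(data_root: bytes, rest_of_header: bytes) -> bytes:
    """Proposed form: hash(hash(dataroot) + hash(rest of header))."""
    return h(h(data_root) + h(rest_of_header))


def verify_data_root(stored_header_hash: bytes, data_root: bytes, rest_hash: bytes) -> bool:
    """Mini inclusion proof: the prover supplies data_root and hash(rest)."""
    return h(h(data_root) + rest_hash) == stored_header_hash


if __name__ == "__main__":
    data_root = b"\x01" * 32
    rest = b"all other header fields, serialized"
    stored = split_header_hash(data_root, rest)  # the 32 bytes the light node keeps
    print(verify_data_root(stored, data_root, h(rest)))
    print(verify_data_root(stored, b"\x02" * 32, h(rest)))
```

With this layout, a light node that kept only the 32-byte header hash can still check a data root, given one extra 32-byte value from the prover.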
Possible Downsides
New light nodes joining the network will have a harder time syncing, as the headers are no longer available throughout the network. Each node could keep a random sample of headers so that, on average, network health stays good and syncing time stays the same.
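One way to read "keep a random sample of headers" is that each node independently retains every header with some fixed probability. The sketch below assumes this; the probability value and function name are hypothetical.

```python
import random


def select_headers_to_keep(heights, keep_probability: float, rng: random.Random) -> list:
    """Pick the heights whose full headers this node retains after sampling."""
    return [height for height in heights if rng.random() < keep_probability]


if __name__ == "__main__":
    # Each node should use its own randomness so that different nodes
    # retain different headers.
    rng = random.Random()
    kept = select_headers_to_keep(range(1, 1001), 0.05, rng)
    print(len(kept), "of 1000 headers retained")
```

With n nodes each keeping a header with probability p, a given header is held by about n * p nodes on average, so p trades per-node storage against how easily a syncing node can find a peer that still has it.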
Deleting Samples
We could delete samples after the block reconstruction time, as they are no longer needed after that.