rsmt2d
rsmt2d copied to clipboard
Public API extension proposal
Motivation and Proposal
Currently, our block data retrieving strategy is this:
- Choose and request some random leaves - a quarter of all leaves.
- Wait until only that specific quarter is received.
- Force repairment.
The main drawback of the flow is that it does not use all the available leaves and instead optimistically relies only on a subjective random set of leaves. Instead, fetching block's data should try to retrieve all the leaves and repair block data on the go. This way, we would still fetch the same quarter, but naturally selected by network conditions and practical leaves availability on known peers/network.
So the flow will change to:
- Request all the leaves
- On every retrieved leaf check if we have enough for repair or wait for more.
- Execute repair and cancel the ongoing request.
Ideally, we should support custom fetching/repairing strategies similar to those, but for now, we can aim to support the most efficient one. However, to accomplish this we likely need to update the rsmt2d library API.
Implementation
Note: leaf == share
In case we agree that the proposed flow is a go, we can think of ways to implements it. The biggest issue for fetching/retrieving logic implementation is to know "if we have enough leaves". This can be accomplished through:
- Implementing a similar 2D bitmask like the one in the repo and parts of crossword solving logic. But that's a logic duplication and overhead as two places in code are doing the same.
- Calling
RepairExtendedDataSquareafter every new leaf retrieval. This is an overhead as it's a quite expensive operation due to sanity checks with lots of allocations in square flattening and etc. - Changing rsmt2d API to allow a user(fetching logic) to fill empty ExtenendDataSqaure leaf by leaf and to understand if that's enough for repair. This option does not require any custom logic on the user side and encapsulates it on the rsmt2d side allowing lib dev to squeeze max performance.
New methods/funcs
// EmptyExtendedDataSqaure creates an empty data square.
func EmptyExtendedDataSqaure(rowRoots, colRoots [][]byte, rsmt2d.Codec, TreeConstructorFn) *ExtendedDataSquare
// FillShare fills both erasure and data share at the specific coordinate.
// Returns true if has enough shares for repair.
// Concurrently unsafe.
func (eds *ExtendedDataSquare) FillShare(row, col uint32, data []byte) (bool, error)
// Repair attempts to repair an incomplete extended data square.
// Concurrently unsafe.
func (eds *ExtendedDataSquare() Repair() error
@liamsi, don't have enough rights to set labels/project
ref: #12
don't have enough rights to set labels/project
we should change that. The labels here are not really set up properly yet anyways. Regarding project/priority: I set this to devnet. The rationale is that this is "just" an optimization, although an important one. So we should decide on and implement it rather sooner than later. Or would you asses the priority differently @Wondertan ?
Rationally agree about the priority, but my gut tells me to start this sooner than later 😄
However, I do not entirely understand how to implement this, besides how to use the proposed API. Also, guess that @adlerjohn will do a better job here and maybe he already has some thoughts on how to.
Implementing a similar 2D bitmask like the one in the repo and parts of crossword solving logic.
We could simply export that logic and use it from several places.
I think the API you are proposing makes sense but it is "just" an optimization we can think about later.
While at it, I'd also like to improve the readability of some methods. e.g. by reducing the number of parameters for methods. For example, minor things like introducing a struct:
type Commitments struct {
rowRoots [][]byte
colRoots [][]byte
}
that can be used in all these functions.
Choose and request some random leaves - a quarter of all leaves.
Note that to guarantee we can reconstruct the block, ~3/4 of shares of the EDS must be downloaded. Any less and it's possible that the downloaded shares aren't sufficient to reconstruct the block with a adversarial block proposer.
We discussed this in some other comment I think, but having an exported function that allows recovering a single row like this one:
https://github.com/lazyledger/rsmt2d/blob/62f5a18e6b134793afca1ce284739ebbc5f2dcd7/extendeddatacrossword.go#L147
would be useful.
Note that to guarantee we can reconstruct the block, ~3/4 of shares of the EDS must be downloaded.
But it is still possible to recover with fewer shares, right?
Any less and it's possible that the downloaded shares aren't sufficient to reconstruct the block with a adversarial block proposer.
Ok, the interface I proposed assumes that. Can you pls confirm the possibility to implement it and in the best case scenario it will require only a quarter of the shares.
But it is still possible to recover with fewer shares, right?
Yes. If there's nothing malicious going on, you can download exactly 1/4 of EDS shares (i.e. the entire ODS) and recover the whole EDS.