rsmt2d icon indicating copy to clipboard operation
rsmt2d copied to clipboard

Public API extension proposal

Open Wondertan opened this issue 4 years ago • 10 comments

Motivation and Proposal

Currently, our block data retrieving strategy is this:

  1. Choose and request some random leaves - a quarter of all leaves.
  2. Wait until only that specific quarter is received.
  3. Force repairment.

The main drawback of the flow is that it does not use all the available leaves and instead optimistically relies only on a subjective random set of leaves. Instead, fetching block's data should try to retrieve all the leaves and repair block data on the go. This way, we would still fetch the same quarter, but naturally selected by network conditions and practical leaves availability on known peers/network.

So the flow will change to:

  1. Request all the leaves
  2. On every retrieved leaf check if we have enough for repair or wait for more.
  3. Execute repair and cancel the ongoing request.

Ideally, we should support custom fetching/repairing strategies similar to those, but for now, we can aim to support the most efficient one. However, to accomplish this we likely need to update the rsmt2d library API.

Implementation

Note: leaf == share

In case we agree that the proposed flow is a go, we can think of ways to implements it. The biggest issue for fetching/retrieving logic implementation is to know "if we have enough leaves". This can be accomplished through:

  • Implementing a similar 2D bitmask like the one in the repo and parts of crossword solving logic. But that's a logic duplication and overhead as two places in code are doing the same.
  • Calling RepairExtendedDataSquare after every new leaf retrieval. This is an overhead as it's a quite expensive operation due to sanity checks with lots of allocations in square flattening and etc.
  • Changing rsmt2d API to allow a user(fetching logic) to fill empty ExtenendDataSqaure leaf by leaf and to understand if that's enough for repair. This option does not require any custom logic on the user side and encapsulates it on the rsmt2d side allowing lib dev to squeeze max performance.

New methods/funcs

// EmptyExtendedDataSqaure creates an empty data square.
func EmptyExtendedDataSqaure(rowRoots, colRoots [][]byte, rsmt2d.Codec, TreeConstructorFn) *ExtendedDataSquare

// FillShare fills both erasure and data share at the specific coordinate.
// Returns true if has enough shares for repair.
// Concurrently unsafe.
func (eds *ExtendedDataSquare) FillShare(row, col uint32, data []byte) (bool, error)

// Repair attempts to repair an incomplete extended data square.
// Concurrently unsafe.
func (eds *ExtendedDataSquare() Repair() error 

Wondertan avatar May 26 '21 10:05 Wondertan

@liamsi, don't have enough rights to set labels/project

Wondertan avatar May 26 '21 10:05 Wondertan

ref: #12

liamsi avatar May 26 '21 12:05 liamsi

don't have enough rights to set labels/project

we should change that. The labels here are not really set up properly yet anyways. Regarding project/priority: I set this to devnet. The rationale is that this is "just" an optimization, although an important one. So we should decide on and implement it rather sooner than later. Or would you asses the priority differently @Wondertan ?

liamsi avatar May 26 '21 12:05 liamsi

Rationally agree about the priority, but my gut tells me to start this sooner than later 😄

Wondertan avatar May 27 '21 10:05 Wondertan

However, I do not entirely understand how to implement this, besides how to use the proposed API. Also, guess that @adlerjohn will do a better job here and maybe he already has some thoughts on how to.

Wondertan avatar May 27 '21 10:05 Wondertan

Implementing a similar 2D bitmask like the one in the repo and parts of crossword solving logic.

We could simply export that logic and use it from several places.

I think the API you are proposing makes sense but it is "just" an optimization we can think about later.

While at it, I'd also like to improve the readability of some methods. e.g. by reducing the number of parameters for methods. For example, minor things like introducing a struct:

type Commitments struct {
	rowRoots [][]byte
	colRoots [][]byte
}

that can be used in all these functions.

liamsi avatar May 28 '21 09:05 liamsi

Choose and request some random leaves - a quarter of all leaves.

Note that to guarantee we can reconstruct the block, ~3/4 of shares of the EDS must be downloaded. Any less and it's possible that the downloaded shares aren't sufficient to reconstruct the block with a adversarial block proposer.

adlerjohn avatar Jun 14 '21 12:06 adlerjohn

We discussed this in some other comment I think, but having an exported function that allows recovering a single row like this one:

https://github.com/lazyledger/rsmt2d/blob/62f5a18e6b134793afca1ce284739ebbc5f2dcd7/extendeddatacrossword.go#L147

would be useful.

adlerjohn avatar Jun 14 '21 12:06 adlerjohn

Note that to guarantee we can reconstruct the block, ~3/4 of shares of the EDS must be downloaded.

But it is still possible to recover with fewer shares, right?

Any less and it's possible that the downloaded shares aren't sufficient to reconstruct the block with a adversarial block proposer.

Ok, the interface I proposed assumes that. Can you pls confirm the possibility to implement it and in the best case scenario it will require only a quarter of the shares.

Wondertan avatar Jun 14 '21 14:06 Wondertan

But it is still possible to recover with fewer shares, right?

Yes. If there's nothing malicious going on, you can download exactly 1/4 of EDS shares (i.e. the entire ODS) and recover the whole EDS.

adlerjohn avatar Jun 14 '21 14:06 adlerjohn