celestia-app icon indicating copy to clipboard operation
celestia-app copied to clipboard

Adjust share size to match the spec and add namespaces to each Message

Open evan-forbes opened this issue 3 years ago • 10 comments

Namespaces are now kept in the actual message to ensure that they are recoverable, see the spec PR. This should also be accounted for in the app, specifically SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES

evan-forbes avatar Mar 19 '21 01:03 evan-forbes

To clarify:

  1. the share size that is erasure coded is still SHARE_SIZE
  2. SHARE_SIZE - NAMESPACE_ID_BYTES - SHARE_RESERVED_BYTES is how requests with a reserved namespace ID are chunked (txs, intermediate state roots, evidence)
  3. SHARE_SIZE - NAMESPACE_ID_BYTES is how messages are chunked

adlerjohn avatar Mar 19 '21 01:03 adlerjohn

Yeah, that should definitely be clarified.

When generating the commitments for SignedTransactionDataPayForMessage, namespaces are not added to each share after chunking the message. This is what specifically needs to be fixed.

The (WIP) ll-core PR #235 is using a const AdjustedMessageSize, and I think the app should import this.

evan-forbes avatar Mar 19 '21 04:03 evan-forbes

There is also something else to consider here: the spec currently assumes SHARE_RESERVED_BYTES to be one byte. That almost works because it is an index pointing to some starting point in a 256 bytes long slice. "almost" because if the index is > 127, its encoding currently will actually be 2 bytes long: https://play.golang.org/p/C_wN1E-lKRY

That is bc varint encoding encodes 7 bits at a time. See https://golang.org/pkg/encoding/binary/ for details on varint encoding. The implementation uses this encoding for unsigned ints (lengths/indexes), as almost all protobuf implementations will support this: https://developers.google.com/protocol-buffers/docs/techniques#streaming

Also, note that independent of the above issue, the current implementation does not use this index but rather length prefixed all messages (see https://github.com/lazyledger/lazyledger-core/issues/234)

liamsi avatar Mar 19 '21 08:03 liamsi

So...in my head canon I assumed it would just be 1 byte, but you're correct that canonical serialization is prescribed

(the * in the example layout figure below) is the starting byte of the length of the canonically serialized first request that starts in the share, or 0 if there is none, as a canonically serialized big-endian unsigned integer

Now we have a problem: if we use canonical protobuf, this could be 1 or 2 bytes.

  1. We can't always use 2 bytes because that's invalid canonical protobuf.
  2. We don't really want to use a variable number of bytes because it makes the chunking logic more complex...but we could.
  3. We could just not use canonical protobuf and instead just use the uint8 byte directly.

adlerjohn avatar Mar 19 '21 12:03 adlerjohn

We could just not use canonical protobuf and instead just use the uint8 byte directly.

I think this will be our best choice. It would work for the index that points to the start of a message. So it could always fit in one byte. IMO, the other thing, the length prefixing, should stay varint encoded. Simply because we don not really want to cap how long messages can get, right?

liamsi avatar Mar 19 '21 12:03 liamsi

Yes, the length can still be varint encoded because it doesn't affect chunking logic (there's only a single length).

adlerjohn avatar Mar 19 '21 13:03 adlerjohn

Can this be closed?

liamsi avatar Dec 05 '21 22:12 liamsi

I'm not sure, I asked about this during the off-site and didn't get a conclusive answer on whether it was implemented. cc @evan-forbes

adlerjohn avatar Dec 05 '21 23:12 adlerjohn

Sorry, I should have been clearer. The commitments that are generated do include the namespace of the message, but they do not add the length delimiter.

We also need more testing to compare the generated commitments with the subtree roots of an actual square, to make sure that we are doing this important step correctly in the future. This would also help test #49

https://github.com/celestiaorg/celestia-app/blob/1fcccc414127b1dd8399ad35051c8f669a98aa56/x/payment/types/payformessage.go#L140-L147

evan-forbes avatar Dec 06 '21 01:12 evan-forbes

Was thinking about his more, and while we could use code from core that already exists (done on the evan/user-core-code-to-split-shares branch) https://github.com/celestiaorg/celestia-app/blob/ad49814f62879319d60d722beb237bbdc21ba9fa/x/payment/types/payformessage.go#L111-L127

we could also move the commitment creation code to celestia-core, that way it would easier to roundtrip test

evan-forbes avatar Jan 19 '22 01:01 evan-forbes

I don't think we need this any longer, and am unsure of the context

evan-forbes avatar Jan 31 '23 22:01 evan-forbes