rsmt2d icon indicating copy to clipboard operation
rsmt2d copied to clipboard

`UnmarshalJSON` is limited to the default Tree

Open Wondertan opened this issue 2 years ago • 10 comments

Problem

When we deserialize the EDS we hardcode the default tree. This is problematic because computed Row and Col roots of the deserialized eds will be utterly different if another custom tree is used.

Solutions

  1. Have a global tree constructor var/registry that UnmarshalJSON would use instead
    • Users would then register their trees by name
      • The name would be serialized in the json and deserialization would then get the proper tree by the name
    • This is exactly what happens with the codec.
  2. UnmarshalEDS func that accepts the tree.
    • This works in theory, but https://github.com/filecoin-project/go-jsonrpc/blob/17a35772b24f59ceee25cc8d8177a2b255a659ac/client.go#L614 blocks it
    • If we contribute an option that custom Unmarshall, then we would be able to inject the one for EDS

Wondertan avatar Dec 11 '23 14:12 Wondertan

The first solution is bulletproof and requires minimum work outside of rsmt2d, as well the same approach is used for Codec, so we should go with it, imo

Wondertan avatar Dec 11 '23 14:12 Wondertan

We reencountered this issue while debugging a different bug. The UnmarshalJSON was not using the NMTWrapper, leading to a different data root in tests, which became a major time wasting redherring for the original bug. Besides, this issue means JSON serialization doesn't work.

The original problem was solved, but then reverted by https://github.com/celestiaorg/rsmt2d/pull/295. However, I don't understand the motivation behind the revert, @rootulp. You say it leaked implementation detail, but it's unclear what exactly was leaking there to warrant a revert without a follow-up fix.

Wondertan avatar Sep 11 '25 14:09 Wondertan

You say it leaked implementation detail, but it's unclear what exactly was leaking there to warrant a revert without a follow-up fix.

From https://github.com/celestiaorg/rsmt2d/pull/295 I said it leaks this responsibility to celestia-app: "Celestia-app needs to be careful to register all the appropriate trees once (and only once) before they are used (via Compute or Import)." That PR was a long time ago but if I recall correctly, it wasn't easy to bump to the breaking release of rsmt2d.

I haven't explored this deeply but instead of registering tree names in consuming apps, can MarshallJSON serialize the options that define the custom tree (e.g. square size, NMT options) and then UnmarshalJSON creates the tree using those options? This way, the JSON is self-describing and doesn't need an extra registry in consuming apps.

rootulp avatar Sep 11 '25 14:09 rootulp

can MarshallJSON serialize the options that define the custom tree (e.g. square size, NMT options) and then UnmarshalJSON creates the tree using those options?

@rootulp, it can be self-describing, and the original solution was too, by introducing tree field on the marshalled struct. It's just that the rsmt2d allowed users to register their own tree types, but they can be self-describing either way. The alternative is to make the tree optionality permissioned and allow only what's preregistered, just like codecs already on rsmt2d. I don't have a strong opinion on which way to go here, but if we make it permissioned rsmt2d would need to depend on nmt, which may not be a bad thing in the end.

Wondertan avatar Sep 15 '25 12:09 Wondertan

I don't understand how the original solution was self-describing if celestia-app needs a registry of tree names to tree options.

I'm in favor of a solution that doesn't need the registry in celestia-app because it was difficult to initialize the registry correctly. IIRC multiple packages in celestia-app were registering the same tree name multiple times.

rootulp avatar Sep 15 '25 13:09 rootulp

I don't understand how the original solution was self-describing if celestia-app needs a registry of tree names to tree options.

Well, self-describing does not imply a lack of registry. Having this field makes it self-describing, as it clearly describes the type of tree it is. The opposite would be making this fixed to a particular tree type, making it implicit rather than explicit and self-describing.

Wondertan avatar Sep 15 '25 16:09 Wondertan

I'm in favor of a solution that doesn't need the registry in celestia-app

All the solutions I can think of on rsmt2d level require this 🤷🏻. UnmarshalJSON interface doesn't allow one to pass options, or else.

difficult to initialize the registry correctly.

Do you remember what the difficulty was? I trust that it was, but I can't imagine what it could have been.

Wondertan avatar Sep 15 '25 16:09 Wondertan

Well, self-describing does not imply a lack of registry.

The JSON representation is not self-describing if it depends on a registry from tree name to tree options in order to deserialize. An alternative to writing the tree names in the JSON is to include all of the options needed to describe the tree in the JSON so that no registry is needed.

Do you remember what the difficulty was?

Yes, sorry to repeat myself but IIRC multiple packages in celestia-app were registering the same tree name multiple times.

rootulp avatar Sep 15 '25 17:09 rootulp

An alternative to writing the tree names in the JSON is to include all of the options needed to describe the tree in the JSON so that no registry is needed.

We can't declaratively describe logic in JSON that can interpret trees, nor can we extend the JSON parser to support trees. There inevitably needs to be some form of code that parses the tree programmatically outside of JSON parsing logic, and thus, there needs to be a way to point to that logic. Self-describing doesn't mean JSON can "understand" the tree, but the fact that the type of the tree is described in serialized form, similarly to how " describes string type in JSON.

Yes, sorry to repeat myself but IIRC multiple packages in celestia-app were registering the same tree name multiple times.

An idempotent Register function would resolve this issue.

Wondertan avatar Sep 15 '25 19:09 Wondertan

An idempotent Register function would resolve this issue.

Agreed.

rootulp avatar Sep 15 '25 19:09 rootulp