nmt api: `Push` for data/shares with decoupled namespace

Currently, tree.Push requires passing namespace.PrefixedData, which is a byte slice prepending namespace.ID. However, there is a use case where we don't need to push data that is not prepended with a namespace.ID, and it can be passed as a separate field. This is the case where we push parity data generated via rsmt2d. Currently, we prepend parity namespace to the parity shares, which is not required. This is one of the issues causing us to store an additional 8 bytes of parity namespace per parity share via NMTWrapper, as outlined in https://github.com/celestiaorg/celestia-node/issues/183. I propose adding an additional method like PushWith to the nmt.Tree to allow passing decoupled namespace from the share data to allow mentioned disk usage optimization.

May 04 '22 15:05 Wondertan

This is interesting. In the past the Push method used to have these already separated: https://github.com/celestiaorg/nmt/blob/6e8a6a5ec3fcae4015ca5dd1923ce033a0458679/nmt.go#L254

We can revert to that! Back then there was a lot of discussions around serialzation (of course there always is 🙈 ) and I think (not sure anymore) the type aliases and the API was original meant to be able to take in a type that defines its own serialzation. IMO, the tree should just take bytes anyways and serialzation is sth that should happen before and not inside the tree or anything like that. Long story short: if we can save some storage and bandwidth with this simple change to the NMT, then let's do it! I propose to only have one push method instead an additional pushWith. Unless there is a good reason to keep both around.

May 05 '22 12:05 liamsi

After discussing with @Wondertan, we realized that https://github.com/celestiaorg/celestia-node/issues/183 can be tackled without modifying the NMT library at all (and instead by modifying this visitor). So we could put this on the back burner for now.

I still think the suggested change makes sense. It would also allow us to delete the PrefixedData type entirely. But it might make sense to do this as part of a somewhat larger refactoring (can/should be split into several PRs) where we also tackle #15. In it's current form and because of the usage of this interface we still need to do a copy which merges the nid and the data. Hlib convinced me that it is better to hide this copying inside of the NMT though. Currently, it happens in the wrapper which is bad usability. Especially as Hlib put it:

The reason is; right now, we can avoid storing unnecessary data, but it's hacky. Meaning that on Push we need to prepend, and on Visit remove that we prepend so we don't store it

May 05 '22 18:05 liamsi