
feat: optional encoder arg for standardLeafHash

Open · joe-p opened this issue · 6 comments

This PR adds the ability to pass a custom encoder to standardLeafHash. This is primarily useful for developers that want to leverage the standard-v1 merkle tree in contexts outside of the EVM where the encoding may differ.

joe-p · Apr 25 '24

Hello @joe-p

If you want to use a custom encoding, you should probably use the SimpleMerkleTree, which lets you pass leaves that are already hashed. This is not released yet, but it's available on the master branch. Note that SimpleMerkleTree also lets you customize the inner node-hashing function if you need that.
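To illustrate the idea behind SimpleMerkleTree (pre-hash leaves with any encoding you like, then build the tree from the hashes), here is a minimal self-contained sketch. It does not use the library's API: sha256 from node:crypto stands in for keccak256, JSON is a placeholder encoding, and buildRoot is a hypothetical minimal sorted-pair scheme, not the library's actual implementation.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer =>
  createHash("sha256").update(data).digest();

// Any encoding can be applied before hashing; JSON is just a placeholder.
function hashLeaf(value: unknown): Buffer {
  return sha256(sha256(Buffer.from(JSON.stringify(value))));
}

// Pair up hashes (sorted within each pair) until a single root remains.
function buildRoot(leafHashes: Buffer[]): Buffer {
  let level = leafHashes;
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      if (i + 1 === level.length) {
        next.push(level[i]); // odd node carries over to the next level
      } else {
        const [a, b] = [level[i], level[i + 1]].sort(Buffer.compare);
        next.push(sha256(Buffer.concat([a, b])));
      }
    }
    level = next;
  }
  return level[0];
}

const root = buildRoot([1, 2, 3, 4].map(hashLeaf));
console.log(root.toString("hex"));
```

The point is that the tree-building code never sees raw values, only hashes, so the caller fully controls the encoding.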

Let us know if that meets your needs (or not)

Amxx · Apr 25 '24

Yes, that is certainly a viable solution, but I do think it would be nice to have a cross-chain standard for merkle trees (i.e. standard-v1) and a library that can be used across the board. Of course, there could be forks of this repo that change the encoding at a deeper level, or implementations on top of SimpleMerkleTree, but having a single package from a trusted source seems nice to have.

There is the problem that information about the encoding is lost in the dump, but perhaps we could also formalize a way to declare the encoder, using a field such as encodingStandard: 'solidity-abi'.

I understand if this is deemed out of scope and a fork/higher-level implementation could be made for my use case (Algorand ABI), but I thought it would be nice to see if we can come up with a solution upstream.

joe-p · Apr 25 '24

There is the problem that information about the encoding is lost in dump

Yes, that is definitely a downside.

Amxx · Apr 26 '24

Currently, the StandardMerkleTree takes a "leafEncoding" that is a string[], and assumes the leaves are built by doing

keccak256(bytes.concat(keccak256(abi.encode(...values))))

or

keccak256(keccak256(defaultAbiCoder.encode(types, values)))
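The PR's shape can be sketched as follows: standardLeafHash keeps the double-hash structure above but accepts an optional encoder matching (types: string[], values: any[]) => string. This is a hypothetical illustration, not the library's code: sha256 stands in for keccak256, and jsonEncoder is a stand-in for ethers' defaultAbiCoder, which is what the real implementation uses.

```typescript
import { createHash } from "node:crypto";

type Encoder = (types: string[], values: any[]) => string; // returns hex

const sha256Hex = (data: Buffer): string =>
  createHash("sha256").update(data).digest("hex");

// Placeholder default encoder; the real library uses defaultAbiCoder.encode.
const jsonEncoder: Encoder = (_types, values) =>
  Buffer.from(JSON.stringify(values)).toString("hex");

function standardLeafHash(
  types: string[],
  values: any[],
  encoder: Encoder = jsonEncoder, // the proposed optional argument
): string {
  // Double hash, mirroring keccak256(keccak256(encode(types, values)))
  const inner = sha256Hex(Buffer.from(encoder(types, values), "hex"));
  return sha256Hex(Buffer.from(inner, "hex"));
}

// A non-EVM ecosystem could pass its own encoder:
const customEncoder: Encoder = (_types, values) =>
  Buffer.from(values.join("|")).toString("hex");

const evmStyle = standardLeafHash(["address", "uint256"], ["0xabc", 42]);
const customStyle = standardLeafHash(["address", "uint256"], ["0xabc", 42], customEncoder);
console.log(evmStyle, customStyle);
```

The hashing pipeline is untouched; only the bytes fed into it change with the encoder.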

If we wanted to go in your direction, how would we do it? I guess we would keep the string[] part and add an option to replace the keccak256(keccak256(defaultAbiCoder.encode(types, values))) part with something different? There is still the issue that this "something different" would be lost when you dump ... which makes the upside of this change a bit unclear to me ... unless, instead of opening it up to any function, we implement a limited set of options, which we can index in the dump.

I'd like to understand how much demand/need there is for this. I'd also like to know how many variants are realistically going to be used.

TLDR: The SimpleMerkleTree was our answer to "we can't possibly know all the hashing schemes that are used out there, so we should leave that completely open for anyone to use anything". Now what you are saying is that there are some "standard ways of doing things" that should be included in StandardMerkleTree. I'm a bit concerned about our ability to list and maintain all these "standard ways" for which support may be requested.

Amxx · Apr 26 '24

we would add an option to change the keccak256(keccak256(defaultAbiCoder.encode(types, values))) part for something different?

I'm not proposing we change the hashing, just the value encoding prior to hashing. The PR just adds a way to change defaultAbiCoder to any object that implements encoder: (types: string[], values: any[]) => string;

There is still the issue that this "something different" would be lost when you dump ... which makes the upside of this change a bit unclear to me

I think a solution to this is recording the encoder used in the dump output. By default it could be solidity-abi, but other values can be set when changing the encoder. This would be provided by the developer using this library, not internally, so it would not require any extra maintenance.

Now what you are saying is that there are some "standard ways of doing things that should be included in StandardMerkleTree"

I'm saying the opposite. Right now there is no standard way of doing merkle trees across multiple ecosystems. This library seems to be the standard way to do merkle trees for the EVM, and I am thinking it would be beneficial for other ecosystems to adopt this standard and have one library to do it.

For some context, I am an engineer at the Algorand Foundation and planning to implement a merkle tree library in our tooling for smart contracts. As I mentioned, forking would be a viable option, but I figured I'd try to see if we can find a solution to make this upstream library more flexible so it can be used by other ecosystems.

joe-p · Apr 26 '24

I think a solution to this is providing the encoder used in the dump output. By default it could be solidity-abi but other values can be set when changing the encoder. This would be provided by the developer using this library, not internally so it would not require any extra maintenance.

This is similar to how custom hashing is supported. For encoding it would look something like this:

export interface CustomMerkleTreeData extends MerkleTreeData<HexString> {
  format: 'custom-v1';
  hash?: 'custom';
  encoding?: 'custom';
}

If encoding is set to 'custom' in a dump, loading it can only be done by providing a custom encoding function.
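As a sketch of that loading rule: a loader can inspect the dump's encoding field and refuse to proceed without a caller-supplied encoder. The interface below is a simplified standalone version of the snippet above (it omits the library's MerkleTreeData base type and adds a hypothetical tree field), and loadTree is illustrative, not the library's API.

```typescript
type Encoder = (types: string[], values: any[]) => string;

// Simplified stand-in for the dump shape discussed above.
interface CustomMerkleTreeData {
  format: "custom-v1";
  hash?: "custom";
  encoding?: "custom";
  tree: string[]; // hypothetical: the serialized node hashes
}

function loadTree(dump: CustomMerkleTreeData, encoder?: Encoder): string[] {
  // A dump marked with a custom encoding cannot be interpreted
  // without the matching encoder, so fail fast if it is missing.
  if (dump.encoding === "custom" && encoder === undefined) {
    throw new Error("dump uses a custom encoding: an encoder must be provided");
  }
  // With the encoder resolved, the tree can be rebuilt/verified as usual.
  return dump.tree;
}

const dump: CustomMerkleTreeData = {
  format: "custom-v1",
  encoding: "custom",
  tree: ["0xabc"],
};
```

Dumps without the encoding marker would keep loading exactly as today, so the check is backwards compatible.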

frangio · Apr 26 '24