go-hamt-ipld icon indicating copy to clipboard operation
go-hamt-ipld copied to clipboard

Represent bitwidth explicitly in root node

Open anorth opened this issue 5 years ago • 2 comments

As of https://github.com/filecoin-project/go-amt-ipld/pull/38 the AMT stores the bitwidth in the root node. Knowledge of the bitwidth is necessary to correctly interpret the blocks, so storing it makes the structure more self-describing.

I propose that we do the same thing for the HAMT. This will require a Root node structure, which the AMT already had but needs to be introduced here.

Thoughts @Stebalien @ZenGround0?

anorth avatar Dec 03 '20 02:12 anorth

We'd also need to specify the hash function (e.g., with a multicodec) and maybe a hash digest length? That shouldn't be terrible, but it'll take up a few extra bytes (likely 4?).

Stebalien avatar Dec 03 '20 03:12 Stebalien

Probably don't need the digest length since it's implied for most (all?) of the multihashes in the multicodec table. bitwidth and hash function multihash code are what we're storing in the IPLD hashmap spec. There's some discussion about whether this root should be a separate block but IMO it doesn't need to be since the root of a HAMT is a special thing anyway and can't be made use of as anything but the root, so you could just pack config into it.

We currently form a root a little like this (a single block with a nested structure):

type HAMTRoot struct {
  hashAlg Int
  bucketSize Int
  node HAMTNode
} representation tuple

type HAMTNode struct {
  map Bytes
  data [ Element ]
} representation tuple

# ...

so the "node" is nested within the root. An alternative form would be to nest the configuration inside "node", such that the root is the same format as every other node but it has this one additional field:

type HAMTRoot struct {
  map Bytes
  data [ Element ]
  config HAMTConfig
} representation tuple

type HAMTNode struct {
  map Bytes
  data [ Element ]
} representation tuple

type HAMTConfig struct {
  hashAlg Int
  bucketSize Int
} representation tuple

# ...

Or shaving off the additional List CBOR byte and sticking it all in the root:

type HAMTRoot struct {
  map Bytes
  data [ Element ]
  hashAlg Int
  bucketSize Int
} representation tuple

type HAMTNode struct {
  map Bytes
  data [ Element ]
} representation tuple

# ...

rvagg avatar Dec 03 '20 05:12 rvagg