Inconsistent CID Calculation with Example: Addressing a file by CID with UnixFS

Open YangHYue opened this issue 8 months ago • 2 comments

Description: I encountered an issue while using the example "Addressing a file by CID with UnixFS". The CID of a specific file calculated in JavaScript does not match the result from the Go implementation. Additionally, I encountered an error while attempting to export the file.

Expected Behavior: The CID generated in JavaScript should match the CID generated in Go.

  • Go: bafybeidcrqr5y44vp6tw6haladhxs3wc6kdi3ttvlfg37o7slddmayrvqm
  • JS: bafybeifg4jtuuqlhjskm6oqq3usq23opono4ggjxkckzteazucnfu76thq

Actual Behavior: The CID generated in JavaScript is different from the expected CID. Moreover, I encountered the following error when trying to export the file:

Uncaught (in promise) NotUnixFSError: invalid wire type 4 at offset 26

Code

      const file = event.target.files[0]
      const helia = await createHelia()
      const fs = unixfs(helia)
      const blockstore = helia.blockstore
      const cid = await fs.addFile({
        content: file.stream(),
        path: file.name,
      })
      console.log(cid.toString())
      const block = await blockstore.get(cid)
      const node = dagPb.decode(block)
      console.log(node)
      const entry = await exporter(cid, blockstore)
      console.info(entry)

File: Mega

YangHYue avatar Mar 13 '25 09:03 YangHYue

A couple of things to note regarding CID equivalency:

  • Filenames in UnixFS are stored by directory nodes.
  • When you add a file with Helia and pass a path, it will create a directory with a link to the file.
  • The equivalent in Kubo is to use the --wrap-with-directory option: ipfs add myfile --wrap-with-directory

Having said all of that, there are still some differences between Kubo and Helia when it comes to merkleizing data with UnixFS; see https://discuss.ipfs.tech/t/should-we-profile-cids/18507/37

2color avatar Mar 13 '25 11:03 2color

I recently investigated the topic of why CIDs for the same data are different between Kubo and Helia.

TL;DR: The two implementations use different DAG widths when encoding the data as UnixFS: Kubo uses 174, while Helia uses 1024. We're working to add the ability to adjust this in Kubo.

We are also formalising these configuration options into a spec so that it's easier to reproduce the same CID across implementations.
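
To illustrate why the width matters, here is a rough sketch that counts the parent nodes at each level of a balanced DAG for a given number of leaf chunks. The function name `balancedDagLevels` and the 500-chunk file are hypothetical, and the model is simplified: it only looks at the maximum number of links per node and ignores other settings (chunk size, leaf encoding, CID version) that can also differ between implementations.

```javascript
// Simplified model (an assumption, not Kubo's or Helia's actual code):
// leaves are grouped under parents holding at most `maxLinks` links each,
// and the grouping repeats until a single root remains.
function balancedDagLevels (leafCount, maxLinks) {
  const levels = []
  let nodes = leafCount
  while (nodes > 1) {
    nodes = Math.ceil(nodes / maxLinks)
    levels.push(nodes) // number of parent nodes at this level
  }
  return levels
}

const leaves = 500 // hypothetical file that splits into 500 chunks

console.log(balancedDagLevels(leaves, 174)) // [ 3, 1 ]
console.log(balancedDagLevels(leaves, 1024)) // [ 1 ]
```

With a width of 174 the root links to three intermediate nodes, while with a width of 1024 it links to all 500 leaves directly. The root blocks contain different links, so they hash to different CIDs even though the file bytes are identical.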

2color avatar Apr 03 '25 14:04 2color

Uncaught (in promise) NotUnixFSError: invalid wire type 4 at offset 26

I believe the content: value you are passing is not valid (file.stream()).

  1. https://github.com/ipfs/helia/blob/9902538348b6602e9a92cdc6ed38b61d57f7a41c/packages/unixfs/src/unixfs.ts#L39
  2. https://github.com/ipfs/helia/blob/9902538348b6602e9a92cdc6ed38b61d57f7a41c/packages/unixfs/src/index.ts#L64-L69
  3. https://github.com/ipfs/js-ipfs-unixfs/blob/7b178e98a40661059780dc16b6c630a261e1c38e/packages/ipfs-unixfs-importer/src/index.ts#L84-L85

You may want to use https://github.com/achingbrain/it/tree/main/packages/browser-readablestream-to-it to convert the stream, pass new Uint8Array(await file.arrayBuffer()), or use another Blob method that yields an iterable or a Uint8Array.
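
A minimal sketch of both options, assuming a runtime that provides the standard ReadableStream and Blob/File APIs. The helper names `streamToAsyncIterable` and `fileToBytes` are hypothetical, and the first helper is a hand-written stand-in for what a stream-to-iterable adapter does, not the code of the package linked above.

```javascript
// Adapt a WHATWG ReadableStream (such as the one returned by file.stream())
// into an async iterable of Uint8Array chunks.
async function * streamToAsyncIterable (stream) {
  const reader = stream.getReader()
  try {
    while (true) {
      const { done, value } = await reader.read()
      if (done) return
      yield value
    }
  } finally {
    // Release the lock whether the loop finished or the consumer stopped early
    reader.releaseLock()
  }
}

// Alternative for files that fit in memory: read everything into one Uint8Array.
async function fileToBytes (file) {
  return new Uint8Array(await file.arrayBuffer())
}
```

In the snippet from the issue, this would mean passing content: streamToAsyncIterable(file.stream()) instead of content: file.stream(), or passing content: await fileToBytes(file) for small files.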

SgtPooki avatar May 29 '25 12:05 SgtPooki

Closing as a more configurable version of Kubo has shipped.

achingbrain avatar Oct 09 '25 06:10 achingbrain