ipfs-car
Root CID does not match js/go-ipfs when packing a dir with 10k sub dirs
Reported by @obo20
Single-file uploads went fine, and a folder of 100 files went fine, but when I tried a folder of 10k files (a super common use case), the CID I got from adding to IPFS (go-ipfs/js-ipfs) differed from the one in the ipfs-car output.
The CID I get from go-ipfs and js-ipfs is: bafybeihq6az265aar27wuhzltxrgge5ywwllcgux7wui4z3ddq4i2cskky
The CID I get from ipfs-car is: bafybeigww4x6shkc7vbp7c5slmnw3vo6ioj4gnar6ign5eqbkfpijcavk4
More context... there is a divergence in implementation of large directory sharding between go and js
Yes, they diverge: currently go-ipfs will either do no sharding (by default) or will shard even small folders (sharding enabled), while js-ipfs has a cutoff. For v0.11.0, go-ipfs should have sharding enabled by default and is planning on sharding directories that are close to 1MiB in size (there's a deterministic function for this). js-ipfs may choose to do the same as well, but it's not strictly necessary. – @aschmahmann
The auto-shard PR for js-ipfs is here: https://github.com/ipfs/js-ipfs-unixfs/pull/171 The size limit is currently 256KiB but it'll align with go-ipfs before it's merged and will be overrideable by the user – @achingbrain
ipfs-car will adopt the changes in https://github.com/ipfs/js-ipfs-unixfs/pull/171 once they land so that its CID derivation stays in sync with js-ipfs.
For more context:
The code I'm using looks like:
const results = await packToFs({
  input: contentFilePath,
  output: `${destinationFolder}/data.car`,
  blockstore: new FsBlockStore(),
  wrapWithDirectory: false,
  maxChildrenPerNode: 1024,
  maxChunkSize: 262144
});
The npm versions look like:
"ipfs-car": "^0.5.8",
"ipfs-core": "^0.10.6",
As you've found, the defaults are different. If sharding is not enabled in js-ipfs (the default), it passes Infinity for shardSplitThreshold. ipfs-car passes nothing, so ipfs-unixfs-importer falls back to its default of 1000. So yes, you just need to expose a shardSplitThreshold option in ipfs-car and pass it on to the importer; then you can manipulate the args to get the same CID as js-ipfs and go-ipfs. Though of course with the knowledge that the default behaviour is going to change soon(ish) to auto-shard based on final block size rather than number of entries in a directory. – achingbrain
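To make the mismatch concrete, here is a minimal sketch of the threshold logic described above. It is a simplified model, not the actual code in ipfs-car or ipfs-unixfs-importer: the helper names buildImporterOptions and wouldShard are hypothetical, and the comparison (entry count greater than or equal to the threshold) is an assumption. Only the option name shardSplitThreshold and the two defaults (1000 and Infinity) come from the discussion.

```javascript
// Simplified model only. These helpers are hypothetical and are not part of
// ipfs-car or ipfs-unixfs-importer.

// Defaults as described in the thread.
const IMPORTER_DEFAULT_THRESHOLD = 1000 // importer fallback when nothing is passed
const JS_IPFS_DEFAULT_THRESHOLD = Infinity // js-ipfs with sharding disabled

// Build the options handed to the importer, forwarding shardSplitThreshold
// only when the caller supplied one (hypothetical pass-through).
function buildImporterOptions (userOptions = {}) {
  const options = {
    wrapWithDirectory: userOptions.wrapWithDirectory,
    maxChildrenPerNode: userOptions.maxChildrenPerNode,
    maxChunkSize: userOptions.maxChunkSize
  }
  if (userOptions.shardSplitThreshold !== undefined) {
    options.shardSplitThreshold = userOptions.shardSplitThreshold
  }
  return options
}

// Assumption: a directory becomes sharded once its number of direct entries
// reaches the threshold; otherwise it stays a flat directory node.
function wouldShard (entryCount, importerOptions) {
  const threshold = importerOptions.shardSplitThreshold !== undefined
    ? importerOptions.shardSplitThreshold
    : IMPORTER_DEFAULT_THRESHOLD
  return entryCount >= threshold
}

// No threshold passed: 100 entries stay flat, 10k entries get sharded.
const withoutOption = buildImporterOptions({ maxChildrenPerNode: 1024 })
console.log(wouldShard(100, withoutOption), wouldShard(10000, withoutOption))

// Threshold forwarded as Infinity: 10k entries stay flat, as in js-ipfs.
const withOption = buildImporterOptions({ shardSplitThreshold: JS_IPFS_DEFAULT_THRESHOLD })
console.log(wouldShard(10000, withOption))
```

Under this model, a 100-entry folder is built as a flat directory on both sides, which is consistent with the CIDs matching, while a 10k-entry folder is sharded only on the ipfs-car side until the threshold is forwarded.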