IPIP 0499: CID Profiles
Currently, CIDs can be generated with a variety of settings and optimizations for chunking, DAG width, and more. This means the same file can yield multiple, different CIDs depending on which tools and settings are used, and it is not possible to reliably reproduce or verify the CID.
This proposal introduces profiles for IPFS CIDs. Profiles explicitly define CID version, hash algorithm, chunk size, DAG width, layout, and other parameters. They can be used to verify data across implementations, provide recommended settings depending on retrieval performance goals, and more.
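To make the motivation concrete, here is a minimal sketch of why pinning these parameters matters. It is a toy hash tree, not UnixFS and not a real CID computation: the `Profile` class, its field names, and the three `toy-*` profiles are all hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical parameter set; names and fields are illustrative only."""
    name: str
    chunk_size: int  # bytes per fixed-size leaf chunk
    dag_width: int   # maximum children per intermediate node


def toy_root(data: bytes, profile: Profile) -> str:
    """Return the hex root digest of a toy balanced hash tree.

    This only mimics how chunk size and DAG width shape the tree;
    it does not produce UnixFS nodes or real CIDs.
    """
    if profile.chunk_size < 1 or profile.dag_width < 2:
        raise ValueError("chunk_size must be >= 1 and dag_width >= 2")
    # Split the input into fixed-size chunks and hash each leaf.
    chunks = [data[i:i + profile.chunk_size]
              for i in range(0, len(data), profile.chunk_size)] or [b""]
    level = [hashlib.sha256(c).digest() for c in chunks]
    # Group up to dag_width children under each parent until one root remains.
    while len(level) > 1:
        level = [hashlib.sha256(b"".join(level[i:i + profile.dag_width])).digest()
                 for i in range(0, len(level), profile.dag_width)]
    return level[0].hex()


data = bytes(range(256)) * 64  # 16 KiB of sample input

small = Profile("toy-small-chunks", chunk_size=1024, dag_width=4)
large = Profile("toy-large-chunks", chunk_size=4096, dag_width=4)
wide = Profile("toy-wide", chunk_size=1024, dag_width=16)

roots = {p.name: toy_root(data, p) for p in (small, large, wide)}
for name, root in roots.items():
    print(name, root[:16])
```

The same bytes produce three different roots under the three parameter sets, while re-running with one fixed profile always yields the same root. That is the property a named profile is meant to guarantee across implementations.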
Thank you for kicking this off and filling in the initial state.
I've incorporated specific "dag width" settings for File, Directory and HAMTDirectory nodes,
and updated the table to reflect state from https://github.com/ipfs/kubo/pull/10774
and profiles that exist in Kubo master branch: legacy-cid-v0, test-cid-v1 and test-cid-v1-wide:
- https://github.com/ipfs/kubo/blob/master/config/profile.go#L268-L307
Next:
- [ ] agree what "cid-2025" profile should look like
  - this will be new default in "Kubo v1.0"
  - we have test-cid-v1 and test-cid-v1-wide in Kubo as potential candidates
- [ ] switch to PR from local branch (so we have build preview)
- [ ] figure out how to render the information (currently the table is not supported by https://github.com/ipfs/spec-generator)
@jaller94 / @mishmosh, I've noticed a discrepancy in CID generation using the new test-cid-v1-wide profile and the CIDs generated with Singularity. I made a note of this in the following post on discuss.ipfs.tech along with this post outlining how Singularity has set dag-width.
This could be an edge case, as I'd expect the same CID to be generated. If necessary, would either of you be able to point me to a repo within the ipfs umbrella to post this as an issue?
I pushed a bunch of edits to move the conversation forward. This is sorely needed in the ecosystem, and the hope is that by building consensus we can improve developer experience when working with UnixFS and the overall health of the UnixFS ecosystem.
Feedback is always appreciated.
Hey, I'd love to be able to reference this, even if it's in "draft" form. Could we just merge it and continue to iterate on top of it to get it right?
🚀 Build Preview on IPFS ready
- 🔎 Commit: 70514b9f4f16914c8d0b4a99d80883f902a3fe63
- 🔏 CID: bafybeieklx2odund2xybiw2c34edusicnnigox6ho6svlpb2y6plrprauu
- 📦 Preview:
I made a few changes/fixes, aiming to land this early next week.
- Added links to UnixFS spec (now that it exists)
- Specified calendar versioning for profile names (line 64), per @b5 suggestion
- @lidel I gave the 3 kubo profiles names that matched the naming scheme. This would mean minor updates to kubo, but is probably better for future-proofing. Acceptable? Also happy to discuss live.
- Changed the "current defaults" section into a series of legacy profile names, that implementations MAY support. This allows those profile sets to be referenced/used across implementations.
- We were using fanout and bitwidth interchangeably. I changed them all to fanout, in keeping with the UnixFS terminology. If we prefer bitwidth, I can PR that to UnixFS spec and then also here.
- Streamlined lots of duplicate language from Summary and Motivation sections
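On the fanout vs. bitwidth point, a small sketch of how the two terms usually relate. This assumes the common HAMT convention in which the bit width is the number of hash bits consumed per level and the fanout is the resulting number of slots per node, so the two name the same setting but differ numerically. The helper names are hypothetical and not taken from any implementation.

```python
def fanout_from_bitwidth(bitwidth: int) -> int:
    """Slots per node when each level consumes `bitwidth` hash bits (assumed convention)."""
    if bitwidth < 1:
        raise ValueError("bitwidth must be >= 1")
    return 1 << bitwidth


def bitwidth_from_fanout(fanout: int) -> int:
    """Inverse mapping; only defined when fanout is a power of two."""
    if fanout < 2 or fanout & (fanout - 1):
        raise ValueError("fanout must be a power of two >= 2")
    return fanout.bit_length() - 1


print(fanout_from_bitwidth(8))    # 256
print(bitwidth_from_fanout(256))  # 8
```

Under that assumption, a profile that states a fanout of 256 and one that states a bit width of 8 describe the same directory sharding, which is why settling on one term avoids ambiguity.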
Open questions:
- How to handle Test fixtures section (line 120)? (Not-blocking, IMO)
- Thread on empty directory filtering (blocking)
- Thread on threshold size (blocking)
Just synced with @lidel. He wants to ship this with test fixtures in place (tracked in kubo/issues/11071). In the meantime, we don't anticipate changes to the profiles themselves, so you can reference this PR.
Great work, glad to see this!
Couple notes/questions:
- The profiles (legacy + new) don't say if the chunks are of a fixed size, or which algorithm they use.
- Small typo under "Compatibility": "support the the set of" (double the)
- Would it also be interesting to note if an implementation respects symlinks and if so, how the different kinds of symlinks are translated?