ipfs-car icon indicating copy to clipboard operation
ipfs-car copied to clipboard

fix:! do not wrap with directory by default

Open vasco-santos opened this issue 4 years ago • 13 comments

BREAKING CHANGE: do not wrap array inputs without path may result in different CIDs generated

On pack, unixfs importer will yield the added data + empty dir in the end. Which means that the CarWriter will receive all the blocks and the car file is well written, but the root CID will be the empty Dir.

So, what is the issue? When we pack, if we provide an ImportCandidate as new Uint8Array([1, 2, 3]) there will be no path to create DAG Links, which means it will write both the file block and the wrapping directory block into the writer, but as it cannot infer any relationship, it will be an empty directory. However, If I provide an importCandidate as { path: 'a.txt', content: new Uint8array([1, 2, 3] } , then we will actually get a non empty directory.

Given we provide a path (aka name) in web3.storage client, this is not a problem in web3.storage, nor in Node.js where we get the fs paths. I wonder if we should just make ipfs-car pack to not accept simple uint8array as input, despite they being perfectly valid if we do not use wrapWithDirectory . unixfs-import should probably handle this, but I don’t know if there is a reason behind the decision

vasco-santos avatar Aug 30 '21 15:08 vasco-santos

So we have at least 2 related issues here. What to about the automatic default wrap with directory, and what to do if the user explicitily requests a wrapping directory, when providing an unnamed input item... I think the range of options available to us are:

  • workaround: Invent a link name for the user. (kinda gross, but could be the CID of the item, or a link index number.)
  • workaround: skip the automagic wrapWithDirectory (still have to deal with where user asks for it explicitly)
  • avoid: Make input names a required parameter always.
  • fail: Throw an Error

Please add if there are other options

olizilla avatar Aug 31 '21 09:08 olizilla

The option implemented in this PR is the second: workaround: skip the automagic wrapWithDirectory (still have to deal with where user asks for it explicitly). I am not 100% sold on this, given the user can set wrapWithDirectory: true, but that does not gonna happen. But, it seems the better compromise given the known advantages of having wrapWithDirectory: true as the default. Throwing might be more problematic than making it work magically.

avoid: Make input names a required parameter always.
fail: Throw an Error

seem to be the same option here?

workaround: Invent a link name for the user. (kinda gross, but could be the CID of the item, or a link index number.)

We need to consider this will need to be implemented at the importer level if we name it CID, which means we need to land this on js-ipfs. Index number would work, but is probably odd for the user.

Overall, I prefer the skip with the automagic wrapWithDirectory silently, given most users will likely not care about using this option. More experienced users who will use it, will get informed of how it works.

vasco-santos avatar Aug 31 '21 09:08 vasco-santos

i think we set wrapWithDirectory: true by default to ensure that we always created a CAR with a single root. I'm not sure how important that is, as the ipfs-car code will deal with multiple roots.

what if we just flip the default back to wrapWithDirectory: false? We'd be more likely to create a CAR with multiple roots, but perhaps that is an issue for the calling code to deal with. Should the web3.storage client enforce "only create a car with a single root", as that is the assumption when handling user uploads.

My preference has always been "dont wrap unless we detect that we need to (to avoid multiple roots)". I think the only case we need to wrap is where we get more than 1 input and they don't share a base path.

olizilla avatar Aug 31 '21 09:08 olizilla

avoid: Make input names a required parameter always. fail: Throw an Error

these are almost the same, but if we require input names always, then the error is "you must provide an input name". Otherwise the error is "wrapWithDefault was set to true by default but you didnt provide an input name"

olizilla avatar Aug 31 '21 09:08 olizilla

i think we set wrapWithDirectory: true by default to ensure that we always created a CAR with a single root. I'm not sure how important that is, as the ipfs-car code will deal with multiple roots.

Yes, I think we are making ipfs-car decisions based on needs for web3.storage, and we should probably just handle this on the "user" .

So, yes I am in favour of:

what if we just flip the default back to wrapWithDirectory: false? We'd be more likely to create a CAR with multiple roots, but perhaps that is an issue for the calling code to deal with. Should the web3.storage client enforce "only create a car with a single root", as that is the assumption when handling user uploads.

thoughts @alanshaw ?

vasco-santos avatar Aug 31 '21 09:08 vasco-santos

I implemented the alternative we discussed so far at https://github.com/web3-storage/ipfs-car/pull/89

vasco-santos avatar Sep 02 '21 09:09 vasco-santos

AFAIK go-ipfs requires a name with our without wrap-with-directory.

alanshaw avatar Sep 02 '21 13:09 alanshaw

i think we set wrapWithDirectory: true by default to ensure that we always created a CAR with a single root. I'm not sure how important that is, as the ipfs-car code will deal with multiple roots.

If the intention of this library is to pack/unpack CAR files compatible with IPFS then is should probably not pack a CAR with multiple roots as it's currently not possible to then use it with IPFS.

My preference would be to keep the auto-wrap for avoiding multiple roots, but when it is enabled (auto or explicit), validate all the inputs have a path.

alanshaw avatar Sep 02 '21 13:09 alanshaw

@alanshaw can you clarify, is your prefence to always default wrapWithDirectory: true, or only for cases where it is necessary to avoid multiple roots?

if we're keen to enforce single root CAR files, then we should throw in the case where the user sets wrapWithDirectory: false but also provides inputs that would create multiple roots.

olizilla avatar Sep 03 '21 13:09 olizilla

the plan, from conversation with @vasco-santos and @alanshaw

  • test out what happens here if you try to add unnamed data and also specify wrapWithDirectory: true to ipfs-core and the also the go-ipfs http api.
    • if they fail in a similar way as ipfs-car does currently with an empty root dir, then raise an issue there as it's not a great dev experience. If it handles it in a different way, we should review and potentially adopt whatever that behaviour is.
  • fix it in ipfs-car to change the default to wrapWithDirectory: false
    • have it automatically set wrapWithDirectory: true where we need to to ensure the CAR has a single root.
    • fail with a useful error if passed both wrapWithDirectory: true and unamed data.

this will be a breaking change for ipfs-car so we should roll in a fix to align the sharding defaults with ipfs as per #92

olizilla avatar Sep 28 '21 09:09 olizilla

I confirmed that this is also a problem with current JS implementation and raised an issue to figure out a better solution across the board. Meanwhile, this should be ready to a final review so that we can get this and also #92 in as a breaking change

vasco-santos avatar Oct 18 '21 11:10 vasco-santos

@vasco-santos can you add a link to the upstream issue, just for completness.

olizilla avatar Oct 18 '21 12:10 olizilla

https://github.com/ipfs/js-ipfs-unixfs/issues/176 sorry, it was just linked from there to here

vasco-santos avatar Oct 18 '21 12:10 vasco-santos

@vasco-santos i think this PR got outdated with the v1 release. Am gonna close it out.

olizilla avatar Dec 06 '23 10:12 olizilla