Add codecs for IPFS Cluster

Open hsanjuan opened this issue 6 years ago • 8 comments

It would be useful for Cluster to have its own block codecs so that we can identify ownership of blocks from their CIDs.

cc. @lanzafame

hsanjuan, Feb 06 '19 12:02

Are these IPLD codecs?

Stebalien, Feb 06 '19 20:02

@Stebalien if you mean that CIDs with these codecs will refer to blocks that decode to ipld.Node, yes.

hsanjuan, Feb 06 '19 22:02

As in, will you be registering a new IPLD format?

Stebalien, Feb 06 '19 22:02

@Stebalien not yet. Right now, we're going to use the existing cbor and protobuf formats, and these codecs would just work as aliases for them. But we might replace them in the future.

Also, @lanzafame thought that being able to decide if a block in the ipfs datastore belongs to IPFS cluster (by using our own codecs) might be very useful when doing disaster recovery.
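For illustration, here is a minimal Go sketch of how ownership could be read from a CID under this proposal. It uses only the standard library and the binary CIDv1 layout (version varint, codec varint, multihash). `ClusterCBOR`, `EncodeCIDv1`, `CodecOf` and `IsClusterBlock` are hypothetical names, and the `0x300001` code is made up for the example; it is not a registered multicodec.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	DagPB   uint64 = 0x70 // dag-pb in the multicodec table
	DagCBOR uint64 = 0x71 // dag-cbor in the multicodec table
	// ClusterCBOR is a made-up code for illustration only; it is not registered anywhere.
	ClusterCBOR uint64 = 0x300001
)

// EncodeCIDv1 builds a binary CIDv1: <version varint><codec varint><multihash>.
func EncodeCIDv1(codec uint64, multihash []byte) []byte {
	buf := make([]byte, 2*binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, 1) // CID version 1
	n += binary.PutUvarint(buf[n:], codec)
	return append(buf[:n], multihash...)
}

// CodecOf reads the codec field back out of a binary CIDv1.
func CodecOf(cid []byte) (uint64, error) {
	version, n := binary.Uvarint(cid)
	if n <= 0 || version != 1 {
		return 0, errors.New("not a binary CIDv1")
	}
	codec, m := binary.Uvarint(cid[n:])
	if m <= 0 {
		return 0, errors.New("truncated codec field")
	}
	return codec, nil
}

// IsClusterBlock reports whether the CID carries the hypothetical Cluster codec.
func IsClusterBlock(cid []byte) bool {
	codec, err := CodecOf(cid)
	return err == nil && codec == ClusterCBOR
}

func main() {
	// Placeholder sha2-256 multihash: code 0x12, length 0x20, all-zero digest.
	mh := append([]byte{0x12, 0x20}, make([]byte, 32)...)
	fmt.Println(IsClusterBlock(EncodeCIDv1(ClusterCBOR, mh))) // true
	fmt.Println(IsClusterBlock(EncodeCIDv1(DagCBOR, mh)))     // false
}
```

The check never touches the block itself, which is what would make it attractive for scanning a datastore during recovery.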

Is it a problem to reserve codecs like this? Should we stick to dag-cbor and dag-pb instead? (We could also use custom ones "unofficially", but that would not be very polite.)

hsanjuan, Feb 07 '19 15:02

So, the "codec" part of a CID isn't really supposed to be used as a "type" in this way. It's supposed to tell you how to interpret the target data as a structured IPLD object, not how to interpret the resulting IPLD object. Codecs exist for interoperability with existing systems and to allow us to introduce new serialization formats with new features.

The drawback to adding new codecs like this is that everyone will need to completely understand that new codec to do anything useful with the data (e.g., follow a link). Let's say you're building a webapp and add a custom IPLD format + codec. Now you ask go-ipfs to resolve /ipld/CID_WITH_CUSTOM_CODEC/path/to/data. go-ipfs will need to support your app's custom format, even if you're really just using DagCBOR under the hood.
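A rough Go sketch of that drawback, assuming a decoder registry keyed by codec number. `Registry`, `Decoder`, `decodeDagCBOR` and the `ClusterCBOR` code are hypothetical stand-ins and not the go-ipfs API; the "decoder" only describes the block rather than parsing it.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	DagCBOR uint64 = 0x71 // dag-cbor in the multicodec table
	// ClusterCBOR is a made-up alias code for illustration only.
	ClusterCBOR uint64 = 0x300001
)

// Decoder turns raw block bytes into some description of the node.
type Decoder func(block []byte) (string, error)

// Registry maps a CID's codec field to the decoder for that format.
type Registry map[uint64]Decoder

// ErrUnknownCodec is returned when no decoder is registered for a codec.
var ErrUnknownCodec = errors.New("no decoder registered for codec")

// Decode looks up the decoder by codec; the block bytes alone are not enough.
func (r Registry) Decode(codec uint64, block []byte) (string, error) {
	dec, ok := r[codec]
	if !ok {
		return "", fmt.Errorf("%w: 0x%x", ErrUnknownCodec, codec)
	}
	return dec(block)
}

// decodeDagCBOR is a stand-in for a real DagCBOR decoder.
func decodeDagCBOR(block []byte) (string, error) {
	return fmt.Sprintf("dag-cbor node (%d bytes)", len(block)), nil
}

func main() {
	block := []byte{0xa0} // CBOR encoding of an empty map

	// A stock node only knows the standard codec.
	stock := Registry{DagCBOR: decodeDagCBOR}
	_, err := stock.Decode(ClusterCBOR, block)
	fmt.Println(err)

	// A node that registered the alias decodes the very same bytes.
	cluster := Registry{DagCBOR: decodeDagCBOR, ClusterCBOR: decodeDagCBOR}
	out, _ := cluster.Decode(ClusterCBOR, block)
	fmt.Println(out)
}
```

The bytes are identical in both calls; only the codec number in the CID decides whether the stock node can do anything with them.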

Stebalien, Feb 12 '19 01:02

> The drawback to adding new codecs like this is that everyone will need to completely understand that new codec to do anything useful with the data (e.g., follow a link). Let's say you're building a webapp and add a custom IPLD format + codec. Now you ask go-ipfs to resolve /ipld/CID_WITH_CUSTOM_CODEC/path/to/data. go-ipfs will need to support your app's custom format, even if you're really just using DagCBOR under the hood.

I fully understand your point. For the sake of argument, the node's data is normally opaque anyway, so the only useful things for go-ipfs are Cid() and the link-related operations (ipld.Node). Other apps still need to manually register the cbor or protobuf codec, along with the right decoder, to work with these formats. In that light, it does not seem completely out of place to register the cluster codec bytes + decoder to understand nodes created by cluster for cluster, even if the underlying format happens to be dag-pb.

That said, it is in our interest that go-ipfs understands our blocks, and I don't think my arguments are strong enough to support adding our aliases to the go-ipfs codebase for what is, in the end, an already supported format.

Would reserving codecs for future use (in case we do create our own formats) be ok (I'd rename accordingly)? Or would you rather let this sit until that moment happens?

hsanjuan, Feb 12 '19 12:02

Could you create a special reference type? Something like:

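```go
// ClusterRef pairs a regular CID with an application-level type tag.
// cid.Cid is the CID type from github.com/ipfs/go-cid.
type ClusterRef struct {
	Cid  cid.Cid // link to the block; keeps its dag-pb or dag-cbor codec
	Type string  // what the block means to Cluster
}
```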

Honestly, this is the "right" way to do this from IPLD's standpoint.
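As a hedged, standard-library-only sketch of how such a reference might be used: the variant below stores the CID as a plain string (an assumption made so the example runs without go-cid) and serializes with JSON rather than an IPLD format. The `EncodeRef` and `DecodeRef` helpers, the `"cluster-shard"` type value and the placeholder CID string are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClusterRef mirrors the suggested struct. Cid is a string here, not cid.Cid,
// purely to keep the sketch free of external dependencies.
type ClusterRef struct {
	Cid  string `json:"cid"`
	Type string `json:"type"`
}

// EncodeRef serializes a reference; JSON stands in for an IPLD encoding.
func EncodeRef(ref ClusterRef) ([]byte, error) {
	return json.Marshal(ref)
}

// DecodeRef parses a serialized reference.
func DecodeRef(data []byte) (ClusterRef, error) {
	var ref ClusterRef
	err := json.Unmarshal(data, &ref)
	return ref, err
}

func main() {
	// The CID string is a placeholder, not a real CID.
	ref := ClusterRef{Cid: "example-dag-cbor-cid", Type: "cluster-shard"}

	data, err := EncodeRef(ref)
	if err != nil {
		panic(err)
	}

	back, err := DecodeRef(data)
	if err != nil {
		panic(err)
	}
	// The type tag travels next to the link; the CID's codec is untouched.
	fmt.Println(back.Type, back.Cid)
}
```

With this shape the ownership information lives in the referring object, so any node that understands dag-pb or dag-cbor can still follow the link.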

Stebalien, Apr 19 '19 00:04

@hsanjuan I think 5 years later we just do either dag-pb or dag-cbor, and don't need these?

lidel, Sep 06 '24 12:09