
IPFS in Protocol Stack

Open WilliamTheHippo opened this issue 2 years ago • 18 comments

The current list of component layers in §5 lists all of the necessary components to implement DWN functionality, and the lowest level reads simply "IPFS". While I have no doubt that IPFS is the most appropriate way to host a DWN, they can also be hosted on centralized servers, torrent files, etc.

Maybe change "IPFS" to "Data Storage" and add a supplemental note below that IPFS is the recommended system for data storage? Or maybe even a section/appendix of recommended systems to implement each component in the stack, which would be more extensible. (We could recommend encryption standards, schemas, things for other components of the stack as well.)

WilliamTheHippo avatar Jul 06 '22 14:07 WilliamTheHippo

It should read IPLD Blockstore, as that's the only actual reliance within the spec.

csuwildcat avatar Jul 06 '22 14:07 csuwildcat

+1 IPLD

wyc avatar Jul 06 '22 14:07 wyc

I agree the storage layer should be an abstraction (I think). It seems that requiring a decentralized approach for the data store becomes pointless when it sits behind this service, or am I missing some other extensibility of the data store outside of the context of this service?

mweel1 avatar Jul 06 '22 16:07 mweel1

I agree the storage layer should be an abstraction (I think). It seems that requiring a decentralized approach for the data store becomes pointless when it sits behind this service, or am I missing some other extensibility of the data store outside of the context of this service?

I don't fully understand what you mean by 'this service'. Each instance of your DWeb Node personal datastore is a masterless clone, and the fact that you may choose to have an instance remote from your devices, in addition to those on your devices, doesn't make the system centralized or a service in any typical sense.

csuwildcat avatar Jul 06 '22 17:07 csuwildcat

I personally would not like to see the "back-end" implementation of this standard in the specification. People should be able to choose whether the implementation is replicated, centralized, or whatever. The specification should just have the contract the service accepts and responds to, IMHO.

If web hooks (which I think are required) are added, there are going to be a host of design decisions to make around local storage, queuing, which node is handling the messaging (it's masterless?), etc. Do you really want to get into all of that, or just create the contract to accept a subscription, and what to expect on a publication?

I guess my point is, I don't want lock-in as it relates to the backend. Is that really required for this to be a success? Is this a standard or a product? If it's both, I would suggest separating them.

mweel1 avatar Jul 06 '22 18:07 mweel1

The data store should absolutely be abstracted out of the search and control components so that they may be separately assorted. Otherwise we have an unnecessary lock-in situation as well as a violation of data minimization fair information practice principles.

In cases where a data store (or other processor) does not recognize the authority of the resource owner as controller, the resource owner is forced to copy the data to a store that does recognize their choice of controller. This is NOT privacy by default, since it forces the resource owner to choose between sharing their authorization policies with the data store operator and making a copy of the data somewhere that does respect the resource owner's choice of controller. Either way, there is a violation of data minimization.

The essence of the Patient Privacy Rights position was discussed in a panel at Identiverse. https://identiverse.com/idv2022/session/841489/ Here's the key slide describing separation and separate assortment between the authentication, authorization (policy), and persistence layers of a decentralized architecture.

https://docs.google.com/document/d/1gH1HVvOpJqLkg8BBbDCWh9SclDnJnztvd7x_YVVhtsw/edit

  • Adrian

agropper avatar Jul 06 '22 18:07 agropper

I am just getting my feet wet here, so let me start from the beginning.

A user would set up one of these decentralized web nodes (DWNs), which provides functionality for mutating one or many data sources by way of exposed services. That is the job of a decentralized web node (DWN), correct?

Are you saying the security context between the DWN and the data would not be shared? I was not going that far with it.

mweel1 avatar Jul 06 '22 19:07 mweel1

Depends on what you mean by "security context".

GDPR, Zero-Trust and most other current security practices treat the distinction between data controller and data processor as fundamental. Adding "decentralized" to an otherwise invented name like Fred or Web Node does not alter the reality that making an actor play both the controller and processor roles is not recommended practice from either a privacy (GDPR) or security (ZTA) perspective.

  • Adrian

agropper avatar Jul 06 '22 20:07 agropper

I'll say again that the spec only mandates the IPLD conventions around the way data is chunked and CID'd, not the way you actually have to store the bytes. If you don't at least normatively define the way data is assembled (e.g. canonicalized and identified with CIDs), you wouldn't be able to have different instances be interoperable with each other.

csuwildcat avatar Jul 06 '22 21:07 csuwildcat
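
To make "identified with CIDs" concrete, here is a minimal sketch of deriving a CIDv1 from a byte string using only the Python standard library. It assumes the simplest possible case: a single block, the `raw` codec (0x55), and SHA-256 (multihash code 0x12). It does not perform the DAG-CBOR or DAG-PB/UnixFS chunking the spec relies on, and the function name `cid_v1_raw` is made up for illustration. The point is only that the identifier is computed from the bytes, independent of where those bytes are stored.

```python
import base64
import hashlib

# Codes taken from the public multicodec table.
RAW_CODEC = 0x55   # "raw" binary content
SHA2_256 = 0x12    # multihash code for SHA-256


def cid_v1_raw(data: bytes) -> str:
    """Return the base32 CIDv1 string for `data` (raw codec, SHA-256).

    Hypothetical helper for illustration; not part of any DWN library.
    """
    digest = hashlib.sha256(data).digest()
    # multihash = <hash function code><digest length><digest bytes>
    multihash = bytes([SHA2_256, len(digest)]) + digest
    # CIDv1 = <version><content codec><multihash>.
    # Every value used here is below 0x80, so each varint is a single byte.
    cid_bytes = bytes([0x01, RAW_CODEC]) + multihash
    # Multibase prefix "b" means lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


if __name__ == "__main__":
    print(cid_v1_raw(b"hello world"))
```

Two nodes that run the same derivation over the same bytes arrive at the same identifier, whether one keeps the bytes in a SQL table and the other on a local disk.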

Data models such as chunking and CIDs, as well as DIDs and VCs, are essential for interop but are separate from protocols and the roles they enable. The roles of various actors are enabled by the protocols, and that's what impacts privacy and decentralization.

agropper avatar Jul 06 '22 21:07 agropper

GDPR, Zero-Trust and most other current security practices treat the distinction between data controller and data processor as fundamental. Adding "decentralized" to an otherwise invented name like Fred or Web Node does not alter the reality that making an actor play both the controller and processor roles is not recommended practice from either a privacy (GDPR) or security (ZTA) perspective.

Why?

mweel1 avatar Jul 07 '22 04:07 mweel1

@mweel1 the storage layer is an abstraction. The JS reference implementation provides a MessageStore interface that defines all of the method signatures needed to perform storage, retrieval, and deletion of messages. The motivation behind providing this interface is to enable developers to use whichever underlying storage technology best fits their needs/use case, e.g. MySQL, MongoDB, LevelDB, CockroachDB, etc.

mistermoe avatar Jul 07 '22 05:07 mistermoe
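
The storage abstraction described in the comment above can be sketched roughly as follows. This is not the reference implementation's actual MessageStore interface (which is JavaScript): the method names `put`, `get`, and `delete`, their signatures, and the `InMemoryMessageStore` and `archive` names are all assumptions chosen for illustration. The sketch only shows the shape of the idea, with callers depending on a contract while the backend stays swappable.

```python
from typing import Dict, Optional, Protocol


class MessageStore(Protocol):
    """Hypothetical storage contract; names and signatures are illustrative only."""

    def put(self, cid: str, message: bytes) -> None: ...

    def get(self, cid: str) -> Optional[bytes]: ...

    def delete(self, cid: str) -> None: ...


class InMemoryMessageStore:
    """Dict-backed store; a SQL- or LevelDB-backed class could expose the same methods."""

    def __init__(self) -> None:
        self._messages: Dict[str, bytes] = {}

    def put(self, cid: str, message: bytes) -> None:
        self._messages[cid] = message

    def get(self, cid: str) -> Optional[bytes]:
        return self._messages.get(cid)

    def delete(self, cid: str) -> None:
        # Deleting an unknown CID is a no-op in this sketch.
        self._messages.pop(cid, None)


def archive(store: MessageStore, cid: str, message: bytes) -> bool:
    """Caller code depends only on the contract, not on the backend."""
    store.put(cid, message)
    return store.get(cid) == message


if __name__ == "__main__":
    print(archive(InMemoryMessageStore(), "example-cid", b"payload"))
```

Because `archive` is typed against the protocol, swapping the in-memory class for a database-backed one would not change the calling code.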

@mistermoe great! It sounds like there is a gap between the spec and the code then. I believe this is the right approach.

mweel1 avatar Jul 07 '22 06:07 mweel1

Resolved in latest commit via restating that the base dependency is only IPLD multiformats/codecs: https://identity.foundation/decentralized-web-node/spec/#protocol-stack

csuwildcat avatar Jul 07 '22 13:07 csuwildcat

Tagging pending close unless there are other concerns raised about the language change.

csuwildcat avatar Jul 07 '22 13:07 csuwildcat

"IPLD multiformats" is not very precise. If you are using the IPLD data model and blockstore approach, then multiformats is implicit. If you are using only multiformats for hashing data and creating CIDs (e.g. using the raw codec or similar), then you are not really using IPLD.

oed avatar Jul 07 '22 15:07 oed
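
The distinction drawn above (raw codec versus an IPLD codec) is visible in the CID itself, because a CIDv1 carries its content codec right after the version. The sketch below reads that codec back out of a base32 CIDv1 string using only the standard library. The function names are made up for illustration, only the multibase `b` (base32) encoding is handled, and the codec table is a small excerpt of the public multicodec table rather than a complete list.

```python
import base64
from typing import Tuple

# Small excerpt of the public multicodec table.
CODEC_NAMES = {
    0x55: "raw",
    0x70: "dag-pb",
    0x71: "dag-cbor",
    0x85: "dag-jose",
    0x0129: "dag-json",
}


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one unsigned varint starting at `pos`; return (value, next position)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def cid_codec(cid: str) -> str:
    """Return the codec name embedded in a base32 CIDv1 (hypothetical helper)."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase 'b') CIDs are handled in this sketch")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase strips
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    if version != 1:
        raise ValueError("not a CIDv1")
    codec, _ = read_varint(raw, pos)
    return CODEC_NAMES.get(codec, hex(codec))
```

A CID that reports `raw` points at opaque bytes, while `dag-cbor` or `dag-pb` indicates a block that an IPLD-aware reader can decode and traverse.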

What the spec uses:

  • DAG-CBOR, DAG-PB / UnixFS
  • v1 CIDs

If you know a couple of better overarching words to capture that for the diagram, then I think folks would be fine with changing it.

csuwildcat avatar Jul 07 '22 15:07 csuwildcat

@csuwildcat I would just say IPLD in this case! Also, your attestation format is compatible with DAG-JOSE (which is now fully supported in go-ipfs), so using that will make your life easier when traversing DAGs.

oed avatar Jul 07 '22 16:07 oed