Shareable private data
Our plan at this time is to implement a private data feature in Noosphere that enables users to author content that is private and/or only readable by an explicitly allowed audience.
This is the tracking issue for our progress towards shareable private data.
User stories
As a Subconscious user, I want to write private notes and share them with specific collaborators, so I can collaborate with those I trust.
As a Subconscious user, I want to write notes that are shared only with mutuals, so I can share with my cozyweb friends, and discourage my notes going viral outside of that local context.
Design notes
We intend to lean on end-to-end encryption strategies to enable this feature. In the default case, we assume that all data is publicly readable, but that some of that data is only made available in an encrypted state. Such encrypted data can be decrypted by the author, but may also be decrypted by an audience of others who the author explicitly addresses it to.
Fission and Peergos have done groundbreaking work building practical systems that achieve this quality, based upon the foundational concept of a cryptree. Fission's specialization of this is referred to as a cryptDAG and is implemented in their open source Web Native File System.
Serendipitously, Web Native File System is currently being re-written in Rust. This suggests that the shortest path to implementing shareable private data will be to incorporate WNFS into our design as it becomes practical to do so. WNFS is coherent with Noosphere as it is based on the same foundational concepts of CID-based content addressing and PKI.
Noosphere public content is non-hierarchical in nature, which means some of the qualities of WNFS are not relevant for us in that case. It is possible (but not certain) that hierarchy has a place in Noosphere when it comes to shareable private data. To the extent that this is true, WNFS will seem more appealing. If it turns out that we can organize private data in useful ways without arbitrarily deep hierarchy, though, we may be able to implement a simplified solution to the problem.
I hope I'm not sidetracking this issue too much as this may be too immature an idea to integrate into a project like this, but an idea I've liked for a while is to implement some sort of HiStar-inspired system to help prevent private data leakage from peer-to-peer node implementations.
It's not an encryption scheme, nor a defense against exotic side-channel attacks like timing attacks. Rather, it's more analogous to a type checker for who is authorized to access data, meant to help avoid accidental leakage of private information caused by programmer error, such as complexity-induced mis-design or mis-implementation of APIs.
It would look something like:
- the node is implemented in a way isomorphic to actors sending messages to each other
- each actor-like is marked with a "read-authorized" set, which may either be a finite set of DIDs, or some "all" value representing the set of all conceivable peers
- all information an actor-like can read must be ok to reveal to all peers within its read-authorized set
- for a message-like to be sent from one actor-like to another actor-like, the receiver's read-authorized set must be a subset of the sender's read-authorized set
- the initial actor-like created has a read-authorized set of "all"
- when an actor-like creates another actor-like, the new one inherits its creator's read-authorized set
- an actor-like replacing its read-authorized set with a subset of itself is a security-safe operation
- certain actor-likes are "read authorizers," which have the ability to receive message-likes from an actor-like with some read-authorized set, assert that it is valid for some additional peers to read that data, and then relay the messages to an actor-like with those additional peers in its read-authorized set. This is security-unsafe--the read-authorization equivalent of unsafe blocks. This allows read-access logic to be kept simple, isolated, and explicit.
- authenticated data channels between two peers are actor-likes with their read-authorized set being those two peers
- publicly accessible resources on the internet are actor-likes with their read-authorized set being "all"
- local or otherwise private files on a system owned by a particular DID are actor-likes with their read-authorized set being that DID--alternatively, such files and databases may explicitly encode metadata about their read-authorized set in the data storage
- readers of data streams with unknown publicity properties are actor-likes with an empty read-authorized set
- writers of data streams with unknown publicity properties are actor-likes with a read-authorized set of "all"
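To make the rules above concrete, here is a minimal sketch in Python. It is only an illustration of the subset check, not a proposal for how Noosphere should implement it. Every name in it (`ReadSet`, `Actor`, `ReadAuthorizer`, `LeakError`, and the `did:example:...` identifiers) is made up for this example, and `None` is an assumed encoding of the "all" value.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ReadSet:
    """A read-authorized set: a finite set of DIDs, or every conceivable peer."""
    dids: Optional[FrozenSet[str]]  # None is this sketch's encoding of "all"

    @staticmethod
    def everyone() -> "ReadSet":
        return ReadSet(None)

    @staticmethod
    def of(*dids: str) -> "ReadSet":
        return ReadSet(frozenset(dids))

    def is_subset_of(self, other: "ReadSet") -> bool:
        if other.dids is None:   # every set is a subset of "all"
            return True
        if self.dids is None:    # "all" is not a subset of any finite set
            return False
        return self.dids <= other.dids

    def union(self, other: "ReadSet") -> "ReadSet":
        if self.dids is None or other.dids is None:
            return ReadSet.everyone()
        return ReadSet(self.dids | other.dids)


class LeakError(Exception):
    """Raised when data would flow to a wider audience than it is allowed to."""


class Actor:
    def __init__(self, name: str, readers: ReadSet = ReadSet.everyone()):
        self.name = name
        self.readers = readers       # the initial actor defaults to "all"
        self.inbox: List[str] = []

    def spawn(self, name: str) -> "Actor":
        # A new actor inherits its creator's read-authorized set.
        return Actor(name, self.readers)

    def restrict(self, readers: ReadSet) -> None:
        # Narrowing to a subset is always safe; widening is refused.
        if not readers.is_subset_of(self.readers):
            raise LeakError(f"{self.name} cannot widen its own read-authorized set")
        self.readers = readers

    def send(self, receiver: "Actor", message: str) -> None:
        # The receiver's set must be a subset of the sender's set.
        if not receiver.readers.is_subset_of(self.readers):
            raise LeakError(f"{self.name} -> {receiver.name} would leak")
        receiver.inbox.append(message)


class ReadAuthorizer(Actor):
    def relay(self, receiver: Actor, message: str, additional: ReadSet) -> None:
        # UNSAFE by design: explicitly asserts that `additional` peers may
        # also read `message`, then bypasses the normal subset check.
        allowed = self.readers.union(additional)
        if not receiver.readers.is_subset_of(allowed):
            raise LeakError(f"{receiver.name} has readers beyond those authorized")
        receiver.inbox.append(message)


root = Actor("root")                                   # read-authorized: "all"
notes = root.spawn("alice-notes")
notes.restrict(ReadSet.of("did:example:alice"))
channel = root.spawn("alice-bob-channel")
channel.restrict(ReadSet.of("did:example:alice", "did:example:bob"))

root.send(notes, "public post")                        # public data may flow inward
share = ReadAuthorizer("share-with-bob", ReadSet.of("did:example:alice"))
notes.send(share, "private note")                      # same audience, allowed
share.relay(channel, "private note", ReadSet.of("did:example:bob"))
```

In this sketch, `notes.send(channel, ...)` and `notes.send(root, ...)` would both raise `LeakError`, so the only way Alice's note reaches the channel with Bob is through the explicit `relay` call. That keeps the widening decision in one small, auditable place, which is the point of the "read authorizer" idea.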
I haven't read the details of its encryption protocol, but CryptPad may have dealt successfully with similar challenges.