beaker-core
Peer-socket API (WIP)
TODOS
- Internal
- track how many active tabs there are and leave all lobbies after a site is inactive
- reliably get the swarm topic of every connection
- rewrite to use only one swarm and multiplex
- connection deduplication
- cleanup on close
- API changes
- getSocket() method
- Change to {readable, writeable} impl
- Questions
- what error conditions should bubble to the API?
- do we need a socket channel API?
- should we use a loopback (self) socket?
Target API
PeerSocketSwarm
// constructors
var swarm = await PeerSocket.joinSwarm(opts)
// opts:
// - `id` string. Specify the ID of an open swarm to join.
// properties
swarm.type // 'origin', 'open'
swarm.id // string, only set if type == 'open'
swarm.closed // boolean
// methods
var socket = swarm.getLoopback()
var socket = swarm.getSocket(id)
var sockets = swarm.getSockets()
await swarm.close()
// events
swarm.addEventListener('connection', evt => {}) // evt.socket is a PeerSocket
swarm.addEventListener('close', evt => {})
PeerSocketInfo
// properties
peer.id
peer.host
peer.port
PeerSocket
// properties
socket.loopback // boolean
socket.info // PeerSocketInfo
socket.swarm // PeerSocketSwarm
socket.readable // ReadableStream
socket.writeable // WritableStream
// methods
socket.getSessionData()
socket.setSessionData(data) // only valid if loopback == true
await socket.write(msg)
await socket.close()
// events
socket.addEventListener('message', evt => {}) // contains evt.message
socket.addEventListener('session-data', evt => {}) // contains evt.sessionData
socket.addEventListener('close', evt => {})
// static methods
PeerSocket.setDebugIdentity(n)
Example usage
var swarm
setup()

async function setup () {
  swarm = await PeerSocket.joinSwarm({id: 'pauls-chat-room'})
  setInterval(() => broadcast('Hello!'), 5e3)
  swarm.addEventListener('connection', listen)
}

async function listen ({socket}) {
  for await (let msg of socket.readable) {
    if (msg.type === 'text') {
      console.log(socket.info.id, 'says', msg.text)
    }
  }
}

function broadcast (text) {
  for (let s of swarm.getSockets()) {
    s.write({type: 'text', text})
  }
}
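The broadcast() helper above starts each write without waiting for it. Here is a minimal sketch of a variant that awaits every write and reports which sockets failed. It assumes write() returns a promise that rejects on failure; the API listing shows `await socket.write(msg)` but does not say how errors surface. `broadcastAll` is a hypothetical helper name, not part of PeerSocket.

```javascript
// Hypothetical helper, not part of the PeerSocket API.
// Assumes socket.write(msg) returns a promise that rejects on failure.
async function broadcastAll (sockets, msg) {
  const list = [...sockets]
  // allSettled waits for every write, even if some of them reject.
  const results = await Promise.allSettled(list.map(s => s.write(msg)))
  const failed = []
  results.forEach((res, i) => {
    if (res.status === 'rejected') failed.push({socket: list[i], error: res.reason})
  })
  return failed
}

// Possible usage inside the example above:
//   const failed = await broadcastAll(swarm.getSockets(), {type: 'text', text: 'Hello!'})
//   for (const {socket, error} of failed) console.warn(socket.info.id, 'write failed:', error)
```

Whether a failed write should also close the socket is one of the open questions in the TODO list, so the sketch only reports failures.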
Notes about Readable streams
The socket.readable attribute is a ReadableStream. With async iteration, that will enable the following usage:
for await (let msg of socket.readable) {
// will get hit every time a new message arrives
}
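Because the socket has the {readable, writeable} attributes, it fits the shape of a TransformStream and can therefore be used in pipeThrough(). (The Streams standard spells the second attribute writable, so pipeThrough() will only accept the socket under that spelling.)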
var myRPC = createRPCInterface(...)
myRPC.readable.pipeThrough(socket).pipeTo(myRPC.writeable)
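Both patterns can be tried without Beaker by letting a TransformStream stand in for the socket. The sketch below assumes a runtime where ReadableStream is async iterable (recent Node.js versions; browser support varies). `createUppercaseSocket`, `streamOf` and `collect` are hypothetical helpers for illustration only.

```javascript
// Stand-in for a PeerSocket: a TransformStream is the simplest object
// with the {readable, writable} shape that pipeThrough() expects.
function createUppercaseSocket () {
  return new TransformStream({
    transform (msg, controller) {
      controller.enqueue({...msg, text: msg.text.toUpperCase()})
    }
  })
}

// Wrap a fixed list of messages in a ReadableStream.
function streamOf (messages) {
  return new ReadableStream({
    start (controller) {
      for (const msg of messages) controller.enqueue(msg)
      controller.close()
    }
  })
}

// Drain a stream with async iteration, like the `for await` loop above.
async function collect (readable) {
  const out = []
  for await (const msg of readable) out.push(msg)
  return out
}

// Possible usage:
//   const input = streamOf([{type: 'text', text: 'hello'}])
//   const out = await collect(input.pipeThrough(createUppercaseSocket()))
```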
How multiple tabs & apps work
If an app has multiple tabs open, it will share access to the swarm and its sockets. The app needs to be aware of that and avoid duplicating behaviors. (It should coordinate with its other tabs.)
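One way tabs could coordinate is to elect a single leader tab that does the shared work, such as periodic broadcasts. This is a minimal sketch, assuming the tabs can message each other over something shaped like a BroadcastChannel (postMessage plus a 'message' event whose evt.data is the payload). `TabCoordinator` and its message types are hypothetical and not part of PeerSocket.

```javascript
// Hypothetical helper, not part of the PeerSocket API.
// `channel` is anything shaped like a BroadcastChannel.
class TabCoordinator {
  constructor (tabId, channel) {
    this.tabId = tabId
    this.channel = channel
    this.tabs = new Set([tabId]) // every tab we know about, including this one
    channel.addEventListener('message', evt => this.onMessage(evt.data))
    channel.postMessage({type: 'hello', tabId})
  }

  onMessage (msg) {
    if (msg.type === 'hello') {
      const isNew = !this.tabs.has(msg.tabId)
      this.tabs.add(msg.tabId)
      // Answer newcomers so they learn about this tab too.
      if (isNew) this.channel.postMessage({type: 'hello', tabId: this.tabId})
    } else if (msg.type === 'bye') {
      this.tabs.delete(msg.tabId)
    }
  }

  // The tab with the lowest ID (string sort order) is the leader.
  get isLeader () {
    return [...this.tabs].sort()[0] === this.tabId
  }

  // Call when the tab closes, e.g. from a 'pagehide' listener.
  leave () {
    this.channel.postMessage({type: 'bye', tabId: this.tabId})
  }
}

// Possible usage in a page:
//   const tab = new TabCoordinator(`${Date.now()}-${Math.random()}`, new BroadcastChannel('my-app-tabs'))
//   setInterval(() => { if (tab.isLeader) broadcast('Hello!') }, 5e3)
```

A tab that crashes never sends 'bye', so a real implementation would also need a heartbeat or timeout.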
If multiple apps are in the same swarm (where an app is an origin, eg foo.com vs bar.com) then each app will have its own presence in the swarm and create its own sockets. That means a single device can have multiple presences in a swarm, one for each origin they have active in the swarm. (This decision was made because it'd be insane to force apps to coordinate with each other around shared socket ownership.)
Using setDebugIdentity for testing
When testing an app, it's helpful to run multiple identities in separate tabs as a way to simulate multiple peers. This can be done with PeerSocket.setDebugIdentity(), which takes in a number:
if (DEBUG) {
  PeerSocket.setDebugIdentity(1)
}
var swarm = await PeerSocket.joinSwarm()
// ...
This will cause the tab to join the swarm under a different identity from the other tabs, with its own set of sockets and events. The number is the identity, and any number of identities can be used. Tabs that use the same debug identity number share that identity.
setDebugIdentity must be called before any other PeerSocket method.
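To give each test tab its own identity without editing code, the number could be read from the page URL. This is a minimal sketch; the `debugIdentity` query parameter and the `debugIdentityFromUrl` helper are assumptions for illustration, not something PeerSocket provides.

```javascript
// Hypothetical helper: the `debugIdentity` query parameter is an assumption,
// not something PeerSocket reads by itself.
function debugIdentityFromUrl (href) {
  const raw = new URL(href).searchParams.get('debugIdentity')
  if (raw === null || raw.trim() === '') return null
  const n = Number(raw)
  // Only accept whole numbers; anything else means "no debug identity".
  return Number.isInteger(n) ? n : null
}

// In the page, before any other PeerSocket call:
//   const n = debugIdentityFromUrl(location.href)
//   if (n !== null) PeerSocket.setDebugIdentity(n)
```

Opening the app with `?debugIdentity=1` in one tab and `?debugIdentity=2` in another would then simulate two peers.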
Relationship with Hyperswarm
PeerSocket uses Hyperswarm. It automatically prefixes the "topic" identifiers, so the constructions are:
blake2b(`peersocket-open-${id}`) // for open lobbies
blake2b(`peersocket-origin-${siteOrigin}`) // for site lobbies
Open vs Origin swarm
An open swarm can be joined by any origin that knows the swarm name. It is automatically used when id is set.
An origin swarm can only be joined by pages that share an origin. This constraint is enforced only by the browser, so a program running outside the browser could still join the origin swarm. The origin swarm is used when no id is given for the swarm.
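The selection rule and the topic prefixes can be sketched together as below. `describeSwarm` is a hypothetical helper, and `hash` is passed in as a stand-in for whatever blake2b implementation is used, since its output length and encoding are not specified here. Treating an empty id as "not set" is also an assumption.

```javascript
// Hypothetical helper mirroring the rules above; not part of the PeerSocket API.
// `hash` stands in for blake2b and is supplied by the caller.
function describeSwarm ({id, siteOrigin}, hash) {
  // An id selects the open swarm; otherwise the page's origin swarm is used.
  const type = id ? 'open' : 'origin'
  const topicName = type === 'open'
    ? `peersocket-open-${id}`
    : `peersocket-origin-${siteOrigin}`
  return {
    type,
    id: type === 'open' ? id : undefined, // matches swarm.id, only set for open swarms
    topic: hash(topicName)
  }
}

// Possible usage, with an identity function in place of a real hash:
//   describeSwarm({id: 'pauls-chat-room', siteOrigin: 'foo.com'}, s => s)
```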
Future plans
In the future, this API will be expanded to support:
- Connection management (only open connections with candidates after app 👌s it, and a way to close connections)
- Authenticated connections
- Lobbies that are "owned" by specific users, enabling a kind of client/server relationship
What's the main difference here in how this works vs dat peers?
datPeers basically smuggles the messages over existing dat replication streams. In practice it's similar to PeerSocket.joinSwarm() with no id provided.
The biggest practical limitation of datPeers is that it only supports the origin swarm. PeerSocket.joinSwarm({id:}) makes it possible to join swarms across origins.
Over time it'll diverge more. Using separate sockets lets us tailor the PeerSocket connections to the app's needs, e.g. with authentication.
That's awesome. Especially if this is opening up the potential for other browsers to adopt a similar API in the future.
One could potentially run IPFS over this. 😂
This could even be used to experiment with new hyperdb features without waiting for Beaker to get first class integration.
I think this will need to get additional error and open events as well as a close() method so that it can be more in tune with the APIs in RTCDataChannel and WebSocket. Another thing that's missing is the protocol field, but I'm not sure how realistic that would be without adding more overhead to the wire protocol.
Also, I think that closed should be replaced by a readyState variable to make it more in tune.
Also since we're dreaming here, it'd be amazing if it was async iterable. 😁
I agree about the readyState though ironically, despite having the same values, both APIs have different encodings. WebSocket uses numbers and RTCDataChannel uses strings.
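For reference, WebSocket's readyState is 0 to 3 (CONNECTING, OPEN, CLOSING, CLOSED), while RTCDataChannel's is one of 'connecting', 'open', 'closing', 'closed'. Here is a small sketch of normalizing both encodings to the string form; `normalizeReadyState` is a hypothetical helper, and which encoding PeerSocket would pick is still undecided.

```javascript
// State names shared by both APIs; the array index is the WebSocket numeric code.
const READY_STATES = ['connecting', 'open', 'closing', 'closed']

// Hypothetical helper: accepts either encoding and returns the string form.
function normalizeReadyState (value) {
  if (typeof value === 'number' && READY_STATES[value] !== undefined) {
    return READY_STATES[value]
  }
  if (READY_STATES.includes(value)) return value
  throw new TypeError(`Unknown readyState: ${value}`)
}

console.log(normalizeReadyState(1)) // 'open' (WebSocket style)
console.log(normalizeReadyState('closing')) // 'closing' (RTCDataChannel style)
```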
For the first iteration, there's not going to be any connection management (no close(), no chance to decide whether to open new connections).
This looks amazing. 😍
@RangerMauve Thanks! Still making lots of updates as we build and learn the requirements but I'm feeling good about it
At a high level there are a couple possibilities I imagine for securing p2p channels.
Option 1: Provide crypto functions that encapsulate identity (Crypto Identity API)
Instead of integrating security into PeerSockets, provide a more granular API for using the ed25519 identity keys.
Eg.
const archive = await DatArchive.selectArchive({ authorizePrivateKey: true })
const encryptedText = await archive.publicKey.encrypt('super secret text')
const decryptedText = await archive.privateKey.decrypt(encryptedText)
const signature = await archive.privateKey.sign('super secret text')
Pros:
- Web devs have total flexibility in how they use the crypto functions to secure their data and can experiment with new security ideas.
- Higher level connection security libraries can proliferate and evolve more quickly when they are not part of the Beaker API.
- There is prior art: my GraphQLThings lib provides triple Diffie-Hellman over Beaker's experimental datPeers API. I wouldn't expect it to be too much work to port this to a new PeerSockets + Crypto Identity API.
Cons:
- Less beginner friendly. Especially at first when there aren't many libraries and those libraries aren't mature.
- The crypto functions would need to be safeguarded against timing attacks if they are not constant time
- Gives web devs more opportunities to make mistakes in their PeerSocket security implementations
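The signing half of Option 1 maps onto primitives that already exist. This is a minimal sketch using Node.js's built-in crypto module rather than the proposed `archive.privateKey` API, which is only a suggestion at this point. The key pair is freshly generated and unrelated to any Dat archive. Encryption is left out because Ed25519 is a signature scheme, so `encrypt`/`decrypt` would need a separate construction.

```javascript
const crypto = require('crypto')

// Throwaway Ed25519 key pair, standing in for an archive's identity keys.
const {publicKey, privateKey} = crypto.generateKeyPairSync('ed25519')

function sign (text) {
  // Ed25519 does its own hashing, so the digest algorithm argument is null.
  return crypto.sign(null, Buffer.from(text), privateKey)
}

function verify (text, signature) {
  return crypto.verify(null, Buffer.from(text), publicKey, signature)
}

const signature = sign('super secret text')
console.log(verify('super secret text', signature)) // true
console.log(verify('tampered text', signature)) // false
```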
Option 2: Integrate security into PeerSockets directly
Eg.
const archive = await DatArchive.selectArchive()
const swarm = await PeerSocket.joinSwarm({
  myIdentity: archive,
  peerIdentityPublicKey: await archive.readFile('peerIdentity.pub'),
  algorithm: 'triple-diffie-hellman-aes-256'
})
Pros:
- Initially beginner friendly because everything the web dev needs to secure their connection is in the Beaker docs.
Cons:
- Web APIs should not break backwards compatibility, so the first version of the API would need to be well thought out for many use cases, including more complicated options.
- Swarms of more than 2 users in communication may require complicated security, which I expect will take time and expertise to develop. Developing a general purpose PeerSocket security mechanism might slow the release of Beaker@next and divert dev energy from other priorities.
Note: Option 2 is pretty much a strawman here. When I started writing I thought there would be more reasons to support it, but I haven't come up with them. If you can think of more reasons, let me know. There could also be an Option 3: a combined approach where we release both a secure PeerSocket API and a Crypto Identity API. I expect the secure PeerSocket API in that case would become a legacy maintenance headache with little benefit once community Beaker crypto libraries are popularized.
With regard to PeerSocket security, I think we were going to get transport-level security within Hyperswarm in the form of the Noise protocol.
I'm still catching up on this thread but is this feature perhaps a way to use hypercore feeds in Beaker? I recently became interested in that approach for storing sensor data efficiently (@dwblair). For the chat use case, perhaps as a way to store message history.
It could, but there's also the Feed API that would be nice to have. I don't have the link offhand.
We'll be adding the Hypercore API soon, so there shouldn't be any need to get PeerSockets involved.
I think PeerSockets is still going to be important for stuff like the multiplayer demo I did at the dat event at DTN. :P
We'll be adding the Hypercore API soon, so there shouldn't be any need to get PeerSockets involved.
I think @pfrazee you were referring to not needing PeerSockets to store a message history. Is that right? The difference is messages are saved when going over Hypercore as opposed to messages being ephemeral when going over PeerSockets?
Yeah, I just meant we won't need PeerSockets as a means to get access to Hypercores