replication channel

Open juliangruber opened this issue 11 years ago • 22 comments

BitTorrent Sync uses a central DHT/server and one unique ID per user to connect machines.

It would be cool if we could do without any central piece, which on the other hand means:

  • the host needs to be able to open TCP connections
  • when a pc is behind a firewall we might have to do NAT punching etc.
  • when a pc isn't a server it doesn't have a fixed IP address, so at least one of your machines has to be a server, or you need to host one of those dyndns things

We could offer a broker service that you can use when you don't have a server maybe.

juliangruber avatar Sep 17 '13 13:09 juliangruber

Or people could host their own brokers which they can offer to their friends. And you could use multiple brokers per user, for redundancy!

juliangruber avatar Sep 17 '13 13:09 juliangruber

telehash was trying to do this, dunno what the status is

max-mapper avatar Sep 17 '13 15:09 max-mapper

This is really a whole other module/ecosystem. Actually, there are a million things we could use connections between arbitrary node instances for...

I see it as having 3 phases:

  • localnetwork/datacenter
  • servers (different datacenters)
  • and nat traversal.

You could detect the machines in the network by either nmapping (attempting to open connections to everything in the 192.168.0.* range) or using UDP multicast. This would work for local networks and datacenters, and make it configuration free.
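A rough sketch of the UDP multicast variant, using Node's built-in `dgram` module. The group address, port, message shape and the names `encodeAnnounce`, `decodeAnnounce` and `startDiscovery` are all made up for illustration; nothing in this thread fixes them.

```javascript
const dgram = require('dgram');

// Assumed values for illustration only.
const GROUP = '239.255.42.99'; // administratively scoped multicast address
const PORT = 41234;

// Serialize the announcement a peer would send about itself.
function encodeAnnounce(id, tcpPort) {
  return Buffer.from(JSON.stringify({ type: 'announce', id, tcpPort }));
}

// Parse a datagram; returns null for anything that is not a valid announcement.
function decodeAnnounce(buf) {
  try {
    const msg = JSON.parse(buf.toString('utf8'));
    if (msg && msg.type === 'announce' && typeof msg.id === 'string' &&
        Number.isInteger(msg.tcpPort)) {
      return { id: msg.id, tcpPort: msg.tcpPort };
    }
  } catch (err) {
    // Malformed JSON: ignore the datagram.
  }
  return null;
}

// Join the group, announce ourselves every 5 seconds, report peers we hear.
function startDiscovery(id, tcpPort, onPeer) {
  const socket = dgram.createSocket({ type: 'udp4', reuseAddr: true });
  socket.on('message', (buf, rinfo) => {
    const peer = decodeAnnounce(buf);
    if (peer && peer.id !== id) onPeer({ ...peer, address: rinfo.address });
  });
  socket.bind(PORT, () => {
    socket.addMembership(GROUP);
    const announce = () => socket.send(encodeAnnounce(id, tcpPort), PORT, GROUP);
    announce();
    const timer = setInterval(announce, 5000);
    socket.on('close', () => clearInterval(timer));
  });
  return socket;
}
```

Calling `startDiscovery('laptop', 9000, console.log)` on two machines in the same LAN should make each print the other, provided the network passes multicast traffic, which not every router or datacenter does.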

For servers that have public IP addresses, you could have a start list of one or more servers that you expect to be turned on, and then use a gossip protocol to track which machines are connected to the network.
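To make the start list plus gossip idea concrete, here is a minimal in-memory sketch of the membership bookkeeping only (no networking). The function names and the "address -> last seen timestamp" view are assumptions for illustration, not an existing module.

```javascript
// Each node keeps a view: Map of peer address -> last time it was seen (ms).
function createView(seeds, now) {
  const view = new Map();
  for (const addr of seeds) view.set(addr, now);
  return view;
}

// Merge a view received from another node, keeping the freshest timestamp.
function mergeView(local, remote) {
  for (const [addr, seen] of remote) {
    if (!local.has(addr) || local.get(addr) < seen) local.set(addr, seen);
  }
  return local;
}

// Drop peers that nobody has heard from within maxAge milliseconds.
function prune(view, now, maxAge) {
  for (const [addr, seen] of view) {
    if (now - seen > maxAge) view.delete(addr);
  }
  return view;
}

// Pick a random peer (other than ourselves) for the next gossip round.
function pickPeer(view, self, random = Math.random) {
  const others = [...view.keys()].filter((addr) => addr !== self);
  if (others.length === 0) return null;
  return others[Math.floor(random() * others.length)];
}
```

Each round a node would call `pickPeer`, send its view to that peer, and `mergeView` whatever comes back, so knowledge of new machines spreads from the start list outwards.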

I'm not 100% sure how nat traversal works, but it would be awesome to have a node client that is compatible with webrtc, then you could have node<-> browser p2p connections! Maybe @feross can help answer this question.

dominictarr avatar Sep 17 '13 15:09 dominictarr

@dominictarr I see this repo more like jsgit, it's the end user thing but can consist of a core and many other modules. Will document that.

juliangruber avatar Sep 17 '13 15:09 juliangruber

@maxogden I think Telehash is still evolving nicely... http://github.com/quartzjer/thjs

Perhaps @quartzjer can enlighten us.

refset avatar Sep 17 '13 15:09 refset

We can divide this into two steps:

1. Discovering other peers, that belong to your own network

To be realistic, very few end users will have a client that is constantly available under the same IP address or domain, which would be required for a fully self-sustaining p2p network. @juliangruber's broker idea is probably the simplest to implement. But after having a quick read through TeleHash (@maxogden, thanks for the pointer!) I'm really +1 on it. It seems pretty solid and well thought-out, plus there's already a package for that. (It doesn't really differ from the broker anyway.) To have a private broker/tracker we can implement a simple form of authentication and everything runs smoothly.

2. Keeping a connection to those peers

We could be using TeleHash for this as well, but UDP is not reliable enough. So once the client has discovered a peer that belongs to its network, it should open a permanent TCP connection to it (and to every other peer as well). I don't think you're going to have more than 20 clients per network, so this shouldn't be too bad. We can then use this connection to send events back and forth, like notifications about updated/added/removed files. The client would then open another connection to a peer to download the changes. This way the control connection always stays responsive and multiple file down- and uploads can run in parallel. Just like BitTorrent. This would also allow for a nifty load balancer, so we always get the change sets as fast as possible.
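One way the control connection could carry those events is newline-delimited JSON. This is only a sketch: the event shape (`type`, `path`) and the names `encodeEvent` and `createDecoder` are assumptions, not something decided in this thread.

```javascript
const { StringDecoder } = require('string_decoder');

// Encode one control event as a single line of JSON.
function encodeEvent(type, path) {
  return JSON.stringify({ type, path }) + '\n';
}

// Returns a function that takes arbitrary chunks from a TCP socket and
// calls onEvent once for every complete line received so far.
function createDecoder(onEvent) {
  const utf8 = new StringDecoder('utf8'); // handles characters split across chunks
  let buffered = '';
  return function push(chunk) {
    buffered += Buffer.isBuffer(chunk) ? utf8.write(chunk) : chunk;
    let idx;
    while ((idx = buffered.indexOf('\n')) !== -1) {
      const line = buffered.slice(0, idx);
      buffered = buffered.slice(idx + 1);
      if (line.length > 0) onEvent(JSON.parse(line));
    }
  };
}
```

On a `net.Socket` this would be wired up as `socket.on('data', createDecoder(handleEvent))` on the receiving side and `socket.write(encodeEvent('updated', 'notes.txt'))` on the sending side, with the actual file contents going over the separate data connection.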

buschtoens avatar Sep 17 '13 16:09 buschtoens

Relays? Those clients would act as servers, that cache the data packets and forward them, but can't decrypt them. This can speed up file transfers.

buschtoens avatar Sep 17 '13 16:09 buschtoens

Chiming in a bit, I believe that telehash is actually a great fit, but I'm not sure on the timing, it's going to be a moving target for about a month here as we get about a half-dozen telehash implementations interoperating and work out the last kinks in the protocol...

Also, once two hashnames are connected via telehash, the protocol fully supports reliable (encrypted) raw data channels between them. I need some more examples of this pattern (one WIP is https://github.com/quartzjer/worm) and it's probably going to suffer some breaking on the horizon too, but that's definitely a design goal, to support full mesh p2p (there are well known / stable seeds so transient nodes are welcome) with strong data pipes :)

quartzjer avatar Sep 17 '13 23:09 quartzjer

how does telehash do binary data?

dominictarr avatar Sep 18 '13 12:09 dominictarr

The raw packet format has two parts, json and binary, so (reliable) channels can be created between any two hashnames that contain either or both... the binary is just raw bi-directional streams, nothing fancy :)
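For readers who want to picture a "JSON plus binary" packet, here is a generic sketch. The 2-byte big-endian length prefix and the function names are assumptions for illustration and have not been checked against the telehash spec; treat it as one possible framing, not the actual wire format.

```javascript
// Layout assumed here: [2-byte JSON length][JSON bytes][raw binary body].
function encodePacket(json, body = Buffer.alloc(0)) {
  const head = Buffer.from(JSON.stringify(json), 'utf8');
  if (head.length > 0xffff) throw new RangeError('JSON header too large');
  const len = Buffer.alloc(2);
  len.writeUInt16BE(head.length, 0);
  return Buffer.concat([len, head, body]);
}

// Split a packet back into its JSON part and its binary part.
function decodePacket(packet) {
  if (packet.length < 2) throw new RangeError('packet too short');
  const headLen = packet.readUInt16BE(0);
  if (packet.length < 2 + headLen) throw new RangeError('truncated JSON header');
  const json = JSON.parse(packet.subarray(2, 2 + headLen).toString('utf8'));
  const body = packet.subarray(2 + headLen);
  return { json, body };
}
```

A packet may carry only JSON (empty body) or JSON plus bytes, which matches the "either or both" description above.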

quartzjer avatar Sep 18 '13 23:09 quartzjer

perfect!

dominictarr avatar Sep 20 '13 11:09 dominictarr

Chiming in a bit late here.

@dominictarr: "I'm not 100% sure how nat traversal works, but it would be awesome to have a node client that is compatible with webrtc, then you could have node<-> browser p2p connections!"

NAT traversal is straightforward if you use WebRTC because the implementations are required to handle it for you, based on the spec. You just specify a STUN server, and that's it. If you want, you can also specify a TURN server for fallback if two peers cannot establish a direct p2p connection because they're both behind symmetric NATs (rare).
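As a sketch of what "just specify a STUN server" looks like with the standard `RTCPeerConnection` API: the server hostnames and credentials below are placeholders, and `buildIceConfig` and `openDataChannel` are hypothetical helper names.

```javascript
// Build the configuration object that RTCPeerConnection expects.
function buildIceConfig(stunUrl, turn) {
  const iceServers = [{ urls: stunUrl }];
  if (turn) {
    // Optional relay fallback for peers that cannot connect directly.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

// Placeholder servers; substitute ones you actually run or may use.
const config = buildIceConfig('stun:stun.example.org:3478', {
  url: 'turn:turn.example.org:3478',
  username: 'user',
  credential: 'secret',
});

// Only works where a WebRTC implementation is available (e.g. a browser).
function openDataChannel(iceConfig, label) {
  if (typeof RTCPeerConnection === 'undefined') {
    throw new Error('no WebRTC implementation in this environment');
  }
  const pc = new RTCPeerConnection(iceConfig);
  const channel = pc.createDataChannel(label); // reliable and ordered by default
  return { pc, channel };
}
```

The offer/answer exchange between the two peers still has to travel over some other channel; this snippet only covers the NAT traversal configuration.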

I don't know of an npm module that gives you a WebRTC client, though I don't imagine it would be too hard to make one. The WebRTC C++ code used in Chrome is open source. I think you'd just need to bind it to a JS interface, but I have no experience with native modules in node.

feross avatar Sep 23 '13 22:09 feross

aha! that is a great idea!

dominictarr avatar Sep 24 '13 03:09 dominictarr

relevant: http://www.youtube.com/watch?v=Al3SEbeK61s&feature=share&t=7m30s

guybrush avatar Sep 24 '13 03:09 guybrush

Hey All,

There have been a few attempts at writing node bindings to the underlying webrtc stack in the chromium source (previously called libjingle). From what I've seen, node-peerconnection is a solid start for creating this and I've recently been updating it to work with the latest webrtc source code (see: https://github.com/DamonOehlman/node-peerconnection/tree/updated-basecode).

There are a couple of other modules out there too, I'll try digging them up over the next couple of days and posting them here. Even if you decide to start from scratch, they'll be a good starting place (although I'd recommend using @rvagg's nan helpers as well).

Also as mentioned on twitter, I had a chat with @silviapfeiffer at work (NICTA) about this and we can see the value in getting some node --> c++ bindings written. At this stage though our approach will likely be to create the specific functionality we need at the c++ layer, wrap that into a c library and then create node bindings for that library. The primary reason for this being that the surface area of the underlying WebRTC c++ library is massive and also subject to quite extensive change as things get updated in the spec and thus chrome, etc.

Cheers, Damon.

DamonOehlman avatar Sep 24 '13 22:09 DamonOehlman

we'd mainly need the reliable datachannel, maybe could just bind to that? @DamonOehlman would the C++ library be compatible with WebRTC in the browser? If it was, I can see significant interest from people working on webrtc stuff!

dominictarr avatar Sep 24 '13 23:09 dominictarr

Targeting the data channel was what I was thinking too. Yeah, compatibility would be there (if not straight away, then eventually) - see https://code.google.com/p/webrtc/issues/detail?id=2279.

Just reading the issue thread in its entirety now though, I'm not sure that WebRTC will remove the need for a central broker completely. While the WebRTC stack has everything it needs to do NAT traversal and successfully negotiate through firewalls (given that supporting network infrastructure is there - see ICE), you would still need to do the initial signalling between the peers through a broker of some kind.
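To show how little the broker has to do, here is an in-memory stand-in that only relays opaque signalling messages between registered peers. The `Broker` class and its methods are hypothetical; a real broker would sit behind WebSocket or HTTP and would need authentication.

```javascript
const { EventEmitter } = require('events');

class Broker {
  constructor() {
    this.peers = new Map(); // peer id -> inbox
  }

  // A peer registers under an id and gets an inbox to listen on.
  register(id) {
    const inbox = new EventEmitter();
    this.peers.set(id, inbox);
    return inbox;
  }

  // Forward a message (offer, answer, ICE candidate) without inspecting it.
  // Returns false when the target peer is not registered.
  relay(from, to, message) {
    const inbox = this.peers.get(to);
    if (!inbox) return false;
    inbox.emit('signal', { from, message });
    return true;
  }
}
```

Once both sides have exchanged their offer, answer and candidates this way, the peers talk directly and the broker is out of the picture.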

Workload on the broker would be light though so not too costly. Might see if I can spend some time on this either later this week or early next :)

DamonOehlman avatar Sep 24 '13 23:09 DamonOehlman

it's also worth noting that the broker will be a completely standard thing, so it would be pretty easy to have many brokers, and allow anyone to run their own broker

dominictarr avatar Sep 25 '13 07:09 dominictarr

@DamonOehlman Could you post those alternate implementations of WebRTC Data Channel (or PeerConnection) on the server? I'm trying to see if it makes sense to use/improve one of them or just write my own.

feross avatar Oct 22 '13 22:10 feross

@feross This is the one that I think is looking most promising at the moment (cc @modeswitch):

https://github.com/modeswitch/node-webrtc

I'd probably recommend against starting writing your own (if you can resist) as there are so many node WebRTC binding libraries that have been started and then abandoned. It's a pretty big task because the surface area of the underlying webrtc libraries is pretty massive.

An alternative (and active) approach to have a look at is erizo in licode:

https://github.com/ging/licode/tree/master/erizo

These guys are putting together an implementation using the technologies that power the WebRTC stack (libsrtp, etc). I don't think they've looked at data channels yet, and that may not even be on their radar given their strong focus on video/audio.

FWIW, my money is on approach 1 (bindings to webrtc c++ code which was previously known as libjingle). So again, have a look at what @modeswitch is doing and see if you can lend a hand there :)

DamonOehlman avatar Oct 22 '13 22:10 DamonOehlman

@DamonOehlman excellent - thanks for the quick response. i like approach 1 and will try to lend a hand :)

feross avatar Oct 22 '13 22:10 feross

To add to what @DamonOehlman said above: I'm actively working on node-webrtc. Most of the API for data channels is there, audio/video support will come sometime later. Most of the issues I'm working on now are in libjingle rather than the node bindings. Support for building on Windows and OSX could benefit from additional contribution, and the module in npm needs testing and a bit of work I think.

modeswitch avatar Oct 22 '13 23:10 modeswitch