Roadmap

Open kim opened this issue 4 years ago • 2 comments

At Monadic, we've taken a short hiatus on Radicle to think about our internal strategy and reflect on how we're allocating our energy and resources. Here's an update on what came out of that:

Radicle, as conceived, is much more general than code collaboration: it allows one to give (almost) arbitrary semantics to deterministic replicated state machines. While this is interesting in itself, we don't think it is worthwhile for us to pursue a vision of general-purpose distributed computation. Instead, what we’re interested in is to make code collaboration work in a decentralised setting. This does include state machines modelling the lifecycles of collaborative interactions (issues, pull requests, etc.), yet it also includes sharing code and changes to it, which we think we haven’t captured too well with the current architecture.

So what does this mean?

Our biggest issues with our current stack have, unfortunately, been with IPFS. It is conceptually quite close to what we wanted (which is why we chose it in the first place), but it turned out to be a bit unwieldy in practice:

  • It is a very heavy dependency, has a number of rough edges, and works like a monolithic black box from our point of view.

  • The current architecture of Radicle requires a daemon to be reachable on the network in order to receive updates from collaborators. For that, we relied on IPNS and IPFS PubSub, neither of which is particularly performant. In fact, we had to create our own IPFS network to make it work at all.

  • Lastly, replicating git repos peer-to-peer on the storage layer doesn't leave much choice but to use "loose" git objects in order to preserve content-addressability. That means, however, that we lose the main optimisation which makes git ~~fast~~usable for interactive use: packfiles.

    Note that this is not specific to IPFS: "feed" based systems like Dat and SSB have the same issue (GitTorrent does a lot better, but being tied to the BitTorrent network has its own drawbacks).
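To make the content-addressability point concrete, here is a minimal Python sketch of how git derives the id of a single loose object, and how a peer could check an object it received against that id. The function names are made up for this illustration and are not part of Radicle or IPFS; only the hashing scheme (SHA-1 over a `<kind> <size>\0` header plus the content, stored zlib-compressed) is git's own.

```python
import hashlib
import zlib


def git_object_id(kind: str, payload: bytes) -> str:
    """Return the SHA-1 id git assigns to an object of the given kind."""
    header = f"{kind} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def loose_object_bytes(kind: str, payload: bytes) -> bytes:
    """Return the zlib-compressed bytes git stores for a loose object."""
    header = f"{kind} {len(payload)}".encode() + b"\x00"
    return zlib.compress(header + payload)


def verify_loose_object(expected_id: str, stored: bytes) -> bool:
    """Check a loose object received from an untrusted peer against its id."""
    return hashlib.sha1(zlib.decompress(stored)).hexdigest() == expected_id


blob = b"hello world\n"
oid = git_object_id("blob", blob)
stored = loose_object_bytes("blob", blob)
print(oid)
print(verify_loose_object(oid, stored))
```

Each loose object can be fetched and verified on its own, which is what a content-addressed store needs. A packfile bundles many objects, often as deltas against one another, so a storage layer that addresses data by object id cannot hand out a packfile as-is.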

If we want a usable system for code collaboration, the last point is crucial. In our current approach, the source code is a bit of a 'second-class citizen': there are no strong guarantees that the repo is available, and if you're working with a large history you're better off hosting your code elsewhere. We've been embedding patches (as per git format-patch) in the Radicle state machine to be able to talk about contributions without the notion of a repo in Radicle itself. This works in practice (just as well as exchanging patches over E-mail), but the ergonomics quickly become questionable if you consider non-linear histories, and workflows where the submitter builds on top of patch series not yet applied to the published base. Also, we can't recover the repo from the Radicle state machine.

So what now?

The most obvious thought when reflecting on the above is: why not turn the system around and use git itself to distribute data? (It is designed to be used in a distributed fashion, isn't it?) Storing collaboration artifacts (issues, pull requests, review comments, ...) in git has also been done before, and the data structure of a commit history satisfies all the storage needs of a state machine log. We will need to build our own networking layer, but based on our experience with IPFS, we think this is exactly the part we need more control over.
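As a rough illustration of why a commit history is enough to store a state machine log, the Python sketch below keeps a hash-linked chain of entries, each pointing at its parent the way a commit does, and derives the current state of an issue by replaying the chain from the root. Everything here (`Entry`, `append`, `replay`, the operation types) is hypothetical and only models the idea; it is not the protocol we are drafting.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    parent: Optional[str]  # id of the previous entry, like a commit's parent
    author: str
    op: dict               # hypothetical operation payload

    @property
    def id(self) -> str:
        # The id covers the parent id, so it commits to the whole history.
        body = json.dumps(
            {"parent": self.parent, "author": self.author, "op": self.op},
            sort_keys=True,
        )
        return hashlib.sha1(body.encode()).hexdigest()


def append(log: Dict[str, Entry], head: Optional[str], author: str, op: dict) -> str:
    """Add an entry on top of head and return the new head id."""
    entry = Entry(head, author, op)
    log[entry.id] = entry
    return entry.id


def history(log: Dict[str, Entry], head: Optional[str]) -> List[Entry]:
    """Follow parent links from head to the root; return entries oldest first."""
    entries = []
    while head is not None:
        entry = log[head]
        entries.append(entry)
        head = entry.parent
    entries.reverse()
    return entries


def replay(log: Dict[str, Entry], head: Optional[str]) -> dict:
    """Fold the operations into the current state of one issue."""
    state = {"title": None, "open": False, "comments": []}
    for entry in history(log, head):
        op = entry.op
        if op["type"] == "open":
            state["title"] = op["title"]
            state["open"] = True
        elif op["type"] == "comment":
            state["comments"].append((entry.author, op["body"]))
        elif op["type"] == "close":
            state["open"] = False
    return state


log: Dict[str, Entry] = {}
head = append(log, None, "alice", {"type": "open", "title": "Example issue"})
head = append(log, head, "bob", {"type": "comment", "body": "Looks good"})
head = append(log, head, "alice", {"type": "close"})
print(replay(log, head))
```

The sketch only covers a linear history; concurrent edits would produce entries with the same parent, which is where merge semantics would have to be defined.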

We’re still working on figuring out the details of this approach and plan to share a first draft of a protocol spec soon. Over the coming months, we’ll be focusing on implementing this lower level of the stack.

As for the code we already have, we want to:

  • Release radicle-lang separately

    We think that the "pure core" can and should be its own thing. The language may evolve on its own, and although it isn't going to be our main focus for the time being, a deterministic language with controllable effects is a good asset to have -- being "hackable" has always been a nice feature of Radicle which we don't want to give up on.

    Work has already gone into this (#684), but there's probably more to do before it can "graduate" into its own project. For example, we haven't made up our minds yet about what to do with the existing rad code in the absence of a native packaging system. If you would like to help out with this, please reach out!

  • EOL the IPFS backend

    This would mainly mean that we shut down the seed nodes for our own IPFS network, as we don't have the bandwidth to properly operate them.

    We will be moving away from IPFS, and most likely also from the client-server architecture (which is why there hasn't been much progress on #670 for quite a while). We would be open to transitioning this stack to the community, if there is interest. Sunsetting the seed nodes is planned to happen once we have something new to check out.

We hope this gives you an idea of where we’re at. Please let us know of any questions or concerns, here or in #radicle on freenode.

kim, Oct 31 '19 15:10

IPFS, Dat, SSB all have immutable append-only hash-based approaches, but git already does this better than most of them.

But this approach isn't the best for managing state.

State needs to be mutable, and fast.

Can I ask why you never considered GUN? We're faster than many centralized databases, yet completely P2P/decentralized and trustless (cryptographically secure).

amark, Jan 17 '20 19:01

Hi @amark, I'm aware of GUN, it's a cool project!

From our own experience, and that of others, we learned that replacing the storage layer of a distributed version control system (DVCS) in order to form a peer-to-peer network yields an inferior experience for interactive use. If you look at a few different DVCSs, you'll notice that they all make very strong assumptions about both their distribution model and their storage model. Conflating the two into a storage layer which comes with its own network protocol leaves you with a design which essentially requires you to invent your own DVCS and implement a compatible mapping layer to the existing ones -- this is clearly not a scalable approach, and the merits of doing so are questionable.

What we're doing instead happens at a lower layer: essentially, we're replacing only the transport (and discovery) and tunnelling the native protocols over it. This will allow us to take advantage of a lot of the optimisations that git in particular has, and -- hopefully -- to add support for other DVCSs down the road.
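A small Python sketch of what "tunnelling" means here: the transport only moves bytes and knows nothing about git, while both ends keep speaking git's own framing. The pkt-line format (four hex digits giving the total length, then the data, with `0000` as a flush packet) is git's; `opaque_transport` is a hypothetical stand-in for whatever the new networking layer ends up being, and the decoder handles only data and flush packets.

```python
from typing import Iterable, Iterator, List, Optional

FLUSH = b"0000"  # pkt-line flush packet: marks the end of a message


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a git pkt-line: 4 hex digits of total length, then data."""
    return f"{len(payload) + 4:04x}".encode() + payload


def opaque_transport(stream: bytes, chunk_size: int) -> Iterator[bytes]:
    """Hypothetical transport: delivers the bytes in arbitrary chunks, unchanged."""
    for i in range(0, len(stream), chunk_size):
        yield stream[i:i + chunk_size]


def read_pkt_lines(chunks: Iterable[bytes]) -> List[Optional[bytes]]:
    """Reassemble pkt-lines from chunks; None stands for a flush packet."""
    buf = b""
    out: List[Optional[bytes]] = []
    for chunk in chunks:
        buf += chunk
        while len(buf) >= 4:
            length = int(buf[:4].decode(), 16)
            if length == 0:
                out.append(None)
                buf = buf[4:]
                continue
            if length < 4:
                raise ValueError("special packets are not handled in this sketch")
            if len(buf) < length:
                break  # wait for the rest of this packet
            out.append(buf[4:length])
            buf = buf[length:]
    return out


stream = pkt_line(b"a\n") + pkt_line(b"foobar\n") + FLUSH
print(read_pkt_lines(opaque_transport(stream, chunk_size=3)))
```

Because the framing survives any chunking, the receiving side can hand the bytes to an unmodified git process, which is what keeps optimisations such as packfiles available.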

That being said, there are still valid use cases for hosting repos on any of those distributed storage systems, including GUN, which can be simplified to: wherever you would be benefitting from a CDN, peer-to-peer storage is becoming a serious alternative.

Hope this answers your question. Happy to continue the conversation, here or over at our Discourse https://radicle.community

kim, Jan 21 '20 09:01