stacks-core icon indicating copy to clipboard operation
stacks-core copied to clipboard

[Do not merge] Feat/app chain mvp

Open jcnelson opened this issue 2 years ago • 10 comments

This PR implements initial working support for appchains. I have a write-up here on how to test a live deployment on testnet here: https://gist.github.com/jcnelson/c982e52075337ba75e00b79942164e31

What's an Appchain?

Appchains are instances of the Stacks blockchain that use another Stacks blockchain as their burnchain. You can "stack" appchains on appchains on appchains on appchains on ... on Stacks on Bitcoin, and Bitcoin will in effect commit to all of their respective chain histories in each block.

Appchains are one of a few different scalability approaches to Stacks. For developers who want to build an app that needs a lot of transaction bandwidth, needs its own token, and doesn't need to interact with many other apps' smart contracts, appchains may be a better solution than trying to cram all of their code and transactions into the main Stacks chain.

How It Works

Mining blocks in an appchain is a matter of mocking just enough of Bitcoin inside a "mining contract" so that we can treat a (list ) of (buff 80)'s (representing OP_RETURN payloads) and some other metadata as a "block," whose existence and value can be fetched and authenticated via a single RPC call and MARF proof verification. To this end, "blocks" are lists of operations stored in a specific data map within the mining contract (grouped by Stacks block height), whose type signature is standardized across all appchains. The mining contract public functions serve to append new payloads to this data map. A live contract can be found here: https://explorer.stacks.co/txid/ST17ABWV76GQCGWKFQR4G5N23HDXGZ3D3869A8Z5N.appchains-mvp-v1?chain=testnet

In addition, some runtime appchain characteristics are determined by the mining contract. Specifically, the mining contract defines:

  • The PoX constants for the appchain
  • A list of extra boot contracts to add to the boot block, in addition to .pox, .costs, .bns, and so on (which all come built-in).
  • The block limits for the appchain.
  • The chain ID for the appchain (which goes into both its p2p messages as well as its transactions, thereby making them non-replayable in other networks).

When an appchain boots up, it will first synchronize the host chain headers so it can fetch the data var that contains the appchain's runtime configuration. Once it has this, it will ask an appchain bootstrap node (listed in the runtime configuration) for the authenticated contract source for each boot contract listed in the configuration. Once obtained, it will execute its boot block and compare the resulting MARF index root to a config-given genesis boot block hash, and if it matches, will proceed with sync'ing the rest of the chain.

There are three major parts to this PR:

  • A burnchain driver for Stacks, which implements the BurnchainIndexer trait
  • A couple of extra RPC endpoints
  • A burnchain mining controller for the node that lets it mine blocks by sending contract-call?s to the mining contract.

The Burnchain Driver

This lives in src/burnchains/stacks. It implements a headers DB for Stacks blocks, as well as an RPC client for fetching and authenticating (via a MARF proof) data maps, data vars, and contract source from another Stacks chain. From there, it provides methods for downloading the canonical Stacks chain's headers and fetching block data out of the mining contract. It implements the BurnchainIndexer trait, so you can pass it into Burnchain::sync() and everything will work out just fine.

Extra RPC endpoints

This PR adds the following two API endpoints:

  • /v2/headers/{number-of-ancestors}?tip={indexed-block-hash}: This endpoint allows the caller to query up to 2100 ancestor headers of the given tip= query parameter value. This API endpoint is used by the burnchain driver to quickly fetch all headers on the canonical Stacks fork, and search for reorgs.

  • /v2/data_var/{address}/{contract_name]/{var_name}: This endpoint allows the caller to query the value of a contract's data var, as well as a MARF proof of its value.

The Burnchain Miner

This PR additionally adds testnet/stacks-node/src/burnchains/stacks_controller.rs, which implements the nuts and bolts of mining appchain blocks by sending contract-call? transactions to register VRF keys and send block-commits. In addition, this PR adds the relevant logic to "boot up" the appchain and obtain a Config structure with the mining contract's parameters filled in.


On top of these three features, this PR carries out the necessary refactoring to make the codebase less dependent on Bitcoin-isms to the point where it can index a Stacks chain instead without changing any of the internal burnchain or sortition logic. When the choice was available, I chose to emulate Bitcoin transaction structures in a bid to preserve compatibility.

jcnelson avatar Sep 27 '21 08:09 jcnelson

My strong preference would be for app chains to be maintained as a fork of this repo -- app chains should be orthogonal to the operation of normal Stacks nodes, and I am very wary of including code for supporting and running them in the main line of the stacks node (much of which impacts consensus critical portions of the codebase).

kantai avatar Sep 27 '21 14:09 kantai

I am very wary of including code for supporting and running them in the main line of the stacks node (much of which impacts consensus critical portions of the codebase).

I don't think this is true -- I went out of my way to not touch consensus-critical portions of the codebase. There have been no behavioral changes to the sortition DB, the burnchain DB, the burn operation processing, or burnchain transaction and block representations -- in fact, the Stacks burnchain indexer and appchain mining contract specifications both go out of their way to avoid the need to make any changes. The chainstate DB only gained the ability to stream Stacks block headers and query the genesis block MARF state root, but no functional changes were made beyond this to any existing methods. Both the Burnchain and StacksChainState structures have gained the ability to instantiate appchains via separate constructors, but their behaviors have remained the same with respect to the Stacks main chain.

I think the most hair-raising change to the Stacks chainstate is that now the post-flight callback in the boot code can return a list of transaction receipts, but this isn't consensus-critical. Appchains have a bespoke post-flight callback that instantiates their boot code smart contracts, but that's it -- it doesn't affect the Stacks main chain at all.

jcnelson avatar Sep 27 '21 16:09 jcnelson

I've separated out some of this PR into this other non-appchain-specific PR: https://github.com/blockstack/stacks-blockchain/pull/2862

jcnelson avatar Oct 01 '21 20:10 jcnelson

I have been asked why app chains code should live in the Stacks blockchain repo, as opposed to factored out as either a fork of the Stacks blockchain, or via some kind of library. Here is my rationale:

Stacks is, and has always been, a portable system

The SIPs that describe the design and implementation of the Stacks blockchain do not mandate that it is built on Bitcoin specifically. In fact, they are carefully worded to say "burnchain" and "burn tokens" to communicate that the chain is not married to a particular existing blockchain. The reasons for this are that (1) there is no guarantee that Bitcoin will remain the best chain to build on forever, and we need to account for that eventuality, and (2) there is always an opportunity to port Stacks to new chains with different security and performance characteristics.

The code today reflects this in three ways. First, it deliberately decouples the logic for processing a burnchain's transactions from any particular burnchain implementation by way of a BurnchainIndexer trait, so that multiple implementations of this trait may eventually exist in order to "port" Stacks to different chains. Second, it deliberately decouples the logic for mining blocks on a burnchain via the BurnchainController trait, so that the node can mine blocks on different ports. Third, it treats burnchain-specific data types as enums, with explicit and long-standing documentation stating that if new ports are created, new enum members would be added.

To me, this means that there's no design-level excuse for denying a PR that ports Stacks to a new chain. Making ports was always in the cards, and always a direction we were prepared to take the Stacks chain in. Porting Stacks is like adding drivers to a kernel -- the driver code lives in-tree once stable, and the bulk of the codebase is decoupled from driver implementations via well-defined interfaces composing a "burnchain abstraction layer" (our equivalent to a kernel's HAL). If we feel uncomfortable about a PR that adds a port because "much of [it] impacts consensus critical portions of the codebase" (a point I disagree with, but am more than willing to accommodate), then the real issue at hand is that the HAL separating burnchains from Stacks proper is too thin or leaky. If we fix that, then ports should be a lot less daunting.

That said, this PR treats appchains as a port -- it ports Stacks to itself. It creates an alternative BurnchainIndexer trait for pulling burnchain operations out of a Stacks smart contract; it creates a BurnchainController implementation for sending block-commit and vrf-key-register transactions; it adds burnchain-specific type variants to the otherwise generic burnchain enums for transactions and blocks.

Now that PR #2862 has been separated out, it's fair to say that the only thing this PR does in addition to carrying out the port is to try and make the driver interfaces less leaky. For example, a lot of changes to testnet/stacks-node are there precisely because the code was too tightly-coupled to the Bitcoin port.

I'm happy to spend some additional cycles making the HAL better before merging appchains. But before doing that, I want to make sure we're on the same page with respect to how we are going to treat ports going forward.

Ports should be in-tree

All appchains are instances of the Stacks blockchain; there's nothing special about the Bitcoin port of Stacks. All ports behave exactly the same way, modulo a handful of boot-time-configured protocol constants and burnchain-specific drivers. They all use the Clarity VM, they all have the same wire formats, they all have the same chainstate structure, they all have the same peer network design, they all have the same block resource rate-limiting, and they all have the same mining and validation algorithms (up to boot-time-configured protocol constants). Improvements made to this common infrastructure would benefit the Stacks chain proper as well as all ports (including appchains), which to me motivates keeping ports in-tree.

This argument for this is, again, basically the same as the argument as to why (open source) OS kernels tend to have most of their device drivers in-tree. One does not fork the whole OS kernel and maintain it as a separate project just to support some one-off piece of hardware -- the sheer maintenance burden for the driver author of having to maintain a whole OS kernel alongside it would be so severe as to prevent driver development altogether.

Treating Stacks ports as wholly-separate forks of Stacks is, to me, essentially asking for asking for fragmentation. If the only difference between ports is their burnchain interfacing code, then treating ports as forks is a lot like the act of treating new device drivers as kernel forks. The burnchain interfacing code is a lot smaller and more self-contained than a general patch against Stacks (which should motivate merging it), and its authors should not expected to maintain a full Stacks fork on their own just so they can store burnchain operations on some other chain.

Port Maintenance

That said, if we can be confident that our "HAL" for burnchains is sufficiently thick that it isolates new burnchain drivers from the rest of the codebase, I think we can start treating support for burnchains in a tiered way. We can say that we definitely support the Bitcoin port and maybe the Stacks port in a "Tier 1 status" manner, and we can accept and support 3rd party burnchain drivers in a "Tier 2 status" manner. In the latter case, we'd hold onto the code but be up-front that the core developers don't directly maintain it (much like how Linus Torvalds does not personally maintain each and every device driver in Linux). If other people are willing to put in the maintenance work for ports, then we should be happy to have them.


That all said, if I took the time to build out a "HAL" for accepting new burnchain forks and made it sufficiently thick that we wouldn't feel iffy about adding new burnchain forks, would keeping the Stacks port (i.e. appchains and maybe subnets) in-tree be sufficiently palatable?

jcnelson avatar Oct 01 '21 21:10 jcnelson

Ports should be in-tree

All appchains are instances of the Stacks blockchain; there's nothing special about the Bitcoin port of Stacks. All ports behave exactly the same way, modulo a handful of boot-time-configured protocol constants and burnchain-specific drivers. They all use the Clarity VM, they all have the same wire formats, they all have the same chainstate structure, they all have the same peer network design, they all have the same block resource rate-limiting, and they all have the same mining and validation algorithms (up to boot-time-configured protocol constants). Improvements made to this common infrastructure would benefit the Stacks chain proper as well as all ports (including appchains), which to me motivates keeping ports in-tree.

I disagree with this on several levels -- first, there is a "privileged" version of the Stacks blockchain as far as the codebase is concerned. There's only one genesis block data in this repo, for example, and I do not believe that it would make sense to admit alternative genesis blocks to the codebase even if they were just configurable by nodes. Secondly, it is only the case that all ports behave "the exact same way" as you describe when implemented as such. Nothing about the design of app chains necessarily requires that it use the same peering protocols, rate-limiting, etc. But fixing the design to the same repo will always encourage those designs.

Treating Stacks ports as wholly-separate forks of Stacks is, to me, essentially asking for asking for fragmentation. If the only difference between ports is their burnchain interfacing code, then treating ports as forks is a lot like the act of treating new device drivers as kernel forks. The burnchain interfacing code is a lot smaller and more self-contained than a general patch against Stacks (which should motivate merging it), and its authors should not expected to maintain a full Stacks fork on their own just so they can store burnchain operations on some other chain.

What's the maintenance overhead of managing an app-chain fork? Merging releases or develop back into the branch? The alternative is putting the burden of that maintenance on the supporters of the stacks-blockchain repo. This repo (correctly) has a very high bar for code reviews and testing, which may not make sense for app chains. The consequence of putting this all in the same repository is:

  1. Increased review burden for changes to app chain code: more reviewers are involved, with potentially more opinions (e.g., this PR, right now). This unavoidably slows development of app chains for the reasons mentioned above: this repository correctly has a high bar for reviews.
  2. Linking release processes of app chain code to stacks-blockchain releases. There's probably complex release management strategies that could manage to release them independently, but that would add further complexity to the release process. Without that, any changes to app chain code require new releases of the stacks-node.
  3. Increased maintenance burden for stacks-blockchain repository maintainers. To me, this is the biggest problem. Changes to the app chain codebase will require maintainers of the stacks-blockchain repository to review. Changes to the stacks-blockchain codebase that may impact the app chain changes would place additional development burdens on those changes. By putting them all in the same source control tree, these decisions are forced upon anyone who contributes the stacks-blockchain codebase or reviews PRs.
  4. Increased testing burden for the stacks-blockchain repository -- app chain test cases would now be included in the stacks-blockchain test suite, further lengthening the integration test time, PR review turnaround, and development time.

Port Maintenance

That said, if we can be confident that our "HAL" for burnchains is sufficiently thick that it isolates new burnchain drivers from the rest of the codebase, I think we can start treating support for burnchains in a tiered way. We can say that we definitely support the Bitcoin port and maybe the Stacks port in a "Tier 1 status" manner, and we can accept and support 3rd party burnchain drivers in a "Tier 2 status" manner. In the latter case, we'd hold onto the code but be up-front that the core developers don't directly maintain it (much like how Linus Torvalds does not personally maintain each and every device driver in Linux). If other people are willing to put in the maintenance work for ports, then we should be happy to have them.

That all said, if I took the time to build out a "HAL" for accepting new burnchain forks and made it sufficiently thick that we wouldn't feel iffy about adding new burnchain forks, would keeping the Stacks port (i.e. appchains and maybe subnets) in-tree be sufficiently palatable?

I'm not sure that any code changes would address my concerns above -- the problem isn't really a technical one (though aspects of it are: if burnchain modules could be guaranteed to not do anything malicious, not be compiled in the vast majority of builds, not do anything nefarious in our github actions, that would help us say "if you build a burnchain module, we'll just merge it without review), it's a question of maintenance and review burden. When making your argument about why this needs to be in the main source tree, the vast majority of the argument is that it makes maintenance easier -- but that's only true in that it makes maintenance of the app chain code easier by distributing that maintenance burden on the stacks-blockchain repo.

kantai avatar Oct 04 '21 18:10 kantai

I disagree with this on several levels -- first, there is a "privileged" version of the Stacks blockchain as far as the codebase is concerned. There's only one genesis block data in this repo, for example, and I do not believe that it would make sense to admit alternative genesis blocks to the codebase even if they were just configurable by nodes. Secondly, it is only the case that all ports behave "the exact same way" as you describe when implemented as such. Nothing about the design of app chains necessarily requires that it use the same peering protocols, rate-limiting, etc. But fixing the design to the same repo will always encourage those designs.

I think you may not have fully understood what I'm proposing for appchains. The Rust portion of each appchain's codebase is identical. Appchains would only be distinguished by a set of Clarity contracts that fully describe their chain-specific behavior. And, this repo would never, ever be responsible for hosting and testing these appchain-specific contracts -- that's very much not our problem.

To address your first point, you'll be happy to know that appchains do not get their own genesis blocks in the same way the Stacks blockchain does. In fact, they're designed to operate without them, for the very reason you point out -- we should not be in the business of having to host them or audit them or be responsible for them in any way. Specifically:

  • An appchain can have up to 128 initial token balances, which are embedded within its mining contract (so, they're not hosted here). The exact number 128 is up for negotiation, but they're deliberately represented in a smart contract on the Stacks chain in order for them to be (1) public, (2) indelible, and (3) not our problem.

  • An appchain can have its own boot code, which executes after the token balances are instantiated. In particular, the boot code can further set up initial token allocations.

    Appchain boot code contracts are not hosted in this repo either. Instead, the mining contract lists the names of the appchain-specific boot code contracts in the order in which they must be run. When bootstrapping, an appchain peer contacts an existing, running appchian node (whose IP address and ports are obtained from the mining contract) to download and run the extra boot code when it boots up. The mining contract also contains the final MARF state root of the appchain's genesis block, so the bootstrapping peer can validate the boot code's execution. Again, the purpose of this scheme is to make it so that we don't have to know or care about appchain-specific boot code -- it's entirely the responsibility of the people setting up the appchain to take care of writing and maintaining the boot code themselves (as described here). All this PR does on this front is enable this automatic bootup process to happen (i.e. this is handled in the bootup() function in src/burnchains/stacks/client.rs).

To address your second point, "fixing the design to the same repo will always encourage those designs" is deliberate and intentional. Appchains are meant to behave the exact same way as the Stacks blockchain -- they're meant to be a familiar system, and are meant to do things like interface with appchain-specific deployments of the Stacks explorer, the Stacks wallet, Stacks applications, stacks.js, and so on. Moreover, appchains are meant to run mining contracts for other appchains, so making it so all appchains behave like the Stacks blockchain is actually crucial to making this a viable scaling solution. Appchains have always been meant to be instances of the Stacks blockchain -- all appchains are merely deployments of the same port of the Stacks blockchain (i.e. the port of Stacks to itself as a burnchain).

Now, if you wanted an appchain with different mining rules, such as "no forks allowed" or "miners must produce blocks in a pre-determined order," this would be enforced by the mining contract, not the appchain codebase. Moreover, towards the bottom of my appchain explainer, I talk about how appchains would further have a special "consensus" boot code contract (distributed with the other appchain boot code contracts, btw), which lets the appchain developer specify rules in Clarity to (a) further constrain which blocks and transactions are valid on this chain, and (b) determine the token mint schedule. For example, an appchain might require miners to commit to additional appchain-specific data in their coinbases, or might require that no further smart contracts are instantiated beyond those in the boot code, or require a quorum of miners to sign off on each block, etc. To make this work, all that would need to happen in this repo is that we'd have the block- and transaction-processing logic (1) see if there's a "consensus" boot code contract, and (2) call into it to apply the additional validation rules (or get the number of tokens to mint) by passing in the block header or transaction as Clarity data to the contract and inspecting the return value (kind of like how we call into Clarity cost functions when evaluating transactions). This wouldn't affect Stacks at all, since there's no "consensus" boot code contract there, but it would make it possible for appchains to ship their own extra consensus rules (i.e. soft-fork rules) and token schedules without having to touch the Stacks blockchain code.

My point is, this PR deliberately avoids the need for us to host appchain-specific genesis data and code in this repo. Instead, it makes it so that appchains can automatically boot up with their on genesis data, genesis code, and (eventually) token schedules and consensus rules, given an initial appchain peer that the appchain deployer sets up. I very much agree with you that this repo shouldn't be in the business of hosting appchain-specific genesis data and code.

But that still leaves the question of where to host ports of the Stacks blockchain, bearing in mind that all appchains are instances of the same port. I'll get to that below.

What's the maintenance overhead of managing an app-chain fork? Merging releases or develop back into the branch? The alternative is putting the burden of that maintenance on the supporters of the stacks-blockchain repo. This repo (correctly) has a very high bar for code reviews and testing, which may not make sense for app chains.

I think we'd have the same maintenance burden either way, since we're still maintaining all the code that isn't appchain-specific. In my proposal, that's almost all of it. If we did it as you suggest, I strongly suspect that appchain developers would still open issues and PRs against the Stacks blockchain repo if there was a bug in the non-appchain-specific code, since it would likely mean that the bug is present in all appchain deployments as well.

To address your specific items:

Increased review burden for changes to app chain code

I don't think this will happen, since per the above design, we're not responsible for the appchain-specific code. That all gets distributed as Clarity code -- either in the mining contract or in the boot code -- and it gets written and maintained outside of this repository. We are, however, responsible for ports of the Stacks blockchain (i.e. to Bitcoin and to Stacks itself). But, this is much smaller and much less opinionated than the diverse body of appchains -- i.e. there is only one Stacks port.

Linking release processes of app chain code to stacks-blockchain releases

This is not a problem with my design. By abstracting the functional differences between appchains into Clarity code that the appchain deployers write and distribute on their own, all we'd be responsible for is the mechanism of deployment (i.e. the port), not the deployments themselves.

Increased maintenance burden for stacks-blockchain repository maintainers. To me, this is the biggest problem.

I agree that hosting two ports of the Stacks blockchain instead of one will increase our maintenance burden. However, we'd only be maintaining the ports, and their implementations would be tied to one or more SIPs that describe how exactly the Stacks blockchain interfaces with these underlying burnchains. I am happy to take point on any reviews on PRs that affect the Stacks port (but note that in my proposal, there is only one port here, no matter how many appchains, since the functional differences between appchains are meant to be entirely encapsulated within Clarity code distributed to them out-of-band).

Increased testing burden for the stacks-blockchain repository

We'd only need to test the Stacks port. Each appchain developer is responsible for independently testing their appchain-specific Clarity code.

if burnchain modules could be guaranteed to not do anything malicious, not be compiled in the vast majority of builds, not do anything nefarious in our github actions, that would help us say "if you build a burnchain module, we'll just merge it without review),

Sure, I'm fine with that. Let it be the case that the burnchain-specific code paths in ports that we don't give Tier-1 status will not compile by default and will not run in our test suite. But remember, in this proposal, I'm only adding a single port, and it covers all the appchains.

When making your argument about why this needs to be in the main source tree, the vast majority of the argument is that it makes maintenance easier -- but that's only true in that it makes maintenance of the app chain code easier by distributing that maintenance burden on the stacks-blockchain repo.

If the upside to maintaining a single Stacks port is that we now can support ~4.1 billion appchains (i.e. they're identified by a 32-bit chain ID field in the transaction and p2p message wire formats), then that's a cost I will gladly bear myself. Again, I am not, and have never been asking that we maintain appchain-specific code or state.

All I'm asking is that we maintain a port of Stacks to itself, and make a couple aspects of the block-processing logic scriptable via a "consensus" boot contract. That's it. These two changes should be enough to give developers plenty of expressive power to create and deploy their own appchains with wildly different behaviors, all without any additional changes to this codebase.

jcnelson avatar Oct 04 '21 21:10 jcnelson

What's the maintenance overhead of managing an app-chain fork? Merging releases or develop back into the branch? The alternative is putting the burden of that maintenance on the supporters of the stacks-blockchain repo. This repo (correctly) has a very high bar for code reviews and testing, which may not make sense for app chains.

I think we'd have the same maintenance burden either way, since we're still maintaining all the code that isn't appchain-specific. In my proposal, that's almost all of it. If we did it as you suggest, I strongly suspect that appchain developers would still open issues and PRs against the Stacks blockchain repo if there was a bug in the non-appchain-specific code, since it would likely mean that the bug is present in all appchain deployments as well.

This PR is, by itself, 10,000 lines of changes, and I wouldn't imagine this is the end of app-chains specific changes over time. Nothing would prevent you from managing a stacks-appchains repo that hosts this code. Separating it from this repo makes the review process explicitly different than the review process of the stacks-blockchain repo, and would enable you to maintain the Stacks blockchain-on-Stacks port independently of this repo. You would not need reviewers from this repository.

Increased maintenance burden for stacks-blockchain repository maintainers. To me, this is the biggest problem.

I agree that hosting two ports of the Stacks blockchain instead of one will increase our maintenance burden. However, we'd only be maintaining the ports, and their implementations would be tied to one or more SIPs that describe how exactly the Stacks blockchain interfaces with these underlying burnchains. I am happy to take point on any reviews on PRs that affect the Stacks port (but note that in my proposal, there is only one port here, no matter how many appchains, since the functional differences between appchains are meant to be entirely encapsulated within Clarity code distributed to them out-of-band).

This sounds okay, but it would require explicitly modifying the normal review process for this repository. This repository requires 2 reviewers on each PR. Additionally, it is very likely that you will author app chains PRs (like this one), and you can't review your own PRs. If there's going to be different review and maintenance processes for apps chains code, then having a separate repo makes it very clear that it is the case.

I'm still just not sure what the downside is of having a separate repository. The argument for a separate repository is that the stacks-blockchain repository is maintained and contributed to by multiple network participants, and including code in this repository explicitly expands the scope of that maintenance. A separate repository makes it very clear where maintenance responsibilities lie, and who needs to be involved in reviews. The downside of having a separate repository, from what I can tell, is that it would need to merge changes back from the main repository, which doesn't seem like a high price to pay.

kantai avatar Oct 05 '21 14:10 kantai

This PR is, by itself, 10,000 lines of changes, and I wouldn't imagine this is the end of app-chains specific changes over time.

Today, the Stacks port -- i.e. the burnchain indexer for Stacks and the miner -- is "only" 6300 lines (including unit tests, which are over 2000 lines). It's actually smaller than the Bitcoin port, even when src/deps/bitcoin isn't counted. Another 1600 lines are adding /v2/headers and /v2/data_var and respective unit tests (now in this PR: https://github.com/blockstack/stacks-blockchain/pull/2862). The remainder is fleshing out the burnchain HAL so the system can run on either Bitcoin or Stacks without being married to either. Turns out, there were more than a few places in the Stacks node code that assumed that we would only be talking to Bitcoin.

Nothing would prevent you from managing a stacks-appchains repo that hosts this code. Separating it from this repo makes the review process explicitly different than the review process of the stacks-blockchain repo, and would enable you to maintain the Stacks blockchain-on-Stacks port independently of this repo. You would not need reviewers from this repository.

I'm not asking for any changes to the code review process. However, I'm happy to temporarily maintain a fork of the Stacks blockchain with the appchain features until they're stable, and there are one or more ratified SIPs that describe their design and implementation. But I do not think it benefits the ecosystem to have to deal with two (or more) permanent Stacks blockchain repositories that only differ in which burnchain they support. To me, this sounds like creating multiple Linux kernel repositories that differ only in which CPU architectures they support. You really do not see any path by which the Stacks port can get upstreamed?

This sounds okay, but it would require explicitly modifying the normal review process for this repository. This repository requires 2 reviewers on each PR. Additionally, it is very likely that you will author app chains PRs (like this one), and you can't review your own PRs. If there's going to be different review and maintenance processes for apps chains code, then having a separate repo makes it very clear that it is the case.

I'm not proposing that we change anything about our review process. A port of Stacks to a new burnchain is a feature PR, just like any other feature PR. I was saying here that if you're worried about the burden of having to deal with PRs sent by other people that touched this code, then I'm happy to take point on reviewing them, simply because I'm the primary author of this part of the codebase (kind of like how you'd take point on reviewing nontrivial features added to Clarity).

The downside of having a separate repository, from what I can tell, is that it would need to merge changes back from the main repository, which doesn't seem like a high price to pay.

This goes both ways. I could make improvements to appchain-agnostic aspects of the appchain-ified Stacks codebase that would then need to be pulled back into the Stacks blockchain. In fact, this has already happened, since the solution to https://github.com/blockstack/stacks-blockchain/issues/2826 wasn't apparent until after I'd deployed and tested the first appchain.

jcnelson avatar Oct 05 '21 15:10 jcnelson

Spoke with @kantai about this some more. The main objection to merging appchains into this repository -- which I now agree with -- is that this repository already tries to do too much. In the medium/long term, this repository needs to be broken up. Specifically, the factoring we're aiming for is as follows:

  • A repository and crate for just the Clarity VM, as a library
  • A repository and crate for the Stacks blockchain "core" library, which depends on the Clarity crate
  • A repository and crate for the Stacks blockchain port to Bitcoin, whose build artifacts include the Stacks genesis block and stacks-node binary, and which depends on the above two crates
  • A repository for Stacks blockchain tooling, such as the contents of tools/, bns-test/, etc.
  • A repository and crate for the Stacks blockchain port to Stacks, whose build artifacts include the initial appchain tooling

We kinda-sorta treat the Stacks "core" library separately from the node already, so splitting out the Bitcoin port shouldn't be too big of a job. The biggest task here I think is in figuring out how much of the node functionality can be moved to the "core" library, since it'll likely be reusable across ports.

With this factoring, I think the roadmap for appchains is as follows:

  • [ ] I will finish the remaining technical work on appchains (e.g. support for the consensus boot code, etc.)
  • [ ] I will write up a SIP for appchain support, and try and get it through the SIP process
  • [ ] I will work on splitting this repository into a core library and Bitcoin port
  • [ ] I will ship appchains as the Stacks port, and have it depend on the core library

Sound good?

jcnelson avatar Oct 06 '21 13:10 jcnelson

Codecov Report

Merging #2857 (a38ed27) into develop (b94c815) will decrease coverage by 53.93%. The diff coverage is 9.01%.

Impacted file tree graph

@@             Coverage Diff              @@
##           develop    #2857       +/-   ##
============================================
- Coverage    82.80%   28.86%   -53.94%     
============================================
  Files          248      256        +8     
  Lines       196530   202612     +6082     
============================================
- Hits        162734    58494   -104240     
- Misses       33796   144118   +110322     
Impacted Files Coverage Δ
src/blockstack_cli.rs 0.00% <0.00%> (-83.10%) :arrow_down:
src/burnchains/appchain.rs 0.00% <0.00%> (ø)
src/burnchains/bitcoin/blocks.rs 28.76% <0.00%> (-63.77%) :arrow_down:
src/burnchains/stacks/client.rs 0.00% <ø> (ø)
src/burnchains/stacks/db.rs 0.00% <0.00%> (ø)
src/burnchains/stacks/mod.rs 0.00% <0.00%> (ø)
src/chainstate/coordinator/tests.rs 0.00% <0.00%> (-95.57%) :arrow_down:
src/chainstate/stacks/miner.rs 12.43% <0.00%> (-80.66%) :arrow_down:
src/main.rs 0.00% <0.00%> (-0.12%) :arrow_down:
src/net/chat.rs 29.35% <0.00%> (-52.15%) :arrow_down:
... and 254 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update f6c78b0...a38ed27. Read the comment docs.

codecov[bot] avatar Jan 14 '22 07:01 codecov[bot]

This needs to be a separate repository.

jcnelson avatar Feb 07 '23 16:02 jcnelson