
Run monorepo test suite against state resulting from proposal

mds1 opened this issue 1 year ago

This is discussed in https://github.com/ethereum-optimism/superchain-ops/issues/140#issuecomment-2075435988 through the use of forge-proposal-simulator (FPS). How best to do this needs some thought, and we should make sure our approach is forward compatible with batched upgrades (i.e. currently superchain-ops executes one task at a time, but soon we will want to upgrade the L1 contracts of many chains in one transaction—in that case we want to run the monorepo tests for each chain).

This is a big task, and it will likely require changes both to this repo and the monorepo. We should check out how FPS does this, consider using it to avoid rolling our own solution, and maybe @ElliotFriedman will be kind enough to lend some time to help us out here :)

mds1 avatar Oct 09 '24 03:10 mds1

This sounds like a perfect use case for FPS! I don't have context on how you guys structure your repositories or tests. Would you be able to summarize things here so I can make a recommendation on architecture?

ElliotFriedman avatar Oct 11 '24 16:10 ElliotFriedman

Yes definitely! I recall from the last time we chatted about this that a tricky part is that the upgrade transactions are executed in this repo, but our test suite is in the monorepo (https://github.com/ethereum-optimism/optimism/tree/develop/packages/contracts-bedrock). However, this repo does already have the monorepo as a dependency, so that might help.

First, some background on the monorepo setup:

  • The core test suite is a standard forge test suite. We also have a few kontrol proofs, and a few differential ffi tests to ensure go and solidity implementations match.
  • We run all tests (including the go ffi tests) with just test, and the proofs are run with just test-kontrol.
  • Only running just test would be a great start, and kontrol can be added later.

Now on this repo, superchain-ops

  • "Tasks" are transactions to broadcast, like contract upgrade transactions or transactions that just change a config value. This is essentially the same as what Maker calls "spells".
  • All tasks live in a directory of tasks/{L1-network}/{taskName}, and contain input.json bundles that define the transaction to be sent, along with a single solidity script that simulates the task based on that JSON bundle.
  • Some tasks are "single" meaning it's a regular safe that needs to send the transaction, like the 5/7 Foundation Safe. Other tasks are "nested" meaning it's a 2/2 Safe that needs to send the transaction (where each signer on that 2/2 is a Safe, specifically the Foundation and the Security Council)
  • The core logic to parse the bundle and simulate the transaction happens using the https://github.com/base-org/contracts/ dependency. We run the simulation in foundry within that dep, and have a postCheck hook that runs a bunch of assertions on the post-simulation state
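For concreteness, an input.json bundle is roughly of this shape (a hedged sketch only — the task name, address, and calldata below are placeholders, not a real task):

```json
{
  "chainId": "1",
  "meta": { "name": "example-task" },
  "transactions": [
    {
      "to": "0x0000000000000000000000000000000000000042",
      "value": "0",
      "data": "0xabcd1234"
    }
  ]
}
```

Each entry in the transactions array is one call the Safe will make; the solidity script in the task directory reads this file to build the simulation.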

I would say it's safe to consider most options on the table. Some examples I thought about:

  • Does changing the shape of the input.json bundles help?
  • Does replacing base-org/contracts with FPS help?
  • Maker executes tests against their proposals like this, by basically defining a static test suite that always runs. This could also be an option for us, i.e. parameterize the key tests in the monorepo, that way we can import them here and run them

mds1 avatar Oct 11 '24 21:10 mds1

Our current thoughts on how to run monorepo forge tests in the superchain ops repo post task execution:

  1. setup anvil mainnet fork
  2. run the tasks that are in non terminal state, applying the changes to the anvil fork
  3. change directories into the monorepo submodule
  4. run the monorepo just test commands against this changed state on the anvil fork (e.g. via a new test-upgrade-against-anvil recipe)
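The four steps above could be sketched as a just recipe along these lines (everything here is illustrative: the recipe name, the lib/optimism path, the script filename, and the env var are assumptions, not the actual implementation):

```just
# Hypothetical recipe; assumes MAINNET_RPC_URL is set and the monorepo
# is vendored at lib/optimism (illustrative path).
test-upgrade-against-anvil task_path:
    #!/usr/bin/env bash
    set -euo pipefail
    # 1. Spin up an anvil fork of mainnet.
    anvil --fork-url "$MAINNET_RPC_URL" --port 8545 &
    ANVIL_PID=$!
    sleep 3
    # 2. Run the non-terminal task's simulation script against the fork,
    #    applying its state changes. "script.s.sol" is a placeholder name.
    forge script {{task_path}}/script.s.sol \
        --rpc-url http://127.0.0.1:8545 --unlocked --broadcast
    # 3. Change directories into the monorepo submodule.
    cd lib/optimism/packages/contracts-bedrock
    # 4. Run the monorepo test suite against the mutated fork state.
    just test
    kill "$ANVIL_PID"
```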

ElliotFriedman avatar Dec 17 '24 18:12 ElliotFriedman

After some consideration, we've landed on one possible solution to deliver this feature.

At a high level, we want to be able to grab the resultant state from the task that ran and then use that state inside the monorepo's tests.

To do this, we can use a combination of foundry cheatcodes. First, we can use dumpState(string calldata pathToStateJson). This produces a JSON file containing the allocs created over the duration of the script, e.g.

{
    "0x5aadfb43ef8daf45dd80f4676345b7676f1d70e3": {
        "nonce": "0x1",
        "balance": "0x0",
        "code": "0x68.......",
        "storage": {
            "0x0000000000000000000000000000000000000000000000000000000000000000": "0x0000000000000000000000000000000000000000000000000000000000000001",
            "0x0000000000000000000000000000000000000000000000000000000000000001": "0x68656c6c6f00000000000000000000000000000000000000000000000000000a"
        }
    }
}

With these allocs, we can then use another foundry cheatcode, namely loadAllocs(string calldata pathToAllocsJson). This loads everything in the JSON file (addresses, code, nonces, balances, state, etc.), e.g.

vm.loadAllocs("../state3.json");
console.log(foo.val()); // would've reverted if loadAllocs had not been invoked.

A note worth mentioning: before we load allocations from the JSON file, we may want to do some processing on it. Since superchain-ops tasks don't deploy contracts, we may be able to remove any allocs that exist only because of the simulation, e.g. the MultisigTask or AddressRegistry contracts.
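That processing step could look something like the sketch below, assuming we know which addresses belong to simulation-only deployments (the address in the set is just the one from the example dump above, used as a stand-in):

```python
import json

# Hypothetical post-processing for a vm.dumpState file: drop accounts that
# only exist because the simulation deployed them (e.g. the MultisigTask
# script contract or the AddressRegistry). The address set is illustrative.
SIMULATION_ONLY = {
    "0x5aadfb43ef8daf45dd80f4676345b7676f1d70e3",
}

def prune_allocs(allocs: dict) -> dict:
    """Return the allocs without the simulation-only accounts."""
    return {
        addr: account
        for addr, account in allocs.items()
        if addr.lower() not in SIMULATION_ONLY
    }

if __name__ == "__main__":
    dumped = {
        "0x5aadfb43ef8daf45dd80f4676345b7676f1d70e3": {"nonce": "0x1", "code": "0x68"},
        "0x00000000000000000000000000000000000000aa": {"nonce": "0x0", "balance": "0x1"},
    }
    # The pruned dict is what we would write back out for vm.loadAllocs.
    print(json.dumps(prune_allocs(dumped), indent=2))
```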


Each test in the monorepo has the following structure: [image]

We will want to integrate our new logic into the above architecture. We don't want our new logic to live within Deploy.s.sol because we don't want to deploy a new chain (proxies, implementation contracts etc). The reason being that all superchain-ops tasks work on existing chain states and in most (if not all) cases, they don't deploy contracts as part of the task (think setting a new gas limit on an already existing SystemConfig contract).

Our new logic will want to leverage ForkLive.s.sol. Currently, ForkLive.s.sol reads addresses for a given network (mainnet/sepolia) from the superchain-registry and stores them so that they can be accessed later. This only grabs addresses for one chain at a time (e.g. base, zora, or op); we may want to update this logic or loop at a higher level over all the chains in the superchain-ops task.

We will need to update many files, not limited to:

  • https://github.com/ethereum-optimism/superchain-ops/blob/main/single.just
  • https://github.com/ethereum-optimism/superchain-ops/blob/main/nested.just
  • https://github.com/ethereum-optimism/superchain-ops/blob/main/.circleci/config.yml
  • https://github.com/ethereum-optimism/optimism/blob/develop/packages/contracts-bedrock/justfile#L72
  • https://github.com/ethereum-optimism/optimism/blob/develop/packages/contracts-bedrock/test/setup/ForkLive.s.sol
  • https://github.com/ethereum-optimism/optimism/blob/develop/packages/contracts-bedrock/test/setup/Setup.sol

blmalone avatar Jan 22 '25 22:01 blmalone

Once we dump the state we will need to do some processing on it. We know this because as part of the MultisigTask scripts, we deploy various contracts. They include the AddressRegistry.sol and the script contract itself. In the actual execution of the task, we know that these contracts are not deployed. Therefore, removing them is important to keep the allocs from the simulation as close to the production allocs as possible.

In order to correctly identify the allocs we need to remove from the dumped state, we want to make sure that the deployed contracts have deterministic addresses. We have a few options to realize this. The leading approach is to use create3 (there is an example of this being used in the monorepo). This gives us a deterministic address for each contract that we deploy during a task. For the script contract itself, it's likely that its address is always the same, derived from a default forge address and nonce.

blmalone avatar Jan 23 '25 19:01 blmalone

https://github.com/ethereum-optimism/superchain-ops/issues/343#issuecomment-2610867210 We're going to punt on state processing for now; we don't yet know whether we need it, so we will delay it.

blmalone avatar Jan 27 '25 17:01 blmalone

For performance reasons I would suggest only running the monorepo tests in CI on some schedule (e.g. every 4 hours) to avoid having to wait multiple minutes to simulate and execute tasks (e.g. if we ran tests as part of the simulate command)

mds1 avatar Jan 28 '25 19:01 mds1

For performance reasons I would suggest only running the monorepo tests in CI on some schedule (e.g. every 4 hours) to avoid having to wait multiple minutes to simulate and execute tasks (e.g. if we ran tests as part of the simulate command)

What does this mean for CI integration testing?

Are you saying we shouldn't run monorepo tests every time a new PR is opened on superchain-ops? If so, what would you like the algorithm to be for running these tests in the superchain-ops repo?

ElliotFriedman avatar Jan 28 '25 19:01 ElliotFriedman

For performance reasons I would suggest only running the monorepo tests in CI on some schedule (e.g. every 4 hours) to avoid having to wait multiple minutes to simulate and execute tasks (e.g. if we ran tests as part of the simulate command)

Digging into the CircleCI docs, what you are asking for is possible, but it will take a bit of configuration that I won't be able to add as an external contributor.

The architecture you are describing is possible with the following steps:

  1. when a new PR is opened, a cron job runs and calls the CircleCI API, which creates a new scheduled pipeline against that PR to run integration tests
  2. when a PR is closed, that same cron job runs and calls the CircleCI API to delete that scheduled pipeline
  3. while the PR is open, the CircleCI scheduled pipeline runs on some predefined schedule

We can achieve the goal using CircleCI scheduled pipelines.

My recommendation would be to add this job to run against every commit on PRs. We could filter it down further by changes to toml or task config files, so that we don't waste RPC API credits on commits that don't change any tasks.

If you mean we just need to run the integration tests against the main branch, making a single task that just runs against main every x hours is pretty straightforward.

When I asked the CircleCI docs bot how we can achieve this objective, it gave the following answer:

Based on the knowledge sources provided, it is not directly possible to create a single scheduled job that runs against all PR branches on a schedule in CircleCI. However, there are some approaches you can consider to achieve similar functionality:

  1. Scheduled Pipelines: You can create a scheduled pipeline that runs on a specific branch, but it won't automatically run against all PR branches. As mentioned in the Scheduled pipelines documentation, "A scheduled pipeline can only be configured for one branch. If you need to schedule for two branches, you would need to set up two schedules."

  2. Dynamic Branch Selection: You could create a scheduled pipeline that dynamically selects branches to run against. This approach is described in the How can I dynamically select branches for a scheduled pipeline? article. You would need to create a custom script to get a list of PR branches and update the scheduled pipeline to run against those branches.

  3. Workflow Filtering: You can use workflow filtering to run jobs on specific branches or tags. However, this is typically used for running jobs when commits are pushed, not on a schedule for all PR branches.

  4. API-based Approach: You could create a separate scheduled job that uses the CircleCI API to trigger builds for all open PR branches. This would require custom scripting and is not a built-in feature of CircleCI.

It's important to note that CircleCI's default behavior is to run builds on every commit pushed to a branch, unless you've specifically configured it otherwise. As stated in a forum post: "All commits on all branches build normally, unless you have blacklisted or whitelisted them."

To achieve your specific goal, you might need to combine scheduled pipelines with custom scripting to dynamically update the branches being tested. This would require additional setup and is not a straightforward configuration in CircleCI.
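In config.yml terms, the scheduled-pipeline gate would look roughly like the sketch below (the workflow name, schedule name, and job name are all made up; the schedule itself would be created separately via the CircleCI API or project UI):

```yaml
workflows:
  scheduled-monorepo-tests:
    # Only run when triggered by a scheduled pipeline named
    # "monorepo-tests" (hypothetical name), not on every push.
    when:
      and:
        - equal: [ scheduled_pipeline, << pipeline.trigger_source >> ]
        - equal: [ "monorepo-tests", << pipeline.schedule.name >> ]
    jobs:
      - monorepo_integration_tests  # hypothetical job
```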

ElliotFriedman avatar Feb 04 '25 17:02 ElliotFriedman

Closed by: https://github.com/ethereum-optimism/superchain-ops/pull/555

blmalone avatar Feb 05 '25 23:02 blmalone