cosmos-sdk icon indicating copy to clipboard operation
cosmos-sdk copied to clipboard

feat(sim): port simulator to run on top of testutil/network

Open elias-orijtech opened this issue 1 year ago • 10 comments

This change implements a replacement for the current simulator based on testutil/network. Most of the changes are porting the module specific message generators to no longer rely on SimulationState, and to generate "real" messages, not simulator messages. The simulator driver is in simapp, as part of the IntegationTestSuite.

The new approach aims to improve simulation in two important ways:

  • Simulation should more closely mimic a real network. The current simulator message delivery is implemented parallel to non-simulator message delivery, leading to loss of fidelity and higher maintenance. One symptom is https://github.com/cosmos/cosmos-sdk/issues/13843.
  • Simulation should be layered on top of modules, not part of modules. This means that modules should not import simulation packages, nor refer to its generator package (x/module/simulation). This should eventually fix https://github.com/cosmos/cosmos-sdk/issues/7622.

There are also downsides, however. Where the current simulator is too high level, testutil/network is too low level: it runs a real network of validators which is difficult to control. For example:

  • AppHashes differ between runs, because modules may depend on non- deterministic state such as block header timestamps.
  • The validators runs in separate goroutines, which makes it hard to query app state without introducing race conditions.
  • Blocks are produced according tot time, and not under control by the test driver. This makes it hard to trigger processing of messages in particular blocks, which ruins determinism.

Some of the issues may be worked around, for example by forcing the block headers to be deterministic; however, the real fix is to make testutil/network itself deterministic, providing the goldilock level of simulation: close enough to a real network, yet deterministic enough to generate the same chain state for a given random seed.

A deterministic testutil/network is part of https://github.com/cosmos/cosmos-sdk/issues/18145.

Future work includes:

  • Porting of the remaining module message generators.
  • Generating (and verifying) deterministic AppHashes, allowing reliable replay when a problematic message is detected. Depends on https://github.com/cosmos/cosmos-sdk/issues/18145.
  • Save/reload of state for faster debugging cycles.
  • Removal of the old simulator, most importantly the reference to it from module code.

Updates https://github.com/cosmos/cosmos-sdk/issues/14753 (Simulator rewrite epic) Updates https://github.com/cosmos/cosmos-sdk/issues/7622 (reducing imports from modules to simulator) Updates https://github.com/cosmos/cosmos-sdk/issues/13843 (using real message delivery for simulation)

CC @odeke-em @tac0turtle @alexanderbez

Summary by CodeRabbit

  • Refactor

    • Streamlined simulation setup by removing dependencies on the auth module across various test suites.
    • Simplified the initialization of simulation managers to default settings in multiple test files.
    • Updated the auth module's initialization process by excluding specific arguments related to random account generation.
  • New Features

    • Introduced new global variables and functions in testutil_network_test.go to enhance network and application configuration for testing.
    • Added new configuration options for network fuzz testing in testutil/network/network.go.
  • Bug Fixes

    • Adjusted the RandomGenesisAccounts and RandomizedGenState functions in the x/auth/simulation package to accept new parameters and return updated types, ensuring correct initialization of genesis states.
  • Documentation

    • Updated function names and signatures in the x/auth/simulation and x/bank/simulation packages to better reflect their purpose.
  • Chores

    • Removed unused constants and functions across various simulation and operation files to clean up the codebase.

elias-orijtech avatar Oct 02 '23 00:10 elias-orijtech

Walkthrough

Walkthrough

The changes across various files in the Cosmos SDK-based application suggest a significant refactoring of the simulation framework. The auth module's simulation imports and dependencies have been removed, and the simulation setup has been simplified with default settings. Functions related to generating random genesis accounts and simulation operations have been updated or removed, indicating a shift in how simulations are configured and executed. Additionally, there are updates to network configuration and governance module simulations, reflecting a broader overhaul of the testing and simulation environment.

Changes

File(s) Summary
simapp/app.go, simapp/app_v2.go, tests/integration/.../deterministic_test.go, tests/integration/.../msg_server_test.go, tests/integration/.../infraction_test.go, tests/integration/.../example_test.go, tests/integration/.../keeper_test.go, tests/integration/.../common_test.go Removed cosmossdk.io/x/auth/simulation import and updated auth.NewAppModule initialization to exclude authsims.RandomGenesisAccounts.
simapp/testutil_network_test.go Added imports, initialized new global variables, modified SetupSuite, and added TestSimulation function.
testutil/network/network.go, x/auth/module.go Introduced new imports and fields, updated function signatures, and removed certain parameters and fields.
x/auth/simulation/genesis.go, x/auth/simulation/proposals.go, x/bank/module.go, x/bank/simulation/genesis.go, x/bank/simulation/operations.go, x/gov/module.go, x/gov/simulation/genesis.go, x/gov/simulation/operations.go Significant changes to function signatures and internal logic, removal of global constants, and refactoring of simulation functions.

Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

coderabbitai[bot] avatar Oct 23 '23 01:10 coderabbitai[bot]

I've rebased and pushed the current progress of the simulator, and described the approach and future direction in the last commit and the PR description. The first commit removes the simulator tests that lock in message content, which I don't think are worth their weight.

The PR is ready for review, but I've left it a draft because someone should decide whether to revert deletion of the replaced generators.

elias-orijtech avatar Nov 02 '23 23:11 elias-orijtech

Not sure what to do about the linter errors:

Error: ./testutil_network_test.go:132:128: cannot use v0app.GovKeeper (variable of type "cosmossdk.io/x/gov/keeper".Keeper) as *"cosmossdk.io/x/gov/keeper".Keeper value in argument to govsim.GenerateMsgSubmitProposal
Error: ./testutil_network_test.go:133:120: cannot use v0app.GovKeeper (variable of type "cosmossdk.io/x/gov/keeper".Keeper) as *"cosmossdk.io/x/gov/keeper".Keeper value in argument to voteGen.GenerateVotes
Error: ./testutil_network_test.go:138:122: cannot use v0app.GovKeeper (variable of type "cosmossdk.io/x/gov/keeper".Keeper) as *"cosmossdk.io/x/gov/keeper".Keeper value in argument to govsim.GenerateMsgDeposit
Error: ./testutil_network_test.go:144:127: cannot use v0app.GovKeeper (variable of type "cosmossdk.io/x/gov/keeper".Keeper) as *"cosmossdk.io/x/gov/keeper".Keeper value in argument to govsim.GenerateMsgVoteWeighted
Error: ./testutil_network_test.go:147:129: cannot use v0app.GovKeeper (variable of type "cosmossdk.io/x/gov/keeper".Keeper) as *"cosmossdk.io/x/gov/keeper".Keeper value in argument to govsim.GenerateMsgCancelProposal (typecheck)

They seem to be caused by a mismatch between the v1 and v2 simapp.SimApp.GovKeeper field. The v2 field is a pointer, whereas the v1 field is not.

elias-orijtech avatar Nov 02 '23 23:11 elias-orijtech

@tac0turtle I've rebased the PR on top of the recent testutil changes and re-enabled genesis state generation for the old simulator for compatibility. The missing piece is the v1 vs v2 simapp difference in the GovKeeper field.

elias-orijtech avatar Nov 09 '23 21:11 elias-orijtech

i think we may need to revert the removal of things in modules as we want the current simulator to still run, I believe there is still some work we need to do to make it sufficient to replace things. I can take ownership of this and push it over the finish line

tac0turtle avatar Nov 14 '23 18:11 tac0turtle

i think we may need to revert the removal of things in modules as we want the current simulator to still run, I believe there is still some work we need to do to make it sufficient to replace things. I can take ownership of this and push it over the finish line

Thank you very much. Another option is to leave this PR in draft and start working on a deterministic testutil/network based on your conversion to interfaces and integration of MockComet.

elias-orijtech avatar Nov 15 '23 16:11 elias-orijtech

hmm i cant get this test to fail locally, @elias-orijtech are you able to do it locally?

tac0turtle avatar Dec 09 '23 23:12 tac0turtle

Sorry for dropping this ball. Do you mean the failing simapp v1 test? If so, did you test with the app_v1 tag?

elias-orijtech avatar Dec 16 '23 20:12 elias-orijtech

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jan 16 '24 00:01 github-actions[bot]

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Feb 16 '24 00:02 github-actions[bot]

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Mar 28 '24 00:03 github-actions[bot]

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Apr 28 '24 00:04 github-actions[bot]