foundry icon indicating copy to clipboard operation
foundry copied to clipboard

feature(invariant) - persist and replay failure

Open grandizzy opened this issue 9 months ago • 6 comments

Motivation

Partially address #2552 - persist and replay shrinked failed invariant sequence CC @mds1

  • when invariant fails, the shrinked sequence is persisted and replayed on subsequent runs. If the sequence still fails then test exists immediately. Replaying failed sequence is done with current configs (for example if the persisted sequence fails in 100 steps but config was altered to run with a depth of 50 then persisted sequence will pass)
  • only the last failed sequence is persisted, the default file is PROJ_ROOT/cache/invariant/failures/{TEST_SUITE_NAME}/{INVARIANT_NAME}, layout of cache dir becoming image
  • the default failure dir can be changed in foundry.toml
[invariant]
failure_persist_dir="/path/to/failure/dir"

or by using inline config

/// forge-config: default.invariant.failure-persist-dir = /path/to/failure/dir
  • currently the sequence is persisted in a proprietary format (but could be easy to change to a standard corpus one when available, see https://github.com/crytic/medusa/issues/234#issuecomment-1865290883). For example, the transfer ownership in 2 steps shrinked sequence is saved as
[
  {
    "sender": "0x0000000000000000000000000000000000000003",
    "addr": "0x2e234dae75c793f67a35089c9d99245e1c58470b",
    "calldata": "0x6d4354210000000000000000000000007fa9385be102ac3eac297483dd6233d62b3e14960000000000000000000000000000000000000000000000000000000000000e72",
    "contract_name": "test/Owned.t.sol:Handler",
    "signature": "transferOwnership(address,address)",
    "args": "0x7FA9385bE102ac3EAc297483Dd6233D62b3e1496, 0x0000000000000000000000000000000000000e72"
  },
  {
    "sender": "0x0000000000000000000000000000000000000001",
    "addr": "0x2e234dae75c793f67a35089c9d99245e1c58470b",
    "calldata": "0x51710e450000000000000000000000000000000000000000000000000000000000000e72",
    "contract_name": "test/Owned.t.sol:Handler",
    "signature": "acceptOwnership(address)",
    "args": "0x0000000000000000000000000000000000000e72"
  }
]

Solution

  • add option of failure dir in invariant config, defaults to cache/invariant
  • remove dir on forge clean
  • run invariant check only once when replaying - checking after each call doesn't add valuable info for passing scenario (invariant call result is always success) nor for failed scenarios (invariant call result is always success until the last call that breaks it)
  • reuse check_sequence shrink utility to check persisted failure
  • store args as String in BaseCounterExample so we can serialize / deserialize it and use in replayed failed counterexample
  • if file with failed scenario exists, load and check sequence, exit if it still fails, continue on success. If fail doesn't exist or data cannot be loaded then file is ignored and test continues. Only last failed scenario is persisted in failure file. If failed scenario fails to pe persisted then error is displayed
  • added tests to check failure replay, move duped code for getting counterexample in get_counterexample macro
  • added test to make sure dir is removed on forge clean

grandizzy avatar May 09 '24 10:05 grandizzy

This is great!

(for example if the persisted sequence fails in 100 steps but config was altered to run with a depth of 50 then persisted sequence will pass)

In this case, are we only running the first 50 steps of the sequence which results in it being passed? I think we may still want the full 100 step sequence to run to avoid thinking my test is now passing when it reality it doesn't

PROJ_ROOT/cache/invariant/failures/{TEST_NAME}/{INVARIANT_NAME}

Two small comments on this path:

  1. I think TEST_NAME is a typo and you meant CONTRACT_NAME?
  2. What if two invariants have the same name but different signatures, e.g. invariant1(uint256 x) and invariant1(uint256 x, uint256 y)? The file name probably needs to be the full normalized signature (selector also works, but is harder to less user-friendly)

mds1 avatar May 17 '24 14:05 mds1

This is great!

(for example if the persisted sequence fails in 100 steps but config was altered to run with a depth of 50 then persisted sequence will pass)

In this case, are we only running the first 50 steps of the sequence which results in it being passed? I think we may still want the full 100 step sequence to run to avoid thinking my test is now passing when it reality it doesn't

Yes, that's correct, there'll only be first 50 runs resulting in test pass. That was done to avoid situations like the one described here https://github.com/crytic/echidna/issues/1231 where changes in configs are not applied on failure reply. Happy to change it if you think should be.

PROJ_ROOT/cache/invariant/failures/{TEST_NAME}/{INVARIANT_NAME}

Two small comments on this path:

  1. I think TEST_NAME is a typo and you meant CONTRACT_NAME?

Yeah, I meant test suite (that is the test contract)

  1. What if two invariants have the same name but different signatures, e.g. invariant1(uint256 x) and invariant1(uint256 x, uint256 y)? The file name probably needs to be the full normalized signature (selector also works, but is harder to less user-friendly)

Rn we don't support params in invariant functions, is something that was requested in https://github.com/foundry-rs/foundry/issues/4834 but no resolution for it yet

grandizzy avatar May 17 '24 14:05 grandizzy

Yes, that's correct, there'll only be first 50 runs resulting in test pass. That was done to avoid situations like the one described here https://github.com/crytic/echidna/issues/1231 where changes in configs are not applied on failure reply. Happy to change it if you think should be.

Thanks, I left a comment there to better understand

Rn we don't support params in invariant functions

Oh right, forgot about that 😅

mds1 avatar May 17 '24 14:05 mds1

LGTM

I think there can be potential issues with fail_on_revert set to true. If user changes setUp routines, then persisted sequence might become invalid because contract addresses will change and cached calldata might be invalid for a given address.

For example, if persisted sequence is

1. call `foo()` on contract `A`

2. call `bar()` on contract `B`

then adding more deployments to setUp will cause addresses of contract A/B change due to a deployer nonce shift, and sequence above might start calling bar() on A or vice versa.

Not sure how we can address that, perhaps display some kind of warning when persisted sequence is reverting rather than failing an invariant?

good point, a simpler scenario would be renaming handler functions, like foo() on contract A to become foo1(). Will display something like replay reverted if such to avoid confusion

grandizzy avatar May 17 '24 15:05 grandizzy

Not sure how we can address that, perhaps display some kind of warning when persisted sequence is reverting rather than failing an invariant?

@klkvr please check https://github.com/foundry-rs/foundry/pull/7899/commits/db1b619ef2464c32f3af87b160077894376903c8

grandizzy avatar May 17 '24 17:05 grandizzy

Yes, that's correct, there'll only be first 50 runs resulting in test pass. That was done to avoid situations like the one described here crytic/echidna#1231 where changes in configs are not applied on failure reply. Happy to change it if you think should be.

Thanks, I left a comment there to better understand

@mds1 I think I am on the same page as Rappie mentioned in https://github.com/crytic/echidna/issues/1231#issuecomment-2117839608 In foundry we have plans to add regression test generation where we capture all the conditions to replicate failure (test driver solidity code, env vars used at the time of test running, etc.). This should avoid the impression that test is passing when it reality it doesn't, wdyt?

grandizzy avatar May 19 '24 06:05 grandizzy

@mds1 I think I am on the same page as Rappie mentioned in https://github.com/crytic/echidna/issues/1231#issuecomment-2117839608 In foundry we have plans to add regression test generation where we capture all the conditions to replicate failure (test driver solidity code, env vars used at the time of test running, etc.). This should avoid the impression that test is passing when it reality it doesn't, wdyt?

From rappie in that issue:

the config leads, meaning sequences in the corpus which are illegal according to the current config should be skipped.

That does make sense to me, I'm onboard 👌

mds1 avatar May 20 '24 21:05 mds1