foundry
Improve Performance of Invariant Testing
Component
Forge
Describe the feature you would like
I would like invariant tests (foundry's stateful fuzz tests) to be a bit more performant. While recently doing a larger fuzzing campaign on my local computer I noticed that foundry did not utilize all the threads of my CPU to perform the fuzzing. It would be great if I could select or set a flag to just use all available threads on my computer. Furthermore other performance improvements would be appreciated.
Additional context
No response
foundry did not utilize all the threads of my CPU to perform the fuzzing
interesting cc optimizooor @DaniPopes
Related - I noticed that having 10x separate invocations of 1_000 runs:
FOUNDRY_INVARIANT_RUNS=1000 forge test
Was far quicker (roughly 4.7x quicker) than one single invocation of 10_000 runs:
FOUNDRY_INVARIANT_RUNS=10000 forge test
When I would have expected it to be roughly the same?
Additional perf improvements were added, notably https://github.com/foundry-rs/foundry/pull/7756. Could you please recheck and see if that helped? 🙏 @Philogy @frontier159
I would like invariant tests (foundry's stateful fuzz tests) to be a bit more performant. While recently doing a larger fuzzing campaign on my local computer I noticed that foundry did not utilize all the threads of my CPU to perform the fuzzing. It would be great if I could select or set a flag to just use all available threads on my computer. Furthermore other performance improvements would be appreciated.
@Philogy you can do this by setting the RAYON_NUM_THREADS env var to the number of threads you want to use, e.g.
RAYON_NUM_THREADS=100 forge test
to use 100 threads. Pls give it a try and lmk how this works.
additional perf improvements were added, noticeably #7756 could you please recheck and see if that helped? 🙏 @Philogy @frontier159
Sorry I missed this @grandizzy. It's certainly much snappier now for my use case (once cached), although offline is still a lot quicker:
forge test
Ran 155 test suites in 51.12s (62.27s CPU time): 1181 tests passed, 0 failed, 0 skipped (1181 total tests)
forge test --offline
Ran 155 test suites in 25.17s (65.08s CPU time): 1181 tests passed, 0 failed, 0 skipped (1181 total tests)
I'm not sure if caching could be improved a little, or if that's the best which can be done given the cache needs to be checked vs remote to see if it is stale.
@frontier159 thank you, will check the offline / online improvements, do you have a simple test I could use to debug?
Re runs with higher depth being slow: I think there is room for big improvement if I am not missing something (going to make some tests). Right now we use the proptest runner and call its run fn. For this we have to wrap a lot of data in RefCell, which could slow down the runtime, if I am not wrong, and would explain why invariants with bigger depth are getting slower and slower: https://github.com/foundry-rs/foundry/blob/a117fbfa41edbaa1618ed099d78d65727bff4790/crates/evm/evm/src/executors/invariant/mod.rs#L137-L184
Per my understanding, we need the runner only if we want proptest to shrink the failed sequence, which is not the case for invariant tests (we do the shrinking outside), so instead of calling the runner's run we can loop over the runs and draw values from the strategy (as explained in https://altsysrq.github.io/proptest-book/proptest/tutorial/strategy-basics.html vs https://altsysrq.github.io/proptest-book/proptest/tutorial/test-runner.html). This could allow us to get rid of the RefCells and probably get better performance.
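To make the two shapes being compared concrete, here is a minimal std-only sketch. It does not use proptest or foundry code: `ValueSource`, `Campaign`, `runner_run`, `with_runner` and `with_loop` are all hypothetical names invented for illustration, and the tiny LCG only stands in for a strategy. It shows why an `Fn`-style callback pushes shared state behind a RefCell, while a plain loop can mutate it directly. It says nothing about which one is faster.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a strategy: a small deterministic generator (LCG).
/// Not part of proptest or foundry.
pub struct ValueSource {
    state: u64,
}

impl ValueSource {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Draw the next pseudo-random value.
    pub fn next_value(&mut self) -> u64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.state >> 33
    }
}

/// Hypothetical state accumulated across all runs of a campaign.
#[derive(Debug, Default, PartialEq)]
pub struct Campaign {
    pub calls: Vec<u64>,
    pub failures: usize,
}

impl Campaign {
    fn record(&mut self, value: u64) {
        self.calls.push(value);
        // Arbitrary illustrative "failure" condition.
        if value % 7 == 0 {
            self.failures += 1;
        }
    }
}

/// Runner-style API (hypothetical): the callback is `Fn`, so it cannot
/// mutate captured state without interior mutability.
fn runner_run(source: &mut ValueSource, runs: usize, test: impl Fn(u64)) {
    for _ in 0..runs {
        test(source.next_value());
    }
}

/// Shape 1: state wrapped in a RefCell so the `Fn` callback can update it.
pub fn with_runner(seed: u64, runs: usize) -> Campaign {
    let campaign = RefCell::new(Campaign::default());
    let mut source = ValueSource::new(seed);
    runner_run(&mut source, runs, |v| campaign.borrow_mut().record(v));
    campaign.into_inner()
}

/// Shape 2: loop over the runs and draw values directly; plain `&mut` state.
pub fn with_loop(seed: u64, runs: usize) -> Campaign {
    let mut campaign = Campaign::default();
    let mut source = ValueSource::new(seed);
    for _ in 0..runs {
        campaign.record(source.next_value());
    }
    campaign
}
```

Calling `with_runner(42, 100)` and `with_loop(42, 100)` yields the same `Campaign`, so the loop form is a drop-in replacement in this toy setting. Whether it helps in the real executor would need to be measured.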
going to try this, @DaniPopes @klkvr @mattsse wdyt?
refcell is very cheap, very unlikely it has any performance impact.
@frontier159 thank you, will check the offline / online improvements, do you have a simple test I could use to debug?
Actually I think you can ignore this, I just tested again and it was ok now. Perhaps I had a dodgy wifi connection and/or cache miss. Awesome contribution thanks very much!
@mds1 do we want to expose a new config for the number of threads to run tests or just document that it can be changed by using RAYON_NUM_THREADS env var?
Confirming it's also linear runtime for one run of FOUNDRY_INVARIANT_RUNS=1000 vs 10 runs of FOUNDRY_INVARIANT_RUNS=100
do we want to expose a new config for the number of threads to run tests or just document that it can be changed by using RAYON_NUM_THREADS env var?
We probably should expose a config var here, to avoid leaking library info into the UX. It sounds like this can be a generic max_threads (or similar name) flag that's used everywhere we parallelize with rayon, which is preferable to an invariant-specific thread config.
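As a rough sketch of how such a generic setting could be resolved, assuming a precedence of explicit config value, then RAYON_NUM_THREADS, then the number of logical CPUs: the function names, the `max_threads` parameter and the choice to treat 0 as "unset" are all assumptions made for illustration, not foundry's actual behavior. Only the standard library is used.

```rust
use std::thread;

/// Hypothetical resolution order for a generic thread limit:
/// 1. an explicit `max_threads` config value (0 is treated as unset; an assumption),
/// 2. the value of RAYON_NUM_THREADS, if it parses as a positive integer,
/// 3. the number of logical CPUs reported by the OS, falling back to 1.
pub fn resolve_threads(max_threads: Option<usize>, env_value: Option<&str>) -> usize {
    if let Some(n) = max_threads.filter(|&n| n > 0) {
        return n;
    }
    if let Some(n) = env_value
        .and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
    {
        return n;
    }
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

/// Convenience wrapper that reads RAYON_NUM_THREADS from the process environment.
pub fn resolve_threads_from_env(max_threads: Option<usize>) -> usize {
    let env_value = std::env::var("RAYON_NUM_THREADS").ok();
    resolve_threads(max_threads, env_value.as_deref())
}
```

The resolved number could then be handed to rayon when the global pool is built (for example through `rayon::ThreadPoolBuilder::num_threads`), so that users only ever see the config name and not the library's env var.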