feat: forge-std v1.0.0
Forge Standard Library v1.0.0
Forge-Std is now modular, enabling composability and paving the way for new libraries for testing in Solidity.
Features
- #126
- `createCompute2Address`
- Improve `bound`
Breaking changes
- Bump pragma to `>=0.6.2 <0.9.0`
- Remove deprecated features (the `tip` std-cheat, the `lowLevelError` std-error)
- Disallow "unsafe" std-cheats in scripts by default https://github.com/foundry-rs/forge-std/pull/147
- `using stdStorage for StdStorage` will break in scripts and will need to be changed to `stdStorageSafe` (or will require a manual import of `stdStorage`); see the sketch below
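A minimal sketch of the script-side change, assuming the `StdStorage` struct and both libraries are exported from forge-std's `StdStorage.sol` (the import path and contract scaffolding here are assumptions for illustration, not part of this release):

```solidity
// Hypothetical script illustrating the stdStorage -> stdStorageSafe migration.
// Import path and scaffolding are assumed for illustration.
pragma solidity >=0.6.2 <0.9.0;

import {Script} from "forge-std/Script.sol";
import {StdStorage, stdStorageSafe} from "forge-std/StdStorage.sol";

contract MigrationExampleScript is Script {
    // Before (disallowed by default in scripts after this release):
    // using stdStorage for StdStorage;

    // After:
    using stdStorageSafe for StdStorage;
}
```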
Before merging
- [x] Bump pragma to `0.6.2` https://github.com/foundry-rs/forge-std/pull/147#discussion_r944974402
- [x] Check https://github.com/foundry-rs/forge-std/commit/2c7cbfc6fbede6d7c9e6b17afe997e3fdfe22fef and later commits for any necessary changes (e.g. std-cheats should be `internal virtual`)
Just tested locally in a big project I had, all went smoothly! Going to ask a few others to test in their projects and will report back
Tested on my largest project. Works fine.
@ZeroEkkusu I tried moving the test dir to the root in f21ef1a to match the convention of forge init and most projects these days. Any idea why that broke CI? I don't have time to check at the moment, so I pushed another commit to revert that for now.
It's as if --contracts src/Test.sol in forge build --contracts src/Test.sol --use solc:0.8.0 has no effect, and it compiles all contracts.
Edit: forge build compiles contracts (default: src/), tests (default: test/), and scripts (default: script/). Having tests in src/test/ and specifying the contracts location with --contracts src/Test.sol allowed us to bypass this and compile only Test.sol with forge build.
Ah makes sense. There is now a --skip flag (https://github.com/foundry-rs/foundry/pull/3370) so we should be able to forge build --skip test --use solc:0.8.0 instead now. I should be able to get to this later today
@ZeroEkkusu wdyt of also running forge fmt + adding to CI (per https://github.com/foundry-rs/forge-std/pull/183) in this PR? That way once it's merged v1.0.0 is done
Sure @mds1.
- Took the default `fmt` values and defined them explicitly in `foundry.toml` (if left implicit, a contributor with their own config in `~/.foundry/foundry.toml` would end up applying their config instead of the default)
- Ran `forge fmt`
- Split CI into separate jobs since now there's a few things going on
I think all that's left before merge is some more testing. Just tested the latest commit on this branch in some repos and it worked great. Would love for @nventuro to test since he has an 0.7.6 repo
Thanks for the ping @mds1! Just ran our tests using the v0.3 branch and everything seems to work correctly.
Actually, after further testing I'm seeing some changes in behavior when using commit dc8d830 when compared to f36dab2
`testExitSwaps` fails with too many `vm.assume` rejects. I imagine that could be due to `bound` behaving differently, resulting in a different random variable distribution.
`testUpdateCachedRatios` also fails, though this one is more bizarre. The assertion that fails is a comparison of gas costs, which seems to be unrelated to forge-std entirely, as we only call our own code inside the `gasleft()` blocks.
I've switched back and forth between commits dc8d830d3af6f67758f793ce42572270fb0f41fb (this branch) and f36dab2, and consistently get errors only on the updated commit.
tested in our 0.8.16 and 0.6.12 project, works great 👍 looking forward to this
It’s a very good solution @mds1! You should probably PR this to Solmate. Cc @transmissions11.
Notes: We need to address @0xPhaze’s comments, and update all revert messages (e.g. the one in bound says it’s from Test, but it’s from StdUtils).
Actually, after further testing I'm seeing some changes in behavior when using commit dc8d830 when compared to f36dab2
`testExitSwaps` fails with too many `vm.assume` rejects. I imagine that could be due to `bound` behaving differently, resulting in a different random variable distribution.
`testUpdateCachedRatios` also fails, though this one is more bizarre. The assertion that fails is a comparison of gas costs, which seems to be unrelated to forge-std entirely, as we only call our own code inside the `gasleft()` blocks.
I've switched back and forth between commits dc8d830d3af6f67758f793ce42572270fb0f41fb (this branch) and f36dab2, and consistently get errors only on the updated commit.
@nventuro I was able to reproduce those failures. Some new failures due to the different/improved `bound` behavior are a good thing, since they confirm this is now working better. However, if I keep forge-std on this branch but overwrite `bound` to use the old version, I still see the failures, so it seems `bound` isn't the cause (or maybe is partly the cause, but looking at your tests I don't know what else it would be if it's not `bound`) 🤔
@nventuro Had some time on a flight without wifi so figured this was a good offline debug task. Here's what I found:
testUpdateCachedRatios
Below is a concrete test that will pass with forge-std on master but fail with v0.3 (well, v1.0.0 technically but calling it v0.3 since that's this branch name).
The difference is in the `newWeight = bound(newWeight, _MINIMUM_BOUND_PERCENTAGE, FixedPoint.ONE);` line.
The first four parameters that we bound all have the same output value on master and v0.3. However, the last parameter, newWeight, differs between v0.3 and master. Since newWeight = type(uint256).max in v0.3 it now gets wrapped to the upper bound of the range, which is FixedPoint.ONE. On master it was wrapped to 477686011984099564. Presumably this results in a different code path or different storage values since the dynamic costs vary significantly as shown below:
cachedCost, master: 2534
cachedCost, v0.3: 2526
dynamicCost, master: 7702
dynamicCost, v0.3: 2892
If I change the upper bound for newWeight from FixedPoint.ONE to FixedPoint.ONE-1 the test (including the fuzz variant) always passes on v0.3. I'm not sure if this is logical/correct in this test, however based on these findings I don't think there is an issue with bound as the results are valid, the test probably just needs to be updated to account for the new bound behavior.
```solidity
function testUpdateCachedRatiosFailure() public {
    // Hardcode inputs from the failure case found by the fuzzer on the v0.3 branch.
    uint256 bptPrice = 0;
    uint256 referenceWeight = 0;
    uint256 newWeight = type(uint256).max;
    uint256 lowerBound = 0;
    uint256 upperBound = 0;

    // Pass them through bound.
    bptPrice = bound(bptPrice, _MIN_BPT_PRICE, _MAX_BPT_PRICE);
    lowerBound = bound(lowerBound, _MINIMUM_BOUND_PERCENTAGE, FixedPoint.ONE);
    upperBound = bound(upperBound, lowerBound, _MAX_BOUND_PERCENTAGE);
    referenceWeight = bound(referenceWeight, _MINIMUM_TOKEN_WEIGHT, _MAXIMUM_TOKEN_WEIGHT);
    newWeight = bound(newWeight, _MINIMUM_BOUND_PERCENTAGE, FixedPoint.ONE);

    // Log them for convenience.
    console2.log("bptPrice", bptPrice);
    console2.log("lowerBound", lowerBound);
    console2.log("upperBound", upperBound);
    console2.log("referenceWeight", referenceWeight);
    console2.log("newWeight", newWeight);

    // Convert the bound statements in the fuzz test to assumes as a sanity check that
    // the bound outputs are valid.
    vm.assume(bptPrice >= _MIN_BPT_PRICE && bptPrice <= _MAX_BPT_PRICE);
    vm.assume(lowerBound >= _MINIMUM_BOUND_PERCENTAGE && lowerBound <= FixedPoint.ONE);
    vm.assume(upperBound >= lowerBound && upperBound <= _MAX_BOUND_PERCENTAGE);
    vm.assume(referenceWeight >= _MINIMUM_TOKEN_WEIGHT && referenceWeight <= _MAXIMUM_TOKEN_WEIGHT);
    vm.assume(newWeight >= _MINIMUM_BOUND_PERCENTAGE && newWeight <= FixedPoint.ONE);

    // Call the fuzz test with our parameters.
    testUpdateCachedRatios(bptPrice, referenceWeight, newWeight, lowerBound, upperBound);
}
```
testExitSwaps
Similar to above, the new bound behavior just results in more test rejections.
On average I get around 70% of the way through the 10k runs before erroring with too many rejects. (The 60k reject limit you currently have is slightly less than the default of 65536). So one solution is to simply bump the max rejects limit and deal with longer test times. I tried setting max_test_rejects = 100000 and it seems to consistently pass. On master this test passes in about 10 seconds, on v0.3 it'll fail after ~35s, and with the 100k rejects it passes after ~50s.
An alternative idea is that you may be able to get a bit more clever with how you use bound to ensure the requirement of amountOut.divUp(FixedPoint.ONE - swapFeePercentage) <= balances[tokenIndex] always holds. Modifying the swapFeePercentage bound seems like the easiest way to do this, e.g. something like this (not certain it's correct, make sure to verify if you go this route):
```solidity
amountOut = bound(amountOut, 0, balances[tokenIndex]);
bptTotalSupply = bound(bptTotalSupply, _DEFAULT_MINIMUM_BPT, type(uint112).max);
swapFeePercentage = bound(swapFeePercentage, 0, 0.99e18);

// This exit type is special in that fees are charged on the amount out. This creates scenarios in which the
// total amount out (including fees) exceeds the Pool's balance, which will lead to reverts. We reject any runs
// that result in this edge case.
bool isValid = amountOut.divUp(FixedPoint.ONE - swapFeePercentage) <= balances[tokenIndex];
if (!isValid) {
    // To speed up tests by minimizing rejections, modify the swapFeePercentage to ensure
    // the validity check holds.
    swapFeePercentage = balances[tokenIndex] - amountOut - 1;
    isValid = amountOut.divUp(FixedPoint.ONE - swapFeePercentage) <= balances[tokenIndex];
}
vm.assume(isValid);
```
TLDR
I think the branch is behaving as expected and some tests need to be updated to account for the new behavior
Just dropped a new `bound` implementation, @mds1. This one has an even number distribution.
For bound(x, 15, 18):
Previously
```
15 16 17 18 15 16 17 18 15 16 17 18 15 16 17 15 16 17 18 18 15
---------------------------------------------------------------
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
```
Now
```
15 16 17 18 16 17 18 15 16 17 18 15 16 17 18 15 16 17 18 15 16
---------------------------------------------------------------
 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
```
🔎 Notice how `min - 1` is `max` and `max + 1` is `min` with the new implementation:
```
18 15 16 17 18 15
-----------------
14 15 16 17 18 19
```
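As a quick sanity check, the edge wrapping shown above can be pinned in a unit test. The sketch below only hardcodes expectations taken from the `bound(x, 15, 18)` tables; the contract name and scaffolding are assumed for illustration:

```solidity
// Sketch of a test for the new bound edge behavior; expected values come from
// the bound(x, 15, 18) tables above, the rest is assumed scaffolding.
pragma solidity >=0.6.2 <0.9.0;

import {Test} from "forge-std/Test.sol";

contract BoundWrapTest is Test {
    function testBoundWrapsAtEdges() public {
        assertEq(bound(0, 15, 18), 15); // lower edge maps to min
        assertEq(bound(3, 15, 18), 18); // min + 3 maps to max
        assertEq(bound(14, 15, 18), 18); // min - 1 wraps to max
        assertEq(bound(19, 15, 18), 15); // max + 1 wraps to min
    }
}
```

If the implementation changes again, these hardcoded cases make the new mapping easy to read off from the failure output.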
We should test it more in practice to make sure there are no errors.
I'll check this out and put a simple script together to test the distributions before and after, to make sure things are good / the new complexity (new conditionals) is worth it.
Going forward let's make these changes as PRs into this branch instead of committing directly to it—I think the separation is cleaner and makes it easier to discuss changes without resulting in one giant PR, plus if we decide to not implement a change we just close the PR instead of force pushing or adding commits to revert something
Ok @ZeroEkkusu I think this implementation looks good.
We should test it more in practice to make sure there are no errors.
Did you plan on writing these? Wondering what additional tests we should add. A bit tedious, but perhaps we hardcode an array of [0, 1, 2, 3, ..., 48, 49, 50], and loop over that array for a few different min/max ranges (or a smaller array so it's more manageable)
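Something like the following could work for that loop; this sketch only asserts the range property, the ranges are illustrative, and the per-range expected-value tables suggested above would still need to be hardcoded by hand (contract scaffolding assumed):

```solidity
// Sketch: loop inputs 0..50 through bound for a few illustrative ranges and
// assert every result lands inside [min, max]. Ranges are placeholders.
pragma solidity >=0.6.2 <0.9.0;

import {Test} from "forge-std/Test.sol";

contract BoundRangeTest is Test {
    function testBoundStaysInRange() public {
        _checkRange(15, 18);
        _checkRange(0, 100);
        _checkRange(1e18, 2e18);
    }

    function _checkRange(uint256 min, uint256 max) internal {
        for (uint256 x = 0; x <= 50; x++) {
            uint256 result = bound(x, min, max);
            // Every output must land inside the requested range.
            assertGe(result, min);
            assertLe(result, max);
        }
    }
}
```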
Good idea, I'll add one.
Thoughts on the merge after #195?
Awesome, thanks.
Once we get #195 merged I want to do one more quick round of testing by pinging a few people again (since we made some additional changes), then good to merge if that goes smoothly
Ok, this is finally just about ready to be merged, and just updated the description to link the relevant new PRs. I'm going to test this once more on some repos, @hexonaut @nventuro if you want to test again that would be appreciated
Ok just tested in two repos:
- First one had tests all passing
- One had some failures, but these seem ok since they're due to `bound`, and some failures are expected there
Just tested on one of our larger repos. Works fine.