openzeppelin-contracts
openzeppelin-contracts copied to clipboard
Branchless choice, min and max methods
Description
Follow a Branchless version of Log2, Square Root, Min and Max functions, all of them consumes less gas than the current implementation, and have a constant gas cost independent of the input value, the square root gas saving is actually due an better and also branchless implementation of the log2 function.
Motivation
I'm working in the Analog Timechain, which is a cross-chain protocol that allow interoperability between different blockchains.
At some point our protocol needs for a given input calculate the exact execution gas cost, but this was hard due a lot of branching, branching that should not even exist, for example the Math.min
function use more or less gas when a < b
or a >= b
, either compiling in debug and optimized mode:
pragma solidity >=0.7.0 <0.9.0;
contract Math {
function min(uint256 a, uint256 b) external pure returns (uint256 result) {
result = a < b ? a : b;
}
}
a < b
consumes 306 gas
a >= b
consumes 316 gas
This was annoying because for every change in the contract I had to dive into the EVM assembly to understand how solidity generate different branches to understand how it affect the gas used. So I decided to take a different approach and make everything constant gas cost, as result I implemented the BranchlessMath.sol library, which have a constant gas cost for any input, and surprisingly I also noticed significant gas savings using those methods compared to the OpenZeppeling's Math library, so I decided to make it available to everyone by contributing here.
In comparison the branchless version reduces the final bytecode size, and also consumes less gas:
contract Math {
function min(uint256 a, uint256 b) external pure returns (uint256 result) {
assembly ("memory-safe") {
// min(a,b) = b ^ ((a ^ b) * (a < b))
result := xor(b, mul(xor(a, b), lt(a, b)))
}
}
}
Always consumes 291 gas
, independently if a < b
or a >= b
.
PR Checklist
- [x] Tests
- [x] Documentation
- [x] Changeset entry (run
npx changeset add
)
🦋 Changeset detected
Latest commit: dc6abbc43225ab99de93a871533852dc8441541c
The changes in this PR will be included in the next version bump.
This PR includes changesets to release 1 package
Name | Type |
---|---|
openzeppelin-solidity | Minor |
Not sure what this means? Click here to learn what changesets are.
Click here if you're a maintainer who wants to add another changeset to this PR
@Amxx thanks for the feedback, will remove some assembly at the cost of a consuming a bit more gas, but essentially what the sqrt does is:
function sqrt(uint256 a) public pure returns (uint256) {
unchecked {
// Approximate to the closest power of 2 of the square root. This is equivalent to set the MSB of
// the square root, this reduces Netwon's Method iterations required to get the final result.
uint256 xn = 2 ** (Math.log2(a) / 2);
// 7 iterations of Netwon's method
xn = (xn + a / xn) >> 1;
xn = (xn + a / xn) >> 1;
xn = (xn + a / xn) >> 1;
xn = (xn + a / xn) >> 1;
xn = (xn + a / xn) >> 1;
xn = (xn + a / xn) >> 1;
xn = (xn + a / xn) >> 1;
// Round down
// Obs: Using `Math.max` is equivalent to round to nearest, not round up.
xn = Math.min(xn, a / xn);
return xn;
}
}
Except that:
- The log2 used in the Sqrt function is a bit more optimized as it ignore the least significant bit, once the result will be divided by 2 anyway.
- The
Math.min(xn, a / xn)
is optimized once the difference betweenxn
anda / xn
is either 0 or 1, so it can be calculated asxn -= SafeCast.toUint(xn > a/xn)
;
This is almost exactly what we used to have
It got changed in a long and painfull (to review) PR. This included documenting a lot of math to prove (and not just feel) that the new version is correct (in addition to being cheaper).
If we are to go back on sqrt, I'll do it in a separate issue/PR. I'd rather split the review effort and the frustration. I really see a point to merging min/max decently fast ... and I fear sqrt will take much more discussion.
@Amxx makes sense, I removed the sqrt changes, I also removed all assembly code maintaining the same gas efficiency to min
and max
methods.
I made it a bit more generic by introducing the choice
function, which is the core principle on how the branchless min/max works, it can be used to replace any simple ternary operations.
Because it is a new method, I incremented the minor
instead of the patch
, but if you think we should not expose this function, I can make it private instead internal, even so I think it is quite useful.
Because it is a new method, I incremented the minor instead of the patch.
Anythink that is not a security fix goes into a minor anyway.
but if you think we should not expose this function, I can make the choice private, even so I think it is quite useful.
We'll discuss that.
Thank you for your effort ... though please don't be frustrated if this takes time to merge. Our processes are ususally considered slow by external contributors, but we fell that is how we achieve maximum quality/security.
When you do a ternary choice, the compiler can "optimize" it so that only one branch is computer. However, when you use a dedicated function, values for both side must be computed. This remove the opportunity for lazy evaluation.
I can make the select
method private or use it only in simple ternary operations, if we are unsure about expose it or not.
I don't think it should be private. We want to be able to use it in different places/libraries without re-declaring it.
I think it comes down to documentation that it is not always cheaper than a ternary, and that both branches are always evaluate.
Catching up with this. I'd like to highlight that @Lohann started this looking for constant cost in math operations. Although it's not the kind of priorities OpenZeppelin Contracts has, it's an honest requirement. I also agree with @Amxx with that we prefer more readable code.
I would consider creating an issue for branchless math if this requirement of constant costs ever comes back.
I'll be leaving comments while I read the conversation.
When you do a ternary choice, the compiler can "optimize" it so that only one branch is computer. However, when you use a dedicated function, values for both side must be computed. This remove the opportunity for lazy evaluation.
Is this true? I see there's an open issue about it: https://github.com/ethereum/solidity/issues/12930
Congrats, your important contribution to this open-source project has earned you a GitPOAP!
GitPOAP: 2024 OpenZeppelin Contracts Contributor:
data:image/s3,"s3://crabby-images/755df/755df8c60dc6211bf0f90be00d46e6a001775969" alt="GitPOAP: 2024 OpenZeppelin Contracts Contributor GitPOAP Badge"
Head to gitpoap.io & connect your GitHub account to mint!
Learn more about GitPOAPs here.