openzeppelin-contracts icon indicating copy to clipboard operation
openzeppelin-contracts copied to clipboard

Branchless choice, min and max methods

Open Lohann opened this issue 10 months ago • 5 comments

Description

Follow a Branchless version of Log2, Square Root, Min and Max functions, all of them consumes less gas than the current implementation, and have a constant gas cost independent of the input value, the square root gas saving is actually due an better and also branchless implementation of the log2 function.

Motivation

I'm working in the Analog Timechain, which is a cross-chain protocol that allow interoperability between different blockchains.

At some point our protocol needs for a given input calculate the exact execution gas cost, but this was hard due a lot of branching, branching that should not even exist, for example the Math.min function use more or less gas when a < b or a >= b, either compiling in debug and optimized mode:

pragma solidity >=0.7.0 <0.9.0;

contract Math {
    function min(uint256 a, uint256 b) external pure returns (uint256 result) {
        result = a < b ? a : b;
    }
}

a < b consumes 306 gas a >= b consumes 316 gas

This was annoying because for every change in the contract I had to dive into the EVM assembly to understand how solidity generate different branches to understand how it affect the gas used. So I decided to take a different approach and make everything constant gas cost, as result I implemented the BranchlessMath.sol library, which have a constant gas cost for any input, and surprisingly I also noticed significant gas savings using those methods compared to the OpenZeppeling's Math library, so I decided to make it available to everyone by contributing here.

In comparison the branchless version reduces the final bytecode size, and also consumes less gas:

contract Math {
    function min(uint256 a, uint256 b) external pure returns (uint256 result) {
        assembly ("memory-safe") {
            // min(a,b) = b ^ ((a ^ b) * (a < b))
            result := xor(b, mul(xor(a, b), lt(a, b)))
        }
    }
}

Always consumes 291 gas, independently if a < b or a >= b.

PR Checklist

  • [x] Tests
  • [x] Documentation
  • [x] Changeset entry (run npx changeset add)

Lohann avatar Mar 27 '24 01:03 Lohann

🦋 Changeset detected

Latest commit: dc6abbc43225ab99de93a871533852dc8441541c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
openzeppelin-solidity Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

changeset-bot[bot] avatar Mar 27 '24 01:03 changeset-bot[bot]

@Amxx thanks for the feedback, will remove some assembly at the cost of a consuming a bit more gas, but essentially what the sqrt does is:

function sqrt(uint256 a) public pure returns (uint256) {
  unchecked {
    // Approximate to the closest power of 2 of the square root. This is equivalent to set the MSB of
    // the square root, this reduces Netwon's Method iterations required to get the final result.
    uint256 xn = 2 ** (Math.log2(a) / 2);
    
    // 7 iterations of Netwon's method
    xn = (xn + a / xn) >> 1;
    xn = (xn + a / xn) >> 1;
    xn = (xn + a / xn) >> 1;
    xn = (xn + a / xn) >> 1;
    xn = (xn + a / xn) >> 1;
    xn = (xn + a / xn) >> 1; 
    xn = (xn + a / xn) >> 1;
    
    // Round down
    // Obs: Using `Math.max` is equivalent to round to nearest, not round up.
    xn = Math.min(xn, a / xn);
    
    return xn;
  }
}

Except that:

  • The log2 used in the Sqrt function is a bit more optimized as it ignore the least significant bit, once the result will be divided by 2 anyway.
  • The Math.min(xn, a / xn) is optimized once the difference between xn and a / xn is either 0 or 1, so it can be calculated as xn -= SafeCast.toUint(xn > a/xn);

Lohann avatar Mar 27 '24 22:03 Lohann

This is almost exactly what we used to have

It got changed in a long and painfull (to review) PR. This included documenting a lot of math to prove (and not just feel) that the new version is correct (in addition to being cheaper).

If we are to go back on sqrt, I'll do it in a separate issue/PR. I'd rather split the review effort and the frustration. I really see a point to merging min/max decently fast ... and I fear sqrt will take much more discussion.

Amxx avatar Mar 28 '24 09:03 Amxx

@Amxx makes sense, I removed the sqrt changes, I also removed all assembly code maintaining the same gas efficiency to min and max methods.

I made it a bit more generic by introducing the choice function, which is the core principle on how the branchless min/max works, it can be used to replace any simple ternary operations. Because it is a new method, I incremented the minor instead of the patch, but if you think we should not expose this function, I can make it private instead internal, even so I think it is quite useful.

Lohann avatar Mar 28 '24 14:03 Lohann

Because it is a new method, I incremented the minor instead of the patch.

Anythink that is not a security fix goes into a minor anyway.

but if you think we should not expose this function, I can make the choice private, even so I think it is quite useful.

We'll discuss that.

Thank you for your effort ... though please don't be frustrated if this takes time to merge. Our processes are ususally considered slow by external contributors, but we fell that is how we achieve maximum quality/security.

Amxx avatar Mar 28 '24 14:03 Amxx

When you do a ternary choice, the compiler can "optimize" it so that only one branch is computer. However, when you use a dedicated function, values for both side must be computed. This remove the opportunity for lazy evaluation.

I can make the select method private or use it only in simple ternary operations, if we are unsure about expose it or not.

Lohann avatar Apr 08 '24 13:04 Lohann

I don't think it should be private. We want to be able to use it in different places/libraries without re-declaring it.

I think it comes down to documentation that it is not always cheaper than a ternary, and that both branches are always evaluate.

Amxx avatar Apr 08 '24 14:04 Amxx

Catching up with this. I'd like to highlight that @Lohann started this looking for constant cost in math operations. Although it's not the kind of priorities OpenZeppelin Contracts has, it's an honest requirement. I also agree with @Amxx with that we prefer more readable code.

I would consider creating an issue for branchless math if this requirement of constant costs ever comes back.

I'll be leaving comments while I read the conversation.

ernestognw avatar Apr 22 '24 16:04 ernestognw

When you do a ternary choice, the compiler can "optimize" it so that only one branch is computer. However, when you use a dedicated function, values for both side must be computed. This remove the opportunity for lazy evaluation.

Is this true? I see there's an open issue about it: https://github.com/ethereum/solidity/issues/12930

ernestognw avatar Apr 22 '24 17:04 ernestognw

Congrats, your important contribution to this open-source project has earned you a GitPOAP!

GitPOAP: 2024 OpenZeppelin Contracts Contributor:

GitPOAP: 2024 OpenZeppelin Contracts Contributor GitPOAP Badge

Head to gitpoap.io & connect your GitHub account to mint!

Learn more about GitPOAPs here.

gitpoap-bot[bot] avatar Apr 23 '24 12:04 gitpoap-bot[bot]