solidity icon indicating copy to clipboard operation
solidity copied to clipboard

Optimize ternary expressions to branchless code when cheaper.

Open transmissions11 opened this issue 3 years ago • 5 comments

Abstract

The ability to cast bools to uint256 or uint8.

Motivation

Branchless optimizations are becoming more and more common to see lately, but often they require dropping down into inline assembly since their efficiency depends on being able to treat a bool as a uint that resolves to either 1 in the true case or 0 if false. Solidity should support performing these conversions without jumps or branches as bools are already represented this way under the hood. Requiring devs to drop into assembly is presenting them with a bunch of extra footguns to hurt themselves with, adding this ability to the high level language will make Solidity safer for devs and make these optimization techniques more readable.

Specification

I am not opinionated as to whether bools would have to be cast to uint8s or uint256s assuming they use the same amount of gas. I will demonstrate both possible options:

bool status = false;
assert(uint256(status) == 0);
bool status = true;
assert(uint8(status) == uint8(1));

Backwards Compatibility

I don't think this would introduce any backwards compatibility issues, but please correct me if so!

transmissions11 avatar Apr 13 '22 03:04 transmissions11

Rather than allowing this cast, we'd prefer working towards improving code generation and the optimizer to have status ? 1 : 0 result in branchless code to begin with, i.e. have the solidity optimizer perform the branchless optimization you want on its own. That would help much more generally.

Do you happen to have some examples at hand where such a cast is performed in inline assembly?

ekpyron avatar Apr 13 '22 08:04 ekpyron

This function is a nice example, the assembly block exists only because we want to go from lt(-,-) to 0 or 1 efficiently:

    function ilog2(uint256 x) internal returns (uint256 r) {
        assembly {
            r := shl(7, lt(0xffffffffffffffffffffffffffffffff, x))
            r := or(r, shl(6, lt(0xffffffffffffffff, shr(r, x))))
            r := or(r, shl(5, lt(0xffffffff, shr(r, x))))
            r := or(r, shl(4, lt(0xffff, shr(r, x))))
            r := or(r, shl(3, lt(0xff, shr(r, x))))
            r := or(r, shl(2, lt(0xf, shr(r, x))))
            r := or(r, shl(1, lt(0x3, shr(r, x))))
            r := or(r, lt(0x1, shr(r, x)))
        }
    }

It would be great if this could be rewritten at same or better gas cost as:

       r  = ((0xffffffffffffffffffffffffffffffff < x) ? 1 : 0) << 7;
       r |= ((0xffffffffffffffff < (x >> r)) ? 1 : 0) << 6;
       r |= ((0xffffffff < (x >> r)) ? 1 : 0) << 5;
       r |= ((0xffff < (x >> r)) ? 1 : 0) << 4;
       // ...

It would be awesome if it could be rewritten at same or better gas cost as:

       r  = (0xffffffffffffffffffffffffffffffff < x) ? 127 : 0;
       r |= (0xffffffffffffffff < (x >> r)) ? 64 : 0;
       r |= (0xffffffff < (x >> r)) ? 32 : 0;
       r |= (0xffff < (x >> r)) ? 8 : 0;
       // ...

But that requires a more complex optimization rule where the compiler realizes it can turn ? 2**n : 0 into ? 1 : 0 with a cheap bitshift.

recmo avatar Apr 13 '22 18:04 recmo

Think about the last rule some more. If the optimization pass turns x ? a : b into (x ? a: 1) * (a - b) + b and existing optimization passes elliminate + 0 and reduce * 2**n into << n than this would be taken care of!

recmo avatar Apr 13 '22 18:04 recmo

I'm repurposing this issue for the optimization that would make the original suggestion obsolete.

ekpyron avatar Sep 14 '22 15:09 ekpyron

I introduced some branchless optimizations to openzeppelin, would be nice to have this at compiler level: https://github.com/OpenZeppelin/openzeppelin-contracts/pull/4976

Generic Branchless ternary operation, takes 3 inputs a b cond and returns cond ? a : b:

PUSH1 0xaa // a
PUSH1 0xbb // b a
PUSH1 0x00 // cond b a

// make sure condition is either 0 or 1
ISZERO

SWAP1      // a cond b
DUP3       // a b cond a
XOR        // a^b cond a
MUL        // a^b*cond a
XOR

If a or b is statically known to be equals zero, it can be simplified to mul(cond, a) or mul(iszero(cond), b)

Lohann avatar Jun 18 '24 22:06 Lohann