openzeppelin-contracts icon indicating copy to clipboard operation
openzeppelin-contracts copied to clipboard

Optimize `log2` with a lookup table

Open Lohann opened this issue 1 year ago • 4 comments

23% more efficient Log2 method

Follow an efficient implementation of floor(log2(x)) which uses ~23% less gas than the current implementation.

log2 is quite central for many other math operations, such as:

  • sqrt uses log2 internally (even so the current sqrt uses an specialized log2 implementation).
  • Count leading zeros: 255 - log2(value) + toUint(value == 0)
  • Estimate how many bytes or bits are necessary to store a given number
  • I use it a lot to perform Float-Point arithmetic in this library.

This implementation uses lookup tables to calculate floor(log2(x)) using 218 gas, which is 65 gas units less than the current implementation. Obs: the code in this PR have a more detailed documentation on how the algorithm works:

function log2(uint256 x) internal pure returns (uint256) {
    unchecked {
        // Round `x` down to the closest power of 2 using the Seander's algorithm.
        // Reference: https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
        x |= x >> 1;
        x |= x >> 2;
        x |= x >> 4;
        x |= x >> 8;
        x |= x >> 16;
        x |= x >> 32;
        x |= x >> 64;
        x |= x >> 128;
        // Note `x = 0` results in 1 here, but that's ok once floor(log2(0)) == floor(log2(1)) anyway.
        x = (x >> 1) + 1;

        uint256 prod0;
        {
            // Compute `n = x mod 255`, the resulting `n` can only be one of the following 
            // values: 1, 2, 4, 8, 16, 32, 64 or 128.
            uint256 n = x % 255;

            // compute `n % 11`, this maps all `n` values to an unique index between 0~31.
            uint256 index = n % 11;

            // Perform a table lookup, the table stores the result of `log2(n)` in
            // the corresponding byte index.
            assembly {
                prod0 := byte(index, 0x0000010002040007030605000000000000000000000000000000000000000000)
            }
        }

        // Compute `prod1 = log2(x / n)`, once `prod1` is always multiple of 8, we can
        // use an interesting lookup technique:  (x / n * table) >> 248 == log(x / n)
        uint256 prod1 = x >> prod0;
        prod1 *= 0x0008101820283038404850586068707880889098a0a8b0b8c0c8d0d8e0e8f0f8;
        prod1 >>= 248;

        // log2(x) = log2(x / n) + log2(n) = prod0 + prod1
        return prod0 + prod1;
    }
}

Lookup tables are quite powerful and should be explored more in Solidity, I use it in another project for validate 6 different method selectors in constant gas, using less than 100 gas units, way smaller and more efficient than the nested ifs or binary search generated by default by the Solidity Compiler: https://github.com/Lohann/universal-factory/blob/e6e3e6acc3f3b8a9bf0d1495c0f8fdb47d88a801/src/UniversalFactory.sol#L305-L346

PR Checklist

  • [x] Tests
  • [x] Documentation
  • [x] Changeset entry (run npx changeset add)

Lohann avatar Oct 03 '24 04:10 Lohann

⚠️ No Changeset found

Latest commit: 111a13f47b12deabbfcb4b96f160d4a3aecdb3eb

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

changeset-bot[bot] avatar Oct 03 '24 04:10 changeset-bot[bot]

@Lohann could you take a look at https://github.com/Lohann/openzeppelin-contracts/pull/1? cross sharing:

We can do a much more efficient and simple implementation by employing a binary search to quickly narrow down the most significant bit, followed by a small lookup table for the final 4 bits.

It's more performant as we avoid using multiple lookup tables and modulo.

cairoeth avatar Oct 16 '24 16:10 cairoeth

Great work @cairoeth ! your solution is cleaner, cheaper and generates a smaller binary, will update this PR with your proposal.

Lohann avatar Oct 17 '24 15:10 Lohann

@Lohann feel free to merge my PR into your branch, which should update this PR. thanks!

cairoeth avatar Oct 17 '24 15:10 cairoeth

Hi guys, sorry for the late reply got super busy here, but this seems amazing, thanks @cairoeth and @Amxx, this solutions is cleaner and better than my initial idea, and considering the optimal code I implemented here, it both minimizes the bytecode size and gas used 🎉

Lohann avatar Oct 29 '24 15:10 Lohann