Optimize `log2` with a lookup table
23% more efficient Log2 method
Follow an efficient implementation of floor(log2(x)) which uses ~23% less gas than the current implementation.
log2 is quite central for many other math operations, such as:
sqrtuses log2 internally (even so the currentsqrtuses an specialized log2 implementation).- Count leading zeros:
255 - log2(value) + toUint(value == 0) - Estimate how many bytes or bits are necessary to store a given number
- I use it a lot to perform Float-Point arithmetic in this library.
This implementation uses lookup tables to calculate floor(log2(x)) using 218 gas, which is 65 gas units less than the current implementation.
Obs: the code in this PR have a more detailed documentation on how the algorithm works:
function log2(uint256 x) internal pure returns (uint256) {
unchecked {
// Round `x` down to the closest power of 2 using the Seander's algorithm.
// Reference: https://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x |= x >> 32;
x |= x >> 64;
x |= x >> 128;
// Note `x = 0` results in 1 here, but that's ok once floor(log2(0)) == floor(log2(1)) anyway.
x = (x >> 1) + 1;
uint256 prod0;
{
// Compute `n = x mod 255`, the resulting `n` can only be one of the following
// values: 1, 2, 4, 8, 16, 32, 64 or 128.
uint256 n = x % 255;
// compute `n % 11`, this maps all `n` values to an unique index between 0~31.
uint256 index = n % 11;
// Perform a table lookup, the table stores the result of `log2(n)` in
// the corresponding byte index.
assembly {
prod0 := byte(index, 0x0000010002040007030605000000000000000000000000000000000000000000)
}
}
// Compute `prod1 = log2(x / n)`, once `prod1` is always multiple of 8, we can
// use an interesting lookup technique: (x / n * table) >> 248 == log(x / n)
uint256 prod1 = x >> prod0;
prod1 *= 0x0008101820283038404850586068707880889098a0a8b0b8c0c8d0d8e0e8f0f8;
prod1 >>= 248;
// log2(x) = log2(x / n) + log2(n) = prod0 + prod1
return prod0 + prod1;
}
}
Lookup tables are quite powerful and should be explored more in Solidity, I use it in another project for validate 6 different method selectors in constant gas, using less than 100 gas units, way smaller and more efficient than the nested ifs or binary search generated by default by the Solidity Compiler:
https://github.com/Lohann/universal-factory/blob/e6e3e6acc3f3b8a9bf0d1495c0f8fdb47d88a801/src/UniversalFactory.sol#L305-L346
PR Checklist
- [x] Tests
- [x] Documentation
- [x] Changeset entry (run
npx changeset add)
⚠️ No Changeset found
Latest commit: 111a13f47b12deabbfcb4b96f160d4a3aecdb3eb
Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.
This PR includes no changesets
When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types
Click here to learn what changesets are, and how to add one.
Click here if you're a maintainer who wants to add a changeset to this PR
@Lohann could you take a look at https://github.com/Lohann/openzeppelin-contracts/pull/1? cross sharing:
We can do a much more efficient and simple implementation by employing a binary search to quickly narrow down the most significant bit, followed by a small lookup table for the final 4 bits.
It's more performant as we avoid using multiple lookup tables and modulo.
Great work @cairoeth ! your solution is cleaner, cheaper and generates a smaller binary, will update this PR with your proposal.
@Lohann feel free to merge my PR into your branch, which should update this PR. thanks!
Hi guys, sorry for the late reply got super busy here, but this seems amazing, thanks @cairoeth and @Amxx, this solutions is cleaner and better than my initial idea, and considering the optimal code I implemented here, it both minimizes the bytecode size and gas used 🎉