solidity-bytes-utils
solidity-bytes-utils copied to clipboard
Use keccak to compare arrays
BytesLib.equal
and BytesLib.equal_nonAligned
can be replaced with a much simpler, more gas efficient and purely Solidity (YUL probably won't bring any benefits here): return _preBytes.length == _postBytes.length && keccak256(_preBytes) == keccak256(_postBytes);
.
The idea for the library was to provide a different alternative to the hashing option. That is dependent on the hash implementation versus having computation done on each of the bytes directly. I don't think I'd ever mask a simple hash comparison behind a lib, too.
In addition, have you tested and made sure it is more gas-efficient?
The idea for the library was to provide a different alternative to the hashing option. That is dependent on the hash implementation versus having computation done on each of the bytes directly. I don't think I'd ever mask a simple hash comparison behind a lib, too.
That's interesting, what for and why? Is hashing insecure?
Regarding gas usage, hashing costs 30 gas + 6 per 32-byte word, there's nothing to benchmark really, hashing is probably the cheapest thing you can do with an array of bytes in memory.
That's interesting, what for and why? Is hashing insecure?
Not really, but we're just giving optionality, I guess. I'm not afraid the keccak opcode is badly implemented in Geth.
But that's just the gas usage of the opcode. What about changing the memory pointer, filling the memory in the scratch space, etc.? Because my code does everything in place, we don't even touch the 0x0 - 0x40 scratch space like the Solidity compiler does to hash stuff together. What if you want to hash something that is bigger than 32 bytes? You need to copy multiple times. I'd say that the bigger the array, the more efficient my function is compared to hashing stuff together.
This being said, I never benchmarked it, so you may be right. I'd love to see some data to believe the statement, though.
Keccak works in-place, on an arbitrary memory range, check https://www.evm.codes. The scratch space is used mostly for calculating the storage slot indexes, because it involves repeated hashing of pairs of 32-byte words taken from the stack.
It's your project, so I can't tell you what to do, you'll do whatever you choose. I don't even use your code as a dependency, so I won't see the benefits of any optimization. All I'm saying is that you should check out hashing because it's a fairly popular strategy for reducing gas usage. Plus you'll drop a lot of complex Yul code.
I just benchmarked it, and you're totally right. Leaving the data here for posterity.
This is for a 4-byte-long array comparison with Keccak:
This is for the same comparison with my function:
Trying longer lengths shows that my function actually scales just a smidge slightly worse than the Keccak comparison.
With this being said, if the lengths do not match, my function is O(1) and performs better than the Keccak counterpart in any case.
To be honest, I've done this such a long time ago that I don't even know what I was thinking now, about memory manipulation of the Keccak opcode. It probably had something to do with the storage slot calculations, like you mentioned. 🤷♂️
Although this is all true, I still don't think I should change this function. First, because if using the Keccak version is what people want to do, using it through a library makes no sense since it is a one-liner. Second, because I've made this lib such a long time ago (and it's still consumed by some major projects) that I'm sincerely afraid to change any implementations at this point (e.g., that's why I merged the PR for the non aligned comparison as a new function instead of adding to the previous one). Third, because I like how this still serves as a well-commented, teaching example of the inner workings of the EVM. Fourth, because it's another option. 😄
But I sincerely thank you for bringing this up! 🙏 At least now we have some comparisons and options. Also happy to keep this issue open so people notice it when arriving at the repo. 👍
With this being said, if the lengths do not match, my function is O(1) and performs better than the Keccak counterpart in any case.
From the very first proposed version in this issue hashing is done only if the lengths are different. Omitting this path doesn't yield meaningful results.
I couldn't believe it so I wrote a benchmark too, here are the results. Hashing has a predictable cost which is a useful property on blockchains, and is usually faster, often by a lot. The only edge case are arrays of over 256 bytes where the very first byte is different, for which the loop-based implementation has the O(1) cost, so at some point it gets better than hashing.
Source
pragma solidity ^0.8.24;
import {console, Test} from "forge-std/Test.sol";
contract EqualTest is Test {
function testBenchEqual() public view {
for(uint i = 1; i < 4096; i *= 2) {
_benchForLength(i);
}
}
function _benchForLength(uint256 length) internal view {
bytes memory data = new bytes(length);
for(uint256 i = 0; i < length; i++) {
data[i] = bytes1(uint8(block.prevrandao + i));
}
console.log("Same for length", length);
this.benchEqual(data, data);
bytes memory differentFirst = bytes.concat(data);
differentFirst[0] ^= 0xff;
console.log("Different first for length", length);
this.benchEqual(data, differentFirst);
bytes memory differentLast = bytes.concat(data);
differentLast[differentLast.length -1] ^= 0xff;
console.log("Different last for length", length);
this.benchEqual(data, differentLast);
bytes memory differentLength = bytes.concat(data, bytes1(0));
console.log("Different length for length", length);
this.benchEqual(data, differentLength);
console.log("-----------------------");
console.log("");
}
function benchEqual(bytes memory preBytes, bytes memory postBytes) external view {
uint256 gasEqual = gasleft();
bool equalResult = BytesLib.equal(preBytes, postBytes);
gasEqual -= gasleft();
uint256 gasEqualNonAligned = gasleft();
bool equalNonAlignedResult = BytesLib.equal_nonAligned(preBytes, postBytes);
gasEqualNonAligned -= gasleft();
uint256 gasEqualHash = gasleft();
bool equalHashResult = BytesLib.equalHash(preBytes, postBytes);
gasEqualHash -= gasleft();
if(equalResult != equalNonAlignedResult) console.log("DIFFERENT ALIGNED AND NON ALIGNED!");
assertEq(equalResult, equalHashResult, "Invalid hash result");
console.log("Gas equal: ", gasEqual);
console.log("Gas equal non aligned:", gasEqualNonAligned);
console.log("Gas equal hash :", gasEqualHash);
console.log("");
}
}
library BytesLib {
function equalHash(bytes memory _preBytes, bytes memory _postBytes) internal pure returns (bool) {
return _preBytes.length == _postBytes.length && keccak256(_preBytes) == keccak256(_postBytes);
}
function equal(bytes memory _preBytes, bytes memory _postBytes) internal pure returns (bool) {
bool success = true;
assembly {
let length := mload(_preBytes)
// if lengths don't match the arrays are not equal
switch eq(length, mload(_postBytes))
case 1 {
// cb is a circuit breaker in the for loop since there's
// no said feature for inline assembly loops
// cb = 1 - don't breaker
// cb = 0 - break
let cb := 1
let mc := add(_preBytes, 0x20)
let end := add(mc, length)
for {
let cc := add(_postBytes, 0x20)
// the next line is the loop condition:
// while(uint256(mc < end) + cb == 2)
} eq(add(lt(mc, end), cb), 2) {
mc := add(mc, 0x20)
cc := add(cc, 0x20)
} {
// if any of these checks fails then arrays are not equal
if iszero(eq(mload(mc), mload(cc))) {
// unsuccess:
success := 0
cb := 0
}
}
}
default {
// unsuccess:
success := 0
}
}
return success;
}
function equal_nonAligned(bytes memory _preBytes, bytes memory _postBytes) internal pure returns (bool) {
bool success = true;
assembly {
let length := mload(_preBytes)
// if lengths don't match the arrays are not equal
switch eq(length, mload(_postBytes))
case 1 {
// cb is a circuit breaker in the for loop since there's
// no said feature for inline assembly loops
// cb = 1 - don't breaker
// cb = 0 - break
let cb := 1
let endMinusWord := add(_preBytes, length)
let mc := add(_preBytes, 0x20)
let cc := add(_postBytes, 0x20)
for {
// the next line is the loop condition:
// while(uint256(mc < endWord) + cb == 2)
} eq(add(lt(mc, endMinusWord), cb), 2) {
mc := add(mc, 0x20)
cc := add(cc, 0x20)
} {
// if any of these checks fails then arrays are not equal
if iszero(eq(mload(mc), mload(cc))) {
// unsuccess:
success := 0
cb := 0
}
}
// Only if still successful
// For <1 word tail bytes
if gt(success, 0) {
// Get the remainder of length/32
// length % 32 = AND(length, 32 - 1)
let numTailBytes := and(length, 0x1f)
let mcRem := mload(mc)
let ccRem := mload(cc)
for {
let i := 0
// the next line is the loop condition:
// while(uint256(i < numTailBytes) + cb == 2)
} eq(add(lt(i, numTailBytes), cb), 2) {
i := add(i, 1)
} {
if iszero(eq(byte(i, mcRem), byte(i, ccRem))) {
// unsuccess:
success := 0
cb := 0
}
}
}
}
default {
// unsuccess:
success := 0
}
}
return success;
}
}
Results for via-ir=false and 200 optimizer runs
Same for length 1
Gas equal: 289
Gas equal non aligned: 368
Gas equal hash : 208
Different first for length 1
Gas equal: 305
Gas equal non aligned: 384
Gas equal hash : 208
Different last for length 1
Gas equal: 305
Gas equal non aligned: 384
Gas equal hash : 208
Different length for length 1
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 2
Gas equal: 289
Gas equal non aligned: 455
Gas equal hash : 208
Different first for length 2
Gas equal: 305
Gas equal non aligned: 384
Gas equal hash : 208
Different last for length 2
Gas equal: 305
Gas equal non aligned: 471
Gas equal hash : 208
Different length for length 2
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 4
Gas equal: 289
Gas equal non aligned: 629
Gas equal hash : 208
Different first for length 4
Gas equal: 305
Gas equal non aligned: 384
Gas equal hash : 208
Different last for length 4
Gas equal: 305
Gas equal non aligned: 645
Gas equal hash : 208
Different length for length 4
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 8
Gas equal: 289
Gas equal non aligned: 977
Gas equal hash : 208
Different first for length 8
Gas equal: 305
Gas equal non aligned: 384
Gas equal hash : 208
Different last for length 8
Gas equal: 305
Gas equal non aligned: 993
Gas equal hash : 208
Different length for length 8
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 16
Gas equal: 289
Gas equal non aligned: 1673
Gas equal hash : 208
Different first for length 16
Gas equal: 305
Gas equal non aligned: 384
Gas equal hash : 208
Different last for length 16
Gas equal: 305
Gas equal non aligned: 1689
Gas equal hash : 208
Different length for length 16
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 32
Gas equal: 289
Gas equal non aligned: 281
Gas equal hash : 208
Different first for length 32
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 305
Gas equal non aligned: 281
Gas equal hash : 208
Different last for length 32
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 305
Gas equal non aligned: 281
Gas equal hash : 208
Different length for length 32
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 64
Gas equal: 382
Gas equal non aligned: 374
Gas equal hash : 220
Different first for length 64
Gas equal: 305
Gas equal non aligned: 325
Gas equal hash : 220
Different last for length 64
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 398
Gas equal non aligned: 374
Gas equal hash : 220
Different length for length 64
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 128
Gas equal: 568
Gas equal non aligned: 560
Gas equal hash : 244
Different first for length 128
Gas equal: 305
Gas equal non aligned: 325
Gas equal hash : 244
Different last for length 128
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 584
Gas equal non aligned: 560
Gas equal hash : 244
Different length for length 128
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 256
Gas equal: 940
Gas equal non aligned: 932
Gas equal hash : 292
Different first for length 256
Gas equal: 305
Gas equal non aligned: 325
Gas equal hash : 292
Different last for length 256
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 956
Gas equal non aligned: 932
Gas equal hash : 292
Different length for length 256
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 512
Gas equal: 1684
Gas equal non aligned: 1676
Gas equal hash : 388
Different first for length 512
Gas equal: 305
Gas equal non aligned: 325
Gas equal hash : 388
Different last for length 512
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 1700
Gas equal non aligned: 1676
Gas equal hash : 388
Different length for length 512
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 1024
Gas equal: 3172
Gas equal non aligned: 3164
Gas equal hash : 580
Different first for length 1024
Gas equal: 305
Gas equal non aligned: 325
Gas equal hash : 580
Different last for length 1024
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 3188
Gas equal non aligned: 3164
Gas equal hash : 580
Different length for length 1024
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Same for length 2048
Gas equal: 6148
Gas equal non aligned: 6140
Gas equal hash : 964
Different first for length 2048
Gas equal: 305
Gas equal non aligned: 325
Gas equal hash : 964
Different last for length 2048
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 6164
Gas equal non aligned: 6140
Gas equal hash : 964
Different length for length 2048
Gas equal: 140
Gas equal non aligned: 140
Gas equal hash : 95
-----------------------
Results for via-ir=true and 200 optimizer runs
Same for length 1
Gas equal: 266
Gas equal non aligned: 369
Gas equal hash : 211
Different first for length 1
Gas equal: 294
Gas equal non aligned: 397
Gas equal hash : 211
Different last for length 1
Gas equal: 294
Gas equal non aligned: 397
Gas equal hash : 211
Different length for length 1
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 2
Gas equal: 266
Gas equal non aligned: 457
Gas equal hash : 211
Different first for length 2
Gas equal: 294
Gas equal non aligned: 397
Gas equal hash : 211
Different last for length 2
Gas equal: 294
Gas equal non aligned: 485
Gas equal hash : 211
Different length for length 2
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 4
Gas equal: 266
Gas equal non aligned: 633
Gas equal hash : 211
Different first for length 4
Gas equal: 294
Gas equal non aligned: 397
Gas equal hash : 211
Different last for length 4
Gas equal: 294
Gas equal non aligned: 661
Gas equal hash : 211
Different length for length 4
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 8
Gas equal: 266
Gas equal non aligned: 985
Gas equal hash : 211
Different first for length 8
Gas equal: 294
Gas equal non aligned: 397
Gas equal hash : 211
Different last for length 8
Gas equal: 294
Gas equal non aligned: 1013
Gas equal hash : 211
Different length for length 8
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 16
Gas equal: 266
Gas equal non aligned: 1689
Gas equal hash : 211
Different first for length 16
Gas equal: 294
Gas equal non aligned: 397
Gas equal hash : 211
Different last for length 16
Gas equal: 294
Gas equal non aligned: 1717
Gas equal hash : 211
Different length for length 16
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 32
Gas equal: 266
Gas equal non aligned: 281
Gas equal hash : 211
Different first for length 32
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 294
Gas equal non aligned: 281
Gas equal hash : 211
Different last for length 32
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 294
Gas equal non aligned: 281
Gas equal hash : 211
Different length for length 32
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 64
Gas equal: 360
Gas equal non aligned: 387
Gas equal hash : 223
Different first for length 64
Gas equal: 294
Gas equal non aligned: 342
Gas equal hash : 223
Different last for length 64
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 388
Gas equal non aligned: 387
Gas equal hash : 223
Different length for length 64
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 128
Gas equal: 548
Gas equal non aligned: 599
Gas equal hash : 247
Different first for length 128
Gas equal: 294
Gas equal non aligned: 342
Gas equal hash : 247
Different last for length 128
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 576
Gas equal non aligned: 599
Gas equal hash : 247
Different length for length 128
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 256
Gas equal: 924
Gas equal non aligned: 1023
Gas equal hash : 295
Different first for length 256
Gas equal: 294
Gas equal non aligned: 342
Gas equal hash : 295
Different last for length 256
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 952
Gas equal non aligned: 1023
Gas equal hash : 295
Different length for length 256
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 512
Gas equal: 1676
Gas equal non aligned: 1871
Gas equal hash : 391
Different first for length 512
Gas equal: 294
Gas equal non aligned: 342
Gas equal hash : 391
Different last for length 512
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 1704
Gas equal non aligned: 1871
Gas equal hash : 391
Different length for length 512
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 1024
Gas equal: 3180
Gas equal non aligned: 3567
Gas equal hash : 583
Different first for length 1024
Gas equal: 294
Gas equal non aligned: 342
Gas equal hash : 583
Different last for length 1024
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 3208
Gas equal non aligned: 3567
Gas equal hash : 583
Different length for length 1024
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
Same for length 2048
Gas equal: 6188
Gas equal non aligned: 6959
Gas equal hash : 967
Different first for length 2048
Gas equal: 294
Gas equal non aligned: 342
Gas equal hash : 967
Different last for length 2048
DIFFERENT ALIGNED AND NON ALIGNED!
Gas equal: 6216
Gas equal non aligned: 6959
Gas equal hash : 967
Different length for length 2048
Gas equal: 74
Gas equal non aligned: 115
Gas equal hash : 87
-----------------------
The runs marked as DIFFERENT ALIGNED AND NON ALIGNED!
have equal
and equal_nonAligned
yielding different results. I think that regardless of switching to hashing, you should completely replace the equal
function with equal_nonAlighed
, the used algorithm just returns invalid values. Its not even a good teaching example, it's a bug, maybe the auditors will learn a common pitfall, but that's all. Keeping it as it is doesn't help the downstream, it leaves buggy code unpatched and makes the obvious choice wrong, being correct should not be the special case.
I mean, the normal function works as long as you use vanilla Solidity or don't use Yul to mess with the memory pointer manually. That was always its intended use case. In fact its first version was reviewed together with the compiler team. I'm guessing only in situations where the array is non-aligned to 32-byte slots, you got the different values? If not, that actually is signalling a bug in one of the functions, and I'd love it if you could share the test code with me so I could debug.
I could consider changing the non-aligned function to become the main one, though, given we don't care about the gas savings.
Was I right about the above or should I be worried about a bug in the functions?
Also, please note this library is 7 years old and, again, I am terribly afraid of breaking shit downstream at this point. 🙏
share the test code with me so I could debug
It's the code I shared above, just run it with forge test -vv
and see which cases print DIFFERENT ALIGNED AND NON ALIGNED!
. It seems that it happens when the arrays lengths are a multiple of 32, and the only difference between them is the last byte. Unless via-ir
is used, then sometimes even a different first byte triggers the error.
Sorry! 😅🙈 Completely ignored the “Source” tab. I’ll take a look! 🙏
I found the bug. In equal_nonAligned
numTailBytes
should never be 0, for cases where it is, it should be set to 32. This is because endMinusWord
always leaves 1-32 unchecked bytes, never 0.
IMO this function is extremely overcomplicated, even forgetting about hashing for a moment. Instead of having a special routine for comparing the partial last word, you could just replace the initial let mc := add(_preBytes, 0x20)
with let mc := add(_preBytes, and(mload(_preBytes), 0x1F))
and replace endMinusWord
with the regular let end := add(_preBytes, add(length, 32))
. This way you will check part of the length word twice, but it won't affect the result. The cb
variable can also be dropped, it's always the same as success
.
My current span of attention for this being so thin was the reason why I merged this PR into a different function. I am actually happy I didn't merge or replace the normal equal function for this one like the contributor asked me to.
I need some proper time to review the bug and your proposal. But this is great, and I can't thank you enough for taking such a close look! 🙏