besu icon indicating copy to clipboard operation
besu copied to clipboard

[DO NOT MERGE] feat: EIP-7939 Implement counting leading zero operation

Open Gabriel-Trintinalia opened this issue 5 months ago • 5 comments

Implements https://eips.ethereum.org/EIPS/eip-7939

This pull request adds support for the EIP-7939 CLZ opcode, implementing the operation itself, adding corresponding tests, and properly registering the new opcode in the mainnet EVM operations.

Introduces CountLeadingZerosOperation which computes the number of leading zeros for a given input. Provides parameterized tests to verify correct opcode functionality. Updates MainnetEVMs to register the new CLZ operation.

Gabriel-Trintinalia avatar Jun 09 '25 15:06 Gabriel-Trintinalia

Please add an end to end test for this opcode to make sure it fully works for the fork. Suggestion: look into EvmToolSpecTests

@lu-pinto done

Gabriel-Trintinalia avatar Jun 19 '25 01:06 Gabriel-Trintinalia

Please don't merge before performance benchmarking.

ahamlat avatar Jun 19 '25 10:06 ahamlat

@ahamlat @lu-pinto any updates on the benchmarking? Is there any reason why we couldn't merge it as it is and revisit it when the benchmark is done?

Gabriel-Trintinalia avatar Jun 25 '25 02:06 Gabriel-Trintinalia

So I ran some benchmarks for both cases (I actually had to change the suggestion implementation slightly to make it work):

      int numberOfLeadingZeros = value.numberOfLeadingZeros() + Bytes32.SIZE - value.size();

Benchmark results show 2-3x slower for the impl in this PR:

Benchmark                                                                                                     (bytesHex)  (useBase32Wrapping)  Mode  Cnt   Score   Error  Units
CountLeadingZerosOperationBenchmark.executeOperation                                                                0x23                 true  avgt   10  14.550 ± 0.017  ms/op
CountLeadingZerosOperationBenchmark.executeOperation                                                                0x23                false  avgt   10  48.058 ± 0.114  ms/op
CountLeadingZerosOperationBenchmark.executeOperation  0x2323232323232323232323232323232323232323232323232323232323232323                 true  avgt   10  15.090 ± 0.068  ms/op
CountLeadingZerosOperationBenchmark.executeOperation  0x2323232323232323232323232323232323232323232323232323232323232323                false  avgt   10  26.924 ± 0.104  ms/op
CountLeadingZerosOperationBenchmark.executeOperation                                    0x232323232323232323232323232323                 true  avgt   10  14.553 ± 0.019  ms/op
CountLeadingZerosOperationBenchmark.executeOperation                                    0x232323232323232323232323232323                false  avgt   10  47.515 ± 0.207  ms/op

branch where I got my benchmarks published: lu-pinto/benchmark-8770-clz-opcode

lu-pinto avatar Jun 25 '25 15:06 lu-pinto

We can even simplify the code and avoid the check. Instead of

numberOfLeadingZeros = value.numberOfLeadingZeros();
      if (value.size() != Bytes32.SIZE) {
        numberOfLeadingZeros += Bytes32.SIZE - value.size();
      }

we can have, as in the case where the check is false, value.size() is called twice, and we already that this call can be time consuming

numberOfLeadingZeros = value.numberOfLeadingZeros() +  Bytes32.SIZE - value.size();

ahamlat avatar Jun 26 '25 14:06 ahamlat

@lu-pinto @ahamlat what are the next steps here?

Gabriel-Trintinalia avatar Jul 03 '25 03:07 Gabriel-Trintinalia

The benchmarks show that when useBase32Wrapping is true, the results is much better in terms of performance. Which means that the fast implementation is this one

 numberOfLeadingZeros = value.numberOfLeadingZeros();
      if (value.size() != Bytes32.SIZE) {
        numberOfLeadingZeros += Bytes32.SIZE - value.size();
      }

Could you replace with this implementation ? remove DO NOT MERGE and make it ready for review ?

ahamlat avatar Jul 03 '25 13:07 ahamlat

@ahamlat @lu-pinto

The benchmarks show that when useBase32Wrapping is true, the results is much better in terms of performance. Which means that the fast implementation is this one

 numberOfLeadingZeros = value.numberOfLeadingZeros();
      if (value.size() != Bytes32.SIZE) {
        numberOfLeadingZeros += Bytes32.SIZE - value.size();
      }

Could you replace with this implementation ? remove DO NOT MERGE and make it ready for review ?

Unfortunately, It does not pass the reference tests. I added a failing test to the PR:

Arguments.of("0xff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff", 8));

The proposed change above returns 1.

Screenshot 2025-07-04 at 1 50 12 pm

Gabriel-Trintinalia avatar Jul 04 '25 04:07 Gabriel-Trintinalia

@ahamlat @lu-pinto

The benchmarks show that when useBase32Wrapping is true, the results is much better in terms of performance. Which means that the fast implementation is this one

 numberOfLeadingZeros = value.numberOfLeadingZeros();
      if (value.size() != Bytes32.SIZE) {
        numberOfLeadingZeros += Bytes32.SIZE - value.size();
      }

Could you replace with this implementation ? remove DO NOT MERGE and make it ready for review ?

Unfortunately, It does not pass the reference tests. I added a failing test to the PR:

Arguments.of("0xff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff00ff", 8));

The proposed change above returns 1.

Screenshot 2025-07-04 at 1 50 12 pm

Fixed on the new tip https://github.com/hyperledger/besu/commit/5607b1c070abe52366d1ec180fbe7a0e9fa34faf. Was partially counting bytes instead of bits :)

lu-pinto avatar Jul 04 '25 09:07 lu-pinto

We had a chat about it with @lu-pinto . There's one main issue and the unit test is made in a way to reproduce the reference test :

  1. The main issue is :
 numberOfLeadingZeros += Bytes32.SIZE - value.size();

This should be

numberOfLeadingZeros += 8 * (Bytes32.SIZE - value.size());
  1. The second one is not really an issue, but it helped to show the issue in the unit test Instead of
Bytes input = Bytes.fromHexString(value);

You can use

Bytes input = Bytes32.fromHexString(value);

But may be we should keep using Bytes instead just to be sure we're handling correctly this case.

ahamlat avatar Jul 04 '25 09:07 ahamlat