lodestar
feat: async shuffling refactor
**NOTE: Not ready for review, but want to trigger CI**
Motivation
Move calculation of the next shuffling to an async process to get it off the critical path during epoch transition. A full second of the epoch transition is spent calculating `epochCtx.nextShuffling`, and that work can be done asynchronously. This PR refactors a few pieces of the `EpochCache` to make this work. The follow-up is to create a worker that moves the calculation to a worker thread. By using a worker thread that is tuned down with NICE, the long calculation can be interleaved into thread idle time, which is ideal. To be continued...
Description
- Change how shufflings are built and cached. The original method was to build them on the epochCtx and then use processState to move them to the ShufflingCache. That flow is cleaned up a bit to build and store the shufflings directly in the ShufflingCache
- Moved full shufflings off the ShufflingCache and stored only the pieces we were using directly (length of activeValidators and the epoch numbers)
- Move ShufflingCache from `beacon-node` to `state-transition`
- Pass logger into `EpochCache` so it's available for debugging issues with shuffling builds
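The build-and-store flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ShufflingCacheSketch`, `Scheduler`, `build`, `getSync`, and the `EpochShuffling` shape are all hypothetical names. The scheduler is injected, so the caller decides where the deferred work runs, for example a timer callback or a hand-off to a worker.

```typescript
// Illustrative sketch only: every name below is hypothetical, not Lodestar's API.
type Epoch = number;
type RootHex = string;

interface EpochShuffling {
  epoch: Epoch;
  activeIndices: number[];
}

/** Defers a task, e.g. `(task) => setTimeout(task, 0)` or a hand-off to a worker. */
type Scheduler = (task: () => void) => void;

type CacheItem =
  | {state: "pending"; promise: Promise<EpochShuffling>}
  | {state: "resolved"; shuffling: EpochShuffling};

class ShufflingCacheSketch {
  private readonly items = new Map<string, CacheItem>();
  private readonly schedule: Scheduler;

  constructor(schedule: Scheduler) {
    this.schedule = schedule;
  }

  /** Queue a shuffling build and return immediately, without waiting for the result. */
  build(epoch: Epoch, decisionRoot: RootHex, compute: () => EpochShuffling): void {
    const key = `${epoch}:${decisionRoot}`;
    if (this.items.has(key)) return; // already built or in flight

    let resolve!: (shuffling: EpochShuffling) => void;
    let reject!: (error: unknown) => void;
    const promise = new Promise<EpochShuffling>((res, rej) => {
      resolve = res;
      reject = rej;
    });
    // Keep a failed build from surfacing as an unhandled rejection when nobody awaits it.
    promise.catch(() => undefined);
    this.items.set(key, {state: "pending", promise});

    this.schedule(() => {
      try {
        const shuffling = compute();
        this.items.set(key, {state: "resolved", shuffling});
        resolve(shuffling);
      } catch (e) {
        this.items.delete(key); // allow a later retry
        reject(e);
      }
    });
  }

  /** Synchronous lookup: returns null while the build is missing or still pending. */
  getSync(epoch: Epoch, decisionRoot: RootHex): EpochShuffling | null {
    const item = this.items.get(`${epoch}:${decisionRoot}`);
    return item !== undefined && item.state === "resolved" ? item.shuffling : null;
  }

  /** Asynchronous lookup: waits for a pending build, returns null if none was queued. */
  async get(epoch: Epoch, decisionRoot: RootHex): Promise<EpochShuffling | null> {
    const item = this.items.get(`${epoch}:${decisionRoot}`);
    if (item === undefined) return null;
    return item.state === "resolved" ? item.shuffling : item.promise;
  }
}
```

In this sketch the epoch transition would call `build` for the next epoch and carry on. Code that needs the shuffling later would try `getSync` first and fall back to awaiting `get` if the build has not finished.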
Performance Report
✔️ no performance regression detected
🚀🚀 Significant benchmark improvement detected
Benchmark suite | Current: b6139610b73c2e5e949fa6dbc5d1bc56ea6ccc53 | Previous: adc0534782436ee45614968c090915f0724121e1 | Ratio |
---|---|---|---|
phase0 afterProcessEpoch - 250000 vs - 7PWei | 9.5258 ms/op | 112.14 ms/op | 0.08 |
Full benchmark results
Benchmark suite | Current: b6139610b73c2e5e949fa6dbc5d1bc56ea6ccc53 | Previous: adc0534782436ee45614968c090915f0724121e1 | Ratio |
---|---|---|---|
getPubkeys - index2pubkey - req 1000 vs - 250000 vc | 808.69 us/op | 792.50 us/op | 1.02 |
getPubkeys - validatorsArr - req 1000 vs - 250000 vc | 166.51 us/op | 83.873 us/op | 1.99 |
BLS verify - blst-native | 1.4338 ms/op | 1.3337 ms/op | 1.08 |
BLS verifyMultipleSignatures 3 - blst-native | 3.4845 ms/op | 2.7201 ms/op | 1.28 |
BLS verifyMultipleSignatures 8 - blst-native | 6.7507 ms/op | 6.0052 ms/op | 1.12 |
BLS verifyMultipleSignatures 32 - blst-native | 28.130 ms/op | 21.969 ms/op | 1.28 |
BLS verifyMultipleSignatures 64 - blst-native | 62.987 ms/op | 43.166 ms/op | 1.46 |
BLS verifyMultipleSignatures 128 - blst-native | 108.76 ms/op | 86.357 ms/op | 1.26 |
BLS deserializing 10000 signatures | 998.97 ms/op | 924.28 ms/op | 1.08 |
BLS deserializing 100000 signatures | 10.418 s/op | 9.4577 s/op | 1.10 |
BLS verifyMultipleSignatures - same message - 3 - blst-native | 1.4400 ms/op | 1.3320 ms/op | 1.08 |
BLS verifyMultipleSignatures - same message - 8 - blst-native | 1.6733 ms/op | 1.6456 ms/op | 1.02 |
BLS verifyMultipleSignatures - same message - 32 - blst-native | 2.4824 ms/op | 2.9228 ms/op | 0.85 |
BLS verifyMultipleSignatures - same message - 64 - blst-native | 3.7327 ms/op | 4.4102 ms/op | 0.85 |
BLS verifyMultipleSignatures - same message - 128 - blst-native | 6.1825 ms/op | 7.9726 ms/op | 0.78 |
BLS aggregatePubkeys 32 - blst-native | 28.600 us/op | 25.918 us/op | 1.10 |
BLS aggregatePubkeys 128 - blst-native | 109.62 us/op | 100.81 us/op | 1.09 |
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 | 111.68 ms/op | 67.440 ms/op | 1.66 |
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 | 110.87 ms/op | 63.667 ms/op | 1.74 |
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 | 61.930 ms/op | 36.440 ms/op | 1.70 |
getSlashingsAndExits - default max | 235.53 us/op | 203.78 us/op | 1.16 |
getSlashingsAndExits - 2k | 643.29 us/op | 651.26 us/op | 0.99 |
proposeBlockBody type=full, size=empty | 5.8215 ms/op | 5.3843 ms/op | 1.08 |
isKnown best case - 1 super set check | 404.00 ns/op | 379.00 ns/op | 1.07 |
isKnown normal case - 2 super set checks | 330.00 ns/op | 532.00 ns/op | 0.62 |
isKnown worse case - 16 super set checks | 326.00 ns/op | 599.00 ns/op | 0.54 |
CheckpointStateCache - add get delete | 6.8540 us/op | 7.6150 us/op | 0.90 |
validate api signedAggregateAndProof - struct | 2.8864 ms/op | 3.0116 ms/op | 0.96 |
validate gossip signedAggregateAndProof - struct | 2.8918 ms/op | 2.8203 ms/op | 1.03 |
validate gossip attestation - vc 640000 | 1.3730 ms/op | 1.3874 ms/op | 0.99 |
batch validate gossip attestation - vc 640000 - chunk 32 | 159.38 us/op | 168.47 us/op | 0.95 |
batch validate gossip attestation - vc 640000 - chunk 64 | 140.72 us/op | 146.72 us/op | 0.96 |
batch validate gossip attestation - vc 640000 - chunk 128 | 143.82 us/op | 141.39 us/op | 1.02 |
batch validate gossip attestation - vc 640000 - chunk 256 | 148.50 us/op | 130.31 us/op | 1.14 |
pickEth1Vote - no votes | 1.4879 ms/op | 1.1663 ms/op | 1.28 |
pickEth1Vote - max votes | 14.622 ms/op | 9.8931 ms/op | 1.48 |
pickEth1Vote - Eth1Data hashTreeRoot value x2048 | 22.139 ms/op | 16.455 ms/op | 1.35 |
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 | 27.896 ms/op | 23.089 ms/op | 1.21 |
pickEth1Vote - Eth1Data fastSerialize value x2048 | 759.74 us/op | 620.45 us/op | 1.22 |
pickEth1Vote - Eth1Data fastSerialize tree x2048 | 5.8967 ms/op | 4.3950 ms/op | 1.34 |
bytes32 toHexString | 785.00 ns/op | 532.00 ns/op | 1.48 |
bytes32 Buffer.toString(hex) | 349.00 ns/op | 295.00 ns/op | 1.18 |
bytes32 Buffer.toString(hex) from Uint8Array | 568.00 ns/op | 428.00 ns/op | 1.33 |
bytes32 Buffer.toString(hex) + 0x | 330.00 ns/op | 292.00 ns/op | 1.13 |
Object access 1 prop | 0.21300 ns/op | 0.16800 ns/op | 1.27 |
Map access 1 prop | 0.15800 ns/op | 0.15400 ns/op | 1.03 |
Object get x1000 | 8.1140 ns/op | 7.3500 ns/op | 1.10 |
Map get x1000 | 0.86400 ns/op | 0.76700 ns/op | 1.13 |
Object set x1000 | 64.244 ns/op | 52.256 ns/op | 1.23 |
Map set x1000 | 45.434 ns/op | 41.214 ns/op | 1.10 |
Return object 10000 times | 0.25190 ns/op | 0.24490 ns/op | 1.03 |
Throw Error 10000 times | 4.0930 us/op | 3.9163 us/op | 1.05 |
fastMsgIdFn sha256 / 200 bytes | 3.4930 us/op | 3.3970 us/op | 1.03 |
fastMsgIdFn h32 xxhash / 200 bytes | 379.00 ns/op | 317.00 ns/op | 1.20 |
fastMsgIdFn h64 xxhash / 200 bytes | 379.00 ns/op | 348.00 ns/op | 1.09 |
fastMsgIdFn sha256 / 1000 bytes | 11.863 us/op | 11.370 us/op | 1.04 |
fastMsgIdFn h32 xxhash / 1000 bytes | 497.00 ns/op | 417.00 ns/op | 1.19 |
fastMsgIdFn h64 xxhash / 1000 bytes | 483.00 ns/op | 458.00 ns/op | 1.05 |
fastMsgIdFn sha256 / 10000 bytes | 107.76 us/op | 104.97 us/op | 1.03 |
fastMsgIdFn h32 xxhash / 10000 bytes | 2.1350 us/op | 1.9730 us/op | 1.08 |
fastMsgIdFn h64 xxhash / 10000 bytes | 1.4660 us/op | 1.3830 us/op | 1.06 |
send data - 1000 256B messages | 22.046 ms/op | 19.901 ms/op | 1.11 |
send data - 1000 512B messages | 33.079 ms/op | 28.055 ms/op | 1.18 |
send data - 1000 1024B messages | 43.995 ms/op | 41.059 ms/op | 1.07 |
send data - 1000 1200B messages | 43.890 ms/op | 37.236 ms/op | 1.18 |
send data - 1000 2048B messages | 55.579 ms/op | 48.863 ms/op | 1.14 |
send data - 1000 4096B messages | 44.914 ms/op | 44.281 ms/op | 1.01 |
send data - 1000 16384B messages | 133.09 ms/op | 117.00 ms/op | 1.14 |
send data - 1000 65536B messages | 531.63 ms/op | 471.40 ms/op | 1.13 |
enrSubnets - fastDeserialize 64 bits | 1.5610 us/op | 1.3310 us/op | 1.17 |
enrSubnets - ssz BitVector 64 bits | 665.00 ns/op | 445.00 ns/op | 1.49 |
enrSubnets - fastDeserialize 4 bits | 263.00 ns/op | 196.00 ns/op | 1.34 |
enrSubnets - ssz BitVector 4 bits | 656.00 ns/op | 466.00 ns/op | 1.41 |
prioritizePeers score -10:0 att 32-0.1 sync 2-0 | 123.50 us/op | 104.86 us/op | 1.18 |
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 | 157.90 us/op | 132.87 us/op | 1.19 |
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 | 218.62 us/op | 175.69 us/op | 1.24 |
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 | 354.18 us/op | 297.62 us/op | 1.19 |
prioritizePeers score 0:0 att 64-1 sync 4-1 | 401.29 us/op | 368.83 us/op | 1.09 |
array of 16000 items push then shift | 1.8536 us/op | 1.6256 us/op | 1.14 |
LinkedList of 16000 items push then shift | 10.465 ns/op | 9.0490 ns/op | 1.16 |
array of 16000 items push then pop | 118.69 ns/op | 59.223 ns/op | 2.00 |
LinkedList of 16000 items push then pop | 9.5370 ns/op | 8.8570 ns/op | 1.08 |
array of 24000 items push then shift | 2.6889 us/op | 2.4041 us/op | 1.12 |
LinkedList of 24000 items push then shift | 10.393 ns/op | 8.8960 ns/op | 1.17 |
array of 24000 items push then pop | 165.55 ns/op | 114.00 ns/op | 1.45 |
LinkedList of 24000 items push then pop | 9.9040 ns/op | 8.7010 ns/op | 1.14 |
intersect bitArray bitLen 8 | 6.4680 ns/op | 5.7850 ns/op | 1.12 |
intersect array and set length 8 | 83.520 ns/op | 64.743 ns/op | 1.29 |
intersect bitArray bitLen 128 | 38.499 ns/op | 35.272 ns/op | 1.09 |
intersect array and set length 128 | 1.1950 us/op | 948.41 ns/op | 1.26 |
bitArray.getTrueBitIndexes() bitLen 128 | 1.8030 us/op | 1.5620 us/op | 1.15 |
bitArray.getTrueBitIndexes() bitLen 248 | 3.2730 us/op | 2.8920 us/op | 1.13 |
bitArray.getTrueBitIndexes() bitLen 512 | 6.5630 us/op | 5.2420 us/op | 1.25 |
Buffer.concat 32 items | 1.0480 us/op | 1.0970 us/op | 0.96 |
Uint8Array.set 32 items | 2.4320 us/op | 2.6580 us/op | 0.91 |
Set add up to 64 items then delete first | 5.4080 us/op | 4.3039 us/op | 1.26 |
OrderedSet add up to 64 items then delete first | 7.1814 us/op | 5.3629 us/op | 1.34 |
Set add up to 64 items then delete last | 5.7434 us/op | 4.5334 us/op | 1.27 |
OrderedSet add up to 64 items then delete last | 7.5478 us/op | 5.6186 us/op | 1.34 |
Set add up to 64 items then delete middle | 5.6896 us/op | 4.4868 us/op | 1.27 |
OrderedSet add up to 64 items then delete middle | 9.1165 us/op | 6.9857 us/op | 1.31 |
Set add up to 128 items then delete first | 11.788 us/op | 9.4838 us/op | 1.24 |
OrderedSet add up to 128 items then delete first | 16.257 us/op | 12.161 us/op | 1.34 |
Set add up to 128 items then delete last | 11.709 us/op | 9.1715 us/op | 1.28 |
OrderedSet add up to 128 items then delete last | 15.139 us/op | 11.275 us/op | 1.34 |
Set add up to 128 items then delete middle | 11.587 us/op | 9.0236 us/op | 1.28 |
OrderedSet add up to 128 items then delete middle | 21.629 us/op | 16.624 us/op | 1.30 |
Set add up to 256 items then delete first | 23.837 us/op | 18.709 us/op | 1.27 |
OrderedSet add up to 256 items then delete first | 32.559 us/op | 24.900 us/op | 1.31 |
Set add up to 256 items then delete last | 22.664 us/op | 17.899 us/op | 1.27 |
OrderedSet add up to 256 items then delete last | 31.402 us/op | 22.785 us/op | 1.38 |
Set add up to 256 items then delete middle | 22.998 us/op | 18.059 us/op | 1.27 |
OrderedSet add up to 256 items then delete middle | 55.379 us/op | 44.531 us/op | 1.24 |
transfer serialized Status (84 B) | 2.2230 us/op | 1.6270 us/op | 1.37 |
copy serialized Status (84 B) | 1.5360 us/op | 1.1970 us/op | 1.28 |
transfer serialized SignedVoluntaryExit (112 B) | 2.3030 us/op | 1.8140 us/op | 1.27 |
copy serialized SignedVoluntaryExit (112 B) | 1.5360 us/op | 1.2770 us/op | 1.20 |
transfer serialized ProposerSlashing (416 B) | 2.4800 us/op | 2.8560 us/op | 0.87 |
copy serialized ProposerSlashing (416 B) | 2.4210 us/op | 2.6770 us/op | 0.90 |
transfer serialized Attestation (485 B) | 3.1420 us/op | 2.6360 us/op | 1.19 |
copy serialized Attestation (485 B) | 2.5720 us/op | 2.4990 us/op | 1.03 |
transfer serialized AttesterSlashing (33232 B) | 3.7200 us/op | 2.4860 us/op | 1.50 |
copy serialized AttesterSlashing (33232 B) | 10.398 us/op | 6.3920 us/op | 1.63 |
transfer serialized Small SignedBeaconBlock (128000 B) | 4.8740 us/op | 2.8310 us/op | 1.72 |
copy serialized Small SignedBeaconBlock (128000 B) | 28.665 us/op | 15.038 us/op | 1.91 |
transfer serialized Avg SignedBeaconBlock (200000 B) | 5.1470 us/op | 3.3940 us/op | 1.52 |
copy serialized Avg SignedBeaconBlock (200000 B) | 43.891 us/op | 20.602 us/op | 2.13 |
transfer serialized BlobsSidecar (524380 B) | 5.2130 us/op | 3.4700 us/op | 1.50 |
copy serialized BlobsSidecar (524380 B) | 114.01 us/op | 120.55 us/op | 0.95 |
transfer serialized Big SignedBeaconBlock (1000000 B) | 5.4320 us/op | 3.0400 us/op | 1.79 |
copy serialized Big SignedBeaconBlock (1000000 B) | 236.26 us/op | 380.64 us/op | 0.62 |
pass gossip attestations to forkchoice per slot | 7.1293 ms/op | 3.7688 ms/op | 1.89 |
forkChoice updateHead vc 100000 bc 64 eq 0 | 763.00 us/op | 672.05 us/op | 1.14 |
forkChoice updateHead vc 600000 bc 64 eq 0 | 6.2779 ms/op | 4.0664 ms/op | 1.54 |
forkChoice updateHead vc 1000000 bc 64 eq 0 | 8.6348 ms/op | 6.8946 ms/op | 1.25 |
forkChoice updateHead vc 600000 bc 320 eq 0 | 4.8957 ms/op | 4.1540 ms/op | 1.18 |
forkChoice updateHead vc 600000 bc 1200 eq 0 | 5.1534 ms/op | 4.2837 ms/op | 1.20 |
forkChoice updateHead vc 600000 bc 7200 eq 0 | 6.1694 ms/op | 5.3699 ms/op | 1.15 |
forkChoice updateHead vc 600000 bc 64 eq 1000 | 12.005 ms/op | 10.915 ms/op | 1.10 |
forkChoice updateHead vc 600000 bc 64 eq 10000 | 13.295 ms/op | 11.636 ms/op | 1.14 |
forkChoice updateHead vc 600000 bc 64 eq 300000 | 22.280 ms/op | 15.467 ms/op | 1.44 |
computeDeltas 500000 validators 300 proto nodes | 6.8713 ms/op | 6.6073 ms/op | 1.04 |
computeDeltas 500000 validators 1200 proto nodes | 6.5748 ms/op | 6.3694 ms/op | 1.03 |
computeDeltas 500000 validators 7200 proto nodes | 6.5105 ms/op | 6.4834 ms/op | 1.00 |
computeDeltas 750000 validators 300 proto nodes | 10.230 ms/op | 9.7722 ms/op | 1.05 |
computeDeltas 750000 validators 1200 proto nodes | 9.7887 ms/op | 9.7933 ms/op | 1.00 |
computeDeltas 750000 validators 7200 proto nodes | 9.9933 ms/op | 9.7276 ms/op | 1.03 |
computeDeltas 1400000 validators 300 proto nodes | 19.397 ms/op | 17.968 ms/op | 1.08 |
computeDeltas 1400000 validators 1200 proto nodes | 19.377 ms/op | 17.844 ms/op | 1.09 |
computeDeltas 1400000 validators 7200 proto nodes | 19.628 ms/op | 17.858 ms/op | 1.10 |
computeDeltas 2100000 validators 300 proto nodes | 29.726 ms/op | 26.869 ms/op | 1.11 |
computeDeltas 2100000 validators 1200 proto nodes | 28.949 ms/op | 27.255 ms/op | 1.06 |
computeDeltas 2100000 validators 7200 proto nodes | 29.698 ms/op | 26.350 ms/op | 1.13 |
altair processAttestation - 250000 vs - 7PWei normalcase | 2.7614 ms/op | 2.9283 ms/op | 0.94 |
altair processAttestation - 250000 vs - 7PWei worstcase | 3.7242 ms/op | 4.0120 ms/op | 0.93 |
altair processAttestation - setStatus - 1/6 committees join | 160.96 us/op | 213.67 us/op | 0.75 |
altair processAttestation - setStatus - 1/3 committees join | 304.91 us/op | 429.12 us/op | 0.71 |
altair processAttestation - setStatus - 1/2 committees join | 411.92 us/op | 581.93 us/op | 0.71 |
altair processAttestation - setStatus - 2/3 committees join | 512.90 us/op | 652.26 us/op | 0.79 |
altair processAttestation - setStatus - 4/5 committees join | 723.44 us/op | 995.01 us/op | 0.73 |
altair processAttestation - setStatus - 100% committees join | 859.50 us/op | 1.1058 ms/op | 0.78 |
altair processBlock - 250000 vs - 7PWei normalcase | 10.692 ms/op | 7.9389 ms/op | 1.35 |
altair processBlock - 250000 vs - 7PWei normalcase hashState | 36.413 ms/op | 34.314 ms/op | 1.06 |
altair processBlock - 250000 vs - 7PWei worstcase | 41.037 ms/op | 38.807 ms/op | 1.06 |
altair processBlock - 250000 vs - 7PWei worstcase hashState | 119.33 ms/op | 90.515 ms/op | 1.32 |
phase0 processBlock - 250000 vs - 7PWei normalcase | 3.2677 ms/op | 2.8609 ms/op | 1.14 |
phase0 processBlock - 250000 vs - 7PWei worstcase | 34.232 ms/op | 28.893 ms/op | 1.18 |
altair processEth1Data - 250000 vs - 7PWei normalcase | 709.33 us/op | 476.37 us/op | 1.49 |
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 | 16.786 us/op | 7.4280 us/op | 2.26 |
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 | 66.303 us/op | 32.848 us/op | 2.02 |
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 | 28.461 us/op | 10.765 us/op | 2.64 |
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 | 20.178 us/op | 10.203 us/op | 1.98 |
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 | 213.33 us/op | 119.64 us/op | 1.78 |
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 | 1.6168 ms/op | 1.0326 ms/op | 1.57 |
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 | 2.3482 ms/op | 1.4912 ms/op | 1.57 |
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 | 1.9789 ms/op | 1.5262 ms/op | 1.30 |
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 | 4.4760 ms/op | 3.3999 ms/op | 1.32 |
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 | 3.1252 ms/op | 2.3292 ms/op | 1.34 |
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 | 7.6084 ms/op | 5.2056 ms/op | 1.46 |
Tree 40 250000 create | 406.37 ms/op | 343.02 ms/op | 1.18 |
Tree 40 250000 get(125000) | 219.00 ns/op | 193.49 ns/op | 1.13 |
Tree 40 250000 set(125000) | 1.0892 us/op | 1.0295 us/op | 1.06 |
Tree 40 250000 toArray() | 23.353 ms/op | 20.186 ms/op | 1.16 |
Tree 40 250000 iterate all - toArray() + loop | 25.100 ms/op | 17.659 ms/op | 1.42 |
Tree 40 250000 iterate all - get(i) | 77.552 ms/op | 64.553 ms/op | 1.20 |
MutableVector 250000 create | 17.905 ms/op | 12.070 ms/op | 1.48 |
MutableVector 250000 get(125000) | 6.9500 ns/op | 6.3850 ns/op | 1.09 |
MutableVector 250000 set(125000) | 320.65 ns/op | 250.73 ns/op | 1.28 |
MutableVector 250000 toArray() | 4.0278 ms/op | 2.7717 ms/op | 1.45 |
MutableVector 250000 iterate all - toArray() + loop | 4.1412 ms/op | 2.8871 ms/op | 1.43 |
MutableVector 250000 iterate all - get(i) | 1.6019 ms/op | 1.5245 ms/op | 1.05 |
Array 250000 create | 3.6988 ms/op | 2.5386 ms/op | 1.46 |
Array 250000 clone - spread | 1.4953 ms/op | 1.1837 ms/op | 1.26 |
Array 250000 get(125000) | 1.2350 ns/op | 1.0230 ns/op | 1.21 |
Array 250000 set(125000) | 5.2550 ns/op | 4.0410 ns/op | 1.30 |
Array 250000 iterate all - loop | 173.50 us/op | 165.44 us/op | 1.05 |
effectiveBalanceIncrements clone Uint8Array 300000 | 42.482 us/op | 28.045 us/op | 1.51 |
effectiveBalanceIncrements clone MutableVector 300000 | 455.00 ns/op | 360.00 ns/op | 1.26 |
effectiveBalanceIncrements rw all Uint8Array 300000 | 208.67 us/op | 199.10 us/op | 1.05 |
effectiveBalanceIncrements rw all MutableVector 300000 | 101.14 ms/op | 81.252 ms/op | 1.24 |
phase0 afterProcessEpoch - 250000 vs - 7PWei | 9.5258 ms/op | 112.14 ms/op | 0.08 |
phase0 beforeProcessEpoch - 250000 vs - 7PWei | 42.113 ms/op | 50.768 ms/op | 0.83 |
altair processEpoch - mainnet_e81889 | 394.36 ms/op | 484.06 ms/op | 0.81 |
mainnet_e81889 - altair beforeProcessEpoch | 82.212 ms/op | 81.149 ms/op | 1.01 |
mainnet_e81889 - altair processJustificationAndFinalization | 23.770 us/op | 15.167 us/op | 1.57 |
mainnet_e81889 - altair processInactivityUpdates | 5.8206 ms/op | 5.6592 ms/op | 1.03 |
mainnet_e81889 - altair processRewardsAndPenalties | 63.721 ms/op | 39.039 ms/op | 1.63 |
mainnet_e81889 - altair processRegistryUpdates | 2.6900 us/op | 2.3670 us/op | 1.14 |
mainnet_e81889 - altair processSlashings | 505.00 ns/op | 490.00 ns/op | 1.03 |
mainnet_e81889 - altair processEth1DataReset | 603.00 ns/op | 467.00 ns/op | 1.29 |
mainnet_e81889 - altair processEffectiveBalanceUpdates | 2.0540 ms/op | 1.4377 ms/op | 1.43 |
mainnet_e81889 - altair processSlashingsReset | 7.1900 us/op | 3.3510 us/op | 2.15 |
mainnet_e81889 - altair processRandaoMixesReset | 7.3850 us/op | 4.6060 us/op | 1.60 |
mainnet_e81889 - altair processHistoricalRootsUpdate | 1.7070 us/op | 675.00 ns/op | 2.53 |
mainnet_e81889 - altair processParticipationFlagUpdates | 2.7490 us/op | 3.2150 us/op | 0.86 |
mainnet_e81889 - altair processSyncCommitteeUpdates | 594.00 ns/op | 668.00 ns/op | 0.89 |
mainnet_e81889 - altair afterProcessEpoch | 9.1597 ms/op | 115.69 ms/op | 0.08 |
capella processEpoch - mainnet_e217614 | 1.8865 s/op | 1.7714 s/op | 1.06 |
mainnet_e217614 - capella beforeProcessEpoch | 480.08 ms/op | 452.12 ms/op | 1.06 |
mainnet_e217614 - capella processJustificationAndFinalization | 19.675 us/op | 17.060 us/op | 1.15 |
mainnet_e217614 - capella processInactivityUpdates | 22.381 ms/op | 22.958 ms/op | 0.97 |
mainnet_e217614 - capella processRewardsAndPenalties | 467.38 ms/op | 476.85 ms/op | 0.98 |
mainnet_e217614 - capella processRegistryUpdates | 42.298 us/op | 22.317 us/op | 1.90 |
mainnet_e217614 - capella processSlashings | 1.0850 us/op | 451.00 ns/op | 2.41 |
mainnet_e217614 - capella processEth1DataReset | 775.00 ns/op | 537.00 ns/op | 1.44 |
mainnet_e217614 - capella processEffectiveBalanceUpdates | 4.7199 ms/op | 5.4559 ms/op | 0.87 |
mainnet_e217614 - capella processSlashingsReset | 6.5550 us/op | 3.2850 us/op | 2.00 |
mainnet_e217614 - capella processRandaoMixesReset | 7.9070 us/op | 5.2710 us/op | 1.50 |
mainnet_e217614 - capella processHistoricalRootsUpdate | 1.2360 us/op | 602.00 ns/op | 2.05 |
mainnet_e217614 - capella processParticipationFlagUpdates | 2.2410 us/op | 4.5950 us/op | 0.49 |
mainnet_e217614 - capella afterProcessEpoch | 8.5972 ms/op | 307.65 ms/op | 0.03 |
phase0 processEpoch - mainnet_e58758 | 444.59 ms/op | 516.91 ms/op | 0.86 |
mainnet_e58758 - phase0 beforeProcessEpoch | 137.45 ms/op | 144.26 ms/op | 0.95 |
mainnet_e58758 - phase0 processJustificationAndFinalization | 25.371 us/op | 16.297 us/op | 1.56 |
mainnet_e58758 - phase0 processRewardsAndPenalties | 64.070 ms/op | 53.819 ms/op | 1.19 |
mainnet_e58758 - phase0 processRegistryUpdates | 15.836 us/op | 9.7940 us/op | 1.62 |
mainnet_e58758 - phase0 processSlashings | 654.00 ns/op | 635.00 ns/op | 1.03 |
mainnet_e58758 - phase0 processEth1DataReset | 691.00 ns/op | 816.00 ns/op | 0.85 |
mainnet_e58758 - phase0 processEffectiveBalanceUpdates | 2.1403 ms/op | 1.1929 ms/op | 1.79 |
mainnet_e58758 - phase0 processSlashingsReset | 4.1980 us/op | 2.4360 us/op | 1.72 |
mainnet_e58758 - phase0 processRandaoMixesReset | 6.2250 us/op | 4.2400 us/op | 1.47 |
mainnet_e58758 - phase0 processHistoricalRootsUpdate | 659.00 ns/op | 606.00 ns/op | 1.09 |
mainnet_e58758 - phase0 processParticipationRecordUpdates | 6.0560 us/op | 4.9480 us/op | 1.22 |
mainnet_e58758 - phase0 afterProcessEpoch | 8.3624 ms/op | 101.74 ms/op | 0.08 |
phase0 processEffectiveBalanceUpdates - 250000 normalcase | 2.5963 ms/op | 1.3767 ms/op | 1.89 |
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 1.4664 ms/op | 1.5362 ms/op | 0.95 |
altair processInactivityUpdates - 250000 normalcase | 34.293 ms/op | 32.523 ms/op | 1.05 |
altair processInactivityUpdates - 250000 worstcase | 34.463 ms/op | 24.252 ms/op | 1.42 |
phase0 processRegistryUpdates - 250000 normalcase | 12.870 us/op | 14.537 us/op | 0.89 |
phase0 processRegistryUpdates - 250000 badcase_full_deposits | 630.07 us/op | 462.05 us/op | 1.36 |
phase0 processRegistryUpdates - 250000 worstcase 0.5 | 144.28 ms/op | 142.86 ms/op | 1.01 |
altair processRewardsAndPenalties - 250000 normalcase | 67.482 ms/op | 65.509 ms/op | 1.03 |
altair processRewardsAndPenalties - 250000 worstcase | 67.273 ms/op | 61.436 ms/op | 1.10 |
phase0 getAttestationDeltas - 250000 normalcase | 9.0912 ms/op | 10.792 ms/op | 0.84 |
phase0 getAttestationDeltas - 250000 worstcase | 8.8824 ms/op | 9.9359 ms/op | 0.89 |
phase0 processSlashings - 250000 worstcase | 131.90 us/op | 97.945 us/op | 1.35 |
altair processSyncCommitteeUpdates - 250000 | 149.24 ms/op | 161.07 ms/op | 0.93 |
BeaconState.hashTreeRoot - No change | 247.00 ns/op | 371.00 ns/op | 0.67 |
BeaconState.hashTreeRoot - 1 full validator | 147.89 us/op | 122.53 us/op | 1.21 |
BeaconState.hashTreeRoot - 32 full validator | 1.6378 ms/op | 1.1514 ms/op | 1.42 |
BeaconState.hashTreeRoot - 512 full validator | 16.635 ms/op | 14.558 ms/op | 1.14 |
BeaconState.hashTreeRoot - 1 validator.effectiveBalance | 153.47 us/op | 183.32 us/op | 0.84 |
BeaconState.hashTreeRoot - 32 validator.effectiveBalance | 2.2762 ms/op | 2.1488 ms/op | 1.06 |
BeaconState.hashTreeRoot - 512 validator.effectiveBalance | 35.254 ms/op | 33.190 ms/op | 1.06 |
BeaconState.hashTreeRoot - 1 balances | 146.39 us/op | 135.09 us/op | 1.08 |
BeaconState.hashTreeRoot - 32 balances | 1.2142 ms/op | 1.2129 ms/op | 1.00 |
BeaconState.hashTreeRoot - 512 balances | 13.828 ms/op | 13.793 ms/op | 1.00 |
BeaconState.hashTreeRoot - 250000 balances | 227.79 ms/op | 225.18 ms/op | 1.01 |
aggregationBits - 2048 els - zipIndexesInBitList | 25.700 us/op | 70.920 us/op | 0.36 |
byteArrayEquals 32 | 74.486 ns/op | 75.090 ns/op | 0.99 |
Buffer.compare 32 | 55.504 ns/op | 55.817 ns/op | 0.99 |
byteArrayEquals 1024 | 2.0428 us/op | 2.0457 us/op | 1.00 |
Buffer.compare 1024 | 72.694 ns/op | 70.502 ns/op | 1.03 |
byteArrayEquals 16384 | 32.563 us/op | 32.557 us/op | 1.00 |
Buffer.compare 16384 | 252.88 ns/op | 270.28 ns/op | 0.94 |
byteArrayEquals 123687377 | 242.78 ms/op | 252.72 ms/op | 0.96 |
Buffer.compare 123687377 | 6.3092 ms/op | 8.5285 ms/op | 0.74 |
byteArrayEquals 32 - diff last byte | 72.437 ns/op | 74.156 ns/op | 0.98 |
Buffer.compare 32 - diff last byte | 56.371 ns/op | 57.229 ns/op | 0.99 |
byteArrayEquals 1024 - diff last byte | 2.0634 us/op | 2.6518 us/op | 0.78 |
Buffer.compare 1024 - diff last byte | 73.310 ns/op | 81.031 ns/op | 0.90 |
byteArrayEquals 16384 - diff last byte | 33.345 us/op | 33.825 us/op | 0.99 |
Buffer.compare 16384 - diff last byte | 281.38 ns/op | 254.75 ns/op | 1.10 |
byteArrayEquals 123687377 - diff last byte | 249.42 ms/op | 257.30 ms/op | 0.97 |
Buffer.compare 123687377 - diff last byte | 6.8335 ms/op | 6.9196 ms/op | 0.99 |
byteArrayEquals 32 - random bytes | 5.5360 ns/op | 5.3910 ns/op | 1.03 |
Buffer.compare 32 - random bytes | 62.638 ns/op | 62.462 ns/op | 1.00 |
byteArrayEquals 1024 - random bytes | 5.2600 ns/op | 5.2140 ns/op | 1.01 |
Buffer.compare 1024 - random bytes | 61.120 ns/op | 60.655 ns/op | 1.01 |
byteArrayEquals 16384 - random bytes | 5.2440 ns/op | 5.1710 ns/op | 1.01 |
Buffer.compare 16384 - random bytes | 62.760 ns/op | 60.324 ns/op | 1.04 |
byteArrayEquals 123687377 - random bytes | 8.6000 ns/op | 8.4300 ns/op | 1.02 |
Buffer.compare 123687377 - random bytes | 67.050 ns/op | 63.500 ns/op | 1.06 |
regular array get 100000 times | 45.530 us/op | 43.936 us/op | 1.04 |
wrappedArray get 100000 times | 45.099 us/op | 44.778 us/op | 1.01 |
arrayWithProxy get 100000 times | 15.670 ms/op | 14.936 ms/op | 1.05 |
ssz.Root.equals | 55.199 ns/op | 54.392 ns/op | 1.01 |
byteArrayEquals | 54.358 ns/op | 54.348 ns/op | 1.00 |
Buffer.compare | 11.047 ns/op | 11.401 ns/op | 0.97 |
shuffle list - 16384 els | 8.6635 ms/op | 8.6133 ms/op | 1.01 |
shuffle list - 250000 els | 130.68 ms/op | 124.99 ms/op | 1.05 |
processSlot - 1 slots | 16.520 us/op | 17.420 us/op | 0.95 |
processSlot - 32 slots | 4.1946 ms/op | 3.3153 ms/op | 1.27 |
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei | 64.553 ms/op | 58.732 ms/op | 1.10 |
getCommitteeAssignments - req 1 vs - 250000 vc | 2.7080 ms/op | 2.6537 ms/op | 1.02 |
getCommitteeAssignments - req 100 vs - 250000 vc | 3.9087 ms/op | 3.8348 ms/op | 1.02 |
getCommitteeAssignments - req 1000 vs - 250000 vc | 4.2717 ms/op | 4.1876 ms/op | 1.02 |
findModifiedValidators - 10000 modified validators | 520.55 ms/op | 556.14 ms/op | 0.94 |
findModifiedValidators - 1000 modified validators | 426.48 ms/op | 385.66 ms/op | 1.11 |
findModifiedValidators - 100 modified validators | 395.75 ms/op | 415.96 ms/op | 0.95 |
findModifiedValidators - 10 modified validators | 395.42 ms/op | 394.53 ms/op | 1.00 |
findModifiedValidators - 1 modified validators | 415.29 ms/op | 399.70 ms/op | 1.04 |
findModifiedValidators - no difference | 412.30 ms/op | 410.56 ms/op | 1.00 |
compare ViewDUs | 4.9412 s/op | 4.2832 s/op | 1.15 |
compare each validator Uint8Array | 1.7897 s/op | 1.5276 s/op | 1.17 |
compare ViewDU to Uint8Array | 1.4221 s/op | 1.0780 s/op | 1.32 |
migrate state 1000000 validators, 24 modified, 0 new | 883.00 ms/op | 787.74 ms/op | 1.12 |
migrate state 1000000 validators, 1700 modified, 1000 new | 1.1843 s/op | 1.0623 s/op | 1.11 |
migrate state 1000000 validators, 3400 modified, 2000 new | 1.5765 s/op | 1.2952 s/op | 1.22 |
migrate state 1500000 validators, 24 modified, 0 new | 1.0120 s/op | 776.07 ms/op | 1.30 |
migrate state 1500000 validators, 1700 modified, 1000 new | 1.2797 s/op | 1.0834 s/op | 1.18 |
migrate state 1500000 validators, 3400 modified, 2000 new | 1.6850 s/op | 1.3105 s/op | 1.29 |
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei | 5.5100 ns/op | 4.2100 ns/op | 1.31 |
state getBlockRootAtSlot - 250000 vs - 7PWei | 791.59 ns/op | 615.59 ns/op | 1.29 |
computeProposers - vc 250000 | 11.234 ms/op | 8.6634 ms/op | 1.30 |
computeEpochShuffling - vc 250000 | 141.61 ms/op | 122.81 ms/op | 1.15 |
getNextSyncCommittee - vc 250000 | 178.07 ms/op | 159.66 ms/op | 1.12 |
computeSigningRoot for AttestationData | 31.590 us/op | 28.031 us/op | 1.13 |
hash AttestationData serialized data then Buffer.toString(base64) | 2.5710 us/op | 2.2450 us/op | 1.15 |
toHexString serialized data | 1.6991 us/op | 1.0674 us/op | 1.59 |
Buffer.toString(base64) | 289.51 ns/op | 212.93 ns/op | 1.36 |
by benchmarkbot/action
This PR is not aligned with the high-level design stated in #6386, where it's recommended to move shuffling from `state-transition` to `beacon-node`. Some benefits of that approach:
- `beacon-node` is the consumer of shuffling, so it should just use the current ShufflingCache there and enhance it if needed. We want to keep state-transition simple, with no `async/await`
- It's also more convenient to implement offloading of the next shuffling computation in `beacon-node`, since there are already a couple of worker implementations there