nimcrypto
Optimized SHA2 implementation.
Should address #36
A few benchmarks for hashing the beacon state. This is certainly not an exhaustive benchmark because it only tests 64-byte values, but it's still indicative for that particular input size:
11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
# best of 3 runs of each
# current nimcrypto
arnetheduck@praeceps:~/status/nimbus-eth2$ ncli/ncli --print-times hashTreeRoot deneb_state state.ssz
683ed74f8fb7f3322e2b746796d22c1a03023e0aa82299b536f66598bc928407
All time are ms
Average, StdDev, Min, Max, Samples, Test
1293.098, 0.000, 1293.098, 1293.098, 1, Load file
6333.745, 0.000, 6333.745, 6333.745, 1, Compute
# new-sha2 with reference implementation - slightly slower
Average, StdDev, Min, Max, Samples, Test
1328.006, 0.000, 1328.006, 1328.006, 1, Load file
6780.952, 0.000, 6780.952, 6780.952, 1, Compute
# new-sha2 with cpuid with `shaext` implementation, cpu detection for every new context
Average, StdDev, Min, Max, Samples, Test
1156.908, 0.000, 1156.908, 1156.908, 1, Load file
4662.638, 0.000, 4662.638, 4662.638, 1, Compute
# new-sha2 with hardcoded `shaext` implementation
Average, StdDev, Min, Max, Samples, Test
714.325, 0.000, 714.325, 714.325, 1, Load file
1512.727, 0.000, 1512.727, 1512.727, 1, Compute
# new-sha2 with hardcoded `avx2`
Average, StdDev, Min, Max, Samples, Test
1250.886, 0.000, 1250.886, 1250.886, 1, Load file
5794.621, 0.000, 5794.621, 5794.621, 1, Compute
# new-sha2 with hardcoded `avx` - oddly, this one is a bit faster than avx2
Average, StdDev, Min, Max, Samples, Test
1225.362, 0.000, 1225.362, 1225.362, 1, Load file
5662.962, 0.000, 5662.962, 5662.962, 1, Compute
# blst
Average, StdDev, Min, Max, Samples, Test
747.602, 0.000, 747.602, 747.602, 1, Load file
1581.679, 0.000, 1581.679, 1581.679, 1, Compute
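For easier comparison, the `Compute` timings above can be turned into speedups relative to current nimcrypto. This is just a small sketch over the numbers already posted (single samples, best of 3 runs), and the labels are shorthand for the headings above, not names used anywhere in the code:

```python
# Compute times in ms, copied from the runs above (one sample each).
compute_ms = {
    "current nimcrypto": 6333.745,
    "new-sha2 reference": 6780.952,
    "new-sha2 shaext (cpuid per context)": 4662.638,
    "new-sha2 shaext (hardcoded)": 1512.727,
    "new-sha2 avx2 (hardcoded)": 5794.621,
    "new-sha2 avx (hardcoded)": 5662.962,
    "blst": 1581.679,
}

baseline = compute_ms["current nimcrypto"]


def speedup(label: str) -> float:
    """Return how many times faster `label` is than current nimcrypto."""
    return baseline / compute_ms[label]


# Print fastest first, with the speedup relative to the baseline.
for label, ms in sorted(compute_ms.items(), key=lambda kv: kv[1]):
    print(f"{label:38s} {ms:10.3f} ms  {speedup(label):5.2f}x")
```

On these numbers, hardcoded `shaext` comes out at roughly 4.2x over current nimcrypto and blst at roughly 4.0x, while the reference implementation is slightly below 1x.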
AVX is faster than AVX2 here because of the data size: for 64-byte inputs, the AVX2 implementation uses the AVX implementation.
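For context on why the benchmark is dominated by 64-byte inputs: in SSZ merkleization every hash call combines two 32-byte nodes, so each SHA-256 input is exactly 64 bytes. A minimal sketch of that access pattern, using Python's `hashlib` as a stand-in for the hash backend (the helper names `hash_pair` and `merkle_root` are made up for illustration and are not part of nimcrypto or nimbus-eth2):

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two 32-byte nodes: a single 64-byte SHA-256 input."""
    assert len(left) == 32 and len(right) == 32
    return hashlib.sha256(left + right).digest()


def merkle_root(chunks: list) -> bytes:
    """Reduce a power-of-two number of 32-byte chunks to one root.

    Simplified: no zero-padding of the leaf count and no length mix-in.
    """
    n = len(chunks)
    assert n > 0 and n & (n - 1) == 0, "expects a power-of-two chunk count"
    layer = list(chunks)
    while len(layer) > 1:
        # Each step issues len(layer) / 2 independent 64-byte hashes.
        layer = [hash_pair(layer[i], layer[i + 1])
                 for i in range(0, len(layer), 2)]
    return layer[0]


# Four dummy 32-byte leaves.
leaves = [bytes([i]) * 32 for i in range(4)]
print(merkle_root(leaves).hex())
```

Since every call processes one fixed-size 64-byte message, per-call overhead (context setup, dispatch between implementations) matters far more here than bulk throughput on long inputs.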
Note that you can also benchmark against Constantine, whose benchmark includes OpenSSL:
git clone https://github.com/mratsim/constantine
cd constantine
CC=clang nimble bench_sha256
Regarding "includes OpenSSL": from what I remember, OpenSSL and blst perform more or less the same.