Port `simd_2x64` to SIMD128 and make use of it in `ghash_vperm`
This is a follow-up to https://github.com/randombit/botan/pull/5155, extending the library's support for Wasm SIMD128. This time the simd_2x64 and ghash_vperm modules were ported. I had some difficulty settling on good names for a few SIMD_2x64 functions (like reverse_all_bytes), trying to be as unambiguous as possible - feedback appreciated. The simd_2x64 tests are essentially a one-to-one adaptation of the existing simd_4x32 tests (with simd_2x64-specific additions).
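To make the naming question concrete, here is a minimal scalar sketch of the byte-order operations that are easy to confuse on a two-lane 64-bit vector. It assumes that `reverse_all_bytes` means "reverse all 16 bytes of the vector", which this thread does not spell out. `Vec2x64` and the helper names are hypothetical stand-ins for illustration, not Botan's `SIMD_2x64` API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-in for a 128-bit vector of two 64-bit lanes.
// This is NOT Botan's SIMD_2x64, only a model for discussing names.
using Vec2x64 = std::array<uint64_t, 2>;

// Reverse the 8 bytes of a single 64-bit word.
inline uint64_t bswap64(uint64_t x) {
   uint64_t r = 0;
   for(int i = 0; i < 8; ++i) {
      r = (r << 8) | (x & 0xFF);
      x >>= 8;
   }
   return r;
}

// Per-lane byte swap: bytes are reversed inside each lane, lanes stay put.
inline Vec2x64 bswap_each_lane(const Vec2x64& v) {
   return {bswap64(v[0]), bswap64(v[1])};
}

// Exchange the two lanes without touching the bytes inside them.
inline Vec2x64 swap_lanes(const Vec2x64& v) {
   return {v[1], v[0]};
}

// ASSUMED meaning of reverse_all_bytes: reverse all 16 bytes of the vector,
// which equals a per-lane byte swap followed by a lane swap.
inline Vec2x64 reverse_all_bytes_model(const Vec2x64& v) {
   return swap_lanes(bswap_each_lane(v));
}

// Small self-check showing that the three operations give different results.
inline void check_byte_order_model() {
   const Vec2x64 v = {0x0001020304050607ULL, 0x08090A0B0C0D0E0FULL};
   assert(bswap_each_lane(v) != reverse_all_bytes_model(v));
   assert(swap_lanes(v) != reverse_all_bytes_model(v));
   // Reversing all bytes twice restores the input.
   assert(reverse_all_bytes_model(reverse_all_bytes_model(v)) == v);
}
```

Under this reading, a name that mentions "all bytes" separates the full 16-byte reversal from a per-lane `bswap`, which is the ambiguity the naming tries to avoid.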
And some numbers (AMD Ryzen 7 4800H, V8 in Chrome 142, buffer size 4096):
| | Before #5155 (2d01ad0cedd1d5ae6c00ce372942df5c5a029e8a) | After #5155 (884e1bac17b587ad12475a32745229ba80a00435) | This PR (ee6e281a02103b12e11767f21587fc3cfd91f455) |
|---|---|---|---|
| AES-128/GCM(16) encrypt (MiB/s) | 52.912 | 100.558 | 145.069 |
| AES-128/GCM(16) decrypt (MiB/s) | 53.658 | 100.948 | 144.288 |
| AES-256/GCM(16) encrypt (MiB/s) | 44.044 | 92.388 | 125.686 |
| AES-256/GCM(16) decrypt (MiB/s) | 44.071 | 92.815 | 125.869 |
coverage: 90.63% (+0.007%) from 90.623% when pulling a4eb7e00efe03fd63946714d69da54887c2a2ce6 on polarnis:wasm-simd128-part-2 into f8eb340027de2d81cdb35982ca55760e6d315c45 on randombit:master.
Thanks! The i386 CI failure looks related. Also, out of curiosity, can you compare benchmarks for Argon2 (the existing user of SIMD_2x64)?
> The i386 CI failure looks related.
Indeed it does. I'm not sure why, but functions using intrinsics in both SIMD_4x32 and SIMD_2x64 require ISA tags now. Previously only some functions were tagged. I added missing tags in dbd26e0b3d109a0e75a22b14f602d14e5cf08dd3 and a4eb7e00efe03fd63946714d69da54887c2a2ce6, but please confirm that this is the expected behavior.
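For context, a minimal sketch of what an ISA tag on an intrinsics-using function can look like with GCC or Clang. This is one plausible reading of the i386 failure, not something confirmed in this thread: SSE2 is part of the x86-64 baseline but not of plain 32-bit x86, so there a function that calls SSE2 intrinsics has to carry a `target` attribute unless SSE2 is enabled globally. `SKETCH_ISA_SSE2`, `SKETCH_HAS_SSE2` and `add_2x64` are hypothetical names for this example, not Botan's macros or API.

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
   #include <emmintrin.h>
   // Hypothetical tag macro for this sketch; Botan defines its own ISA macros.
   #define SKETCH_ISA_SSE2 __attribute__((target("sse2")))
   #define SKETCH_HAS_SSE2 1
#else
   // Other compilers or architectures: no tag, scalar fallback below.
   #define SKETCH_ISA_SSE2
   #define SKETCH_HAS_SSE2 0
#endif

// Adds two pairs of 64-bit integers lane by lane (wrapping on overflow).
// The tag lets the compiler use SSE2 inside this function even when the
// translation unit is built without SSE2 enabled globally.
SKETCH_ISA_SSE2 inline void add_2x64(const uint64_t a[2], const uint64_t b[2], uint64_t out[2]) {
#if SKETCH_HAS_SSE2
   const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
   const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
   _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi64(va, vb));
#else
   out[0] = a[0] + b[0];
   out[1] = a[1] + b[1];
#endif
}
```

The tag only affects code generation; the caller is still responsible for making sure the CPU supports the instruction set before calling the function.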
> Also out of curiosity can you compare benchmarks for Argon2
Actually, the Argon2 module, as currently configured, doesn't pick up SIMD_2x64 for SIMD128. I have a patch ready for it (87c0cacb3cc700ed64472e5c13da198b3675f6da), which I plan to upstream after this lands.