liboqs
Inconsistencies in optimizations on ARM platforms
I benchmarked liboqs on an ARM Cortex-A53 a couple of weeks ago and found several oddities in how optimizations are configured on ARM that I think are suboptimal. I'll note them briefly here, though I might not be recalling all of them quite right.
- In src/common/aes/aes.c, the C_OR_NI_OR_ARM macro doesn't seem to properly handle distributable builds on ARM: in that case it always falls back to the C code.
- Are the OQS_USE_ARM_*_INSTRUCTIONS CMake flags actually being set automatically? In the various builds I think there were times when they weren't set (I realize that's not very actionable feedback).
- In tests/system_info.c, ideally we would use the same macros from src/common/aes/aes.c and src/common/sha2/sha2.c (possibly by including them rather than copying and pasting) to indicate whether C or NI or ARM is being used, rather than having slightly different versions here, to be sure that the diagnostic information in tests/system_info.c actually matches the behaviour in the code.
- In the profiling scripts for ARM (https://github.com/open-quantum-safe/profiling/blob/main/perf/Dockerfile-arm64-start), why is there a toolchain file to enable cross compiling? I think our CMake configuration is set up so that if you're cross compiling a build with OQS_OPT_TARGET=auto, it immediately switches back to OQS_OPT_TARGET=generic, since CPU optimizations can't be properly detected when cross compiling.
- In general I think we need to go through and do some sanity checking of which optimizations are actually being enabled in our different build configurations on our various platforms, both for builds that users would make by following our instructions and the builds in the profiling system, since those might be built slightly differently (i.e., the cross compilation issue in item 4).
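
As a rough illustration of the first and third items, here is a minimal, self-contained sketch of a dispatch macro with a runtime branch for distributable builds, reused for both the actual call and the diagnostic string so the two cannot drift apart. Everything prefixed `SKETCH_`/`sketch_` is hypothetical and is not the liboqs API: the build flags, the runtime-detection stand-in (a plain settable variable here), and the stub implementations are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical build flag standing in for a distributable ARM build. */
#define SKETCH_DIST_BUILD 1

/* Hypothetical stand-in for a runtime CPU feature query; a real build
 * would ask the CPU or OS instead of reading a settable variable. */
static bool sketch_arm_aes_available = false;
static bool sketch_cpu_has_arm_aes(void) {
    return sketch_arm_aes_available;
}

/* One macro decides between the C and ARM paths.
 * Distributable build: decide at run time.
 * Otherwise: decide at compile time from a (hypothetical) build flag. */
#if defined(SKETCH_DIST_BUILD)
#define SKETCH_C_OR_ARM(stmt_c, stmt_arm) \
    do { if (sketch_cpu_has_arm_aes()) { stmt_arm; } else { stmt_c; } } while (0)
#elif defined(SKETCH_USE_ARM_AES)
#define SKETCH_C_OR_ARM(stmt_c, stmt_arm) do { stmt_arm; } while (0)
#else
#define SKETCH_C_OR_ARM(stmt_c, stmt_arm) do { stmt_c; } while (0)
#endif

/* Stub implementations that only record which path ran. */
static const char *sketch_last_impl = "none";
static void sketch_encrypt_c(void)   { sketch_last_impl = "C"; }
static void sketch_encrypt_arm(void) { sketch_last_impl = "ARM"; }

/* The "real" call site uses the macro... */
void sketch_aes_encrypt(void) {
    SKETCH_C_OR_ARM(sketch_encrypt_c(), sketch_encrypt_arm());
}

/* ...and the diagnostic code uses the very same macro. */
const char *sketch_aes_backend(void) {
    const char *name = "unknown";
    SKETCH_C_OR_ARM(name = "C", name = "ARM");
    return name;
}

/* Sanity check: the reported backend matches the path actually taken. */
void sketch_check_consistency(void) {
    sketch_aes_encrypt();
    assert(strcmp(sketch_aes_backend(), sketch_last_impl) == 0);
}
```

Because the diagnostic function and the call site expand the same macro, a check along the lines of `sketch_check_consistency()` could be run as part of the kind of sanity checking suggested in the last item.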
> In the profiling scripts for ARM (https://github.com/open-quantum-safe/profiling/blob/main/perf/Dockerfile-arm64-start), why is there a toolchain file to enable cross compiling?
This is so that we can cross-build ARM docker images within an x64 CCI build VM. The reason was that "natively" building ARM64 docker images failed on CCI at the time. I will check again whether this is still the case.