Nathan Moinvaziri
From what I'm seeing, the PMULL method will not perform well on Cortex-A series cores because they only have one PMULL lane shared between multiple execution units.
All Apple ARM CPUs have multiple PMULL lanes, so the easiest thing to do would be to enable this only for macOS. Or we could have a function that reads...
https://gist.github.com/nmoinvaz/b56489b6643156df798ea8f04d1ceefd @dead2 what do you think?
Reading MIDR_EL1 is not allowed from user mode; at least on M3 it causes a segfault.
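For reference, a minimal sketch of a user-mode check that avoids MIDR_EL1 entirely. It only reports whether PMULL is present, not how many PMULL lanes the core has, so it does not solve the fast-vs-slow question by itself. The function name `cpu_has_pmull` is hypothetical, and the sysctl key `hw.optional.arm.FEAT_PMULL` is assumed to be available on the macOS versions we care about:

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__APPLE__)
#  include <sys/types.h>
#  include <sys/sysctl.h>
#elif defined(__linux__) && defined(__aarch64__)
#  include <sys/auxv.h>
#  include <asm/hwcap.h>
#endif

/* Hypothetical helper: returns true if the OS reports PMULL support.
 * Never touches MIDR_EL1, so it cannot fault in user mode. */
static bool cpu_has_pmull(void) {
#if defined(__APPLE__) && defined(__aarch64__)
    int val = 0;
    size_t len = sizeof(val);
    /* Assumed key name; a missing key is treated as "not supported". */
    if (sysctlbyname("hw.optional.arm.FEAT_PMULL", &val, &len, NULL, 0) != 0)
        return false;
    return val != 0;
#elif defined(__linux__) && defined(__aarch64__)
    /* The kernel exposes CPU features through the auxiliary vector. */
    return (getauxval(AT_HWCAP) & HWCAP_PMULL) != 0;
#else
    /* Other platforms: report no PMULL so callers use another path. */
    return false;
#endif
}
```

Presence alone would still select PMULL on a Cortex-A core with a single lane, so this is at most one half of a runtime check.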
@KungFuJesus Did you see my last commit?
If we only enable this on macOS, it can be a compile-time `#ifdef`.
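Roughly what I have in mind, as a sketch only. The macro `USE_FAST_PMULL` and the function `selected_crc32_variant` are made-up names for illustration, not existing symbols in the tree:

```c
/* Hypothetical macro: enable the PMULL path only when building for
 * Apple Silicon, where multiple PMULL lanes are assumed. */
#if defined(__APPLE__) && defined(__aarch64__)
#  define USE_FAST_PMULL 1
#else
#  define USE_FAST_PMULL 0
#endif

/* Hypothetical helper: names the variant chosen at compile time. */
static const char *selected_crc32_variant(void) {
#if USE_FAST_PMULL
    return "pmull";
#else
    return "fallback";
#endif
}
```

Since the choice is fixed by the preprocessor, there is no runtime detection cost and no need to read any CPU ID register.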
This is Apple M3, macOS 15.6.1.
I am wondering if I should remove the `crc32_armv8_pmull` implementation, since I think finding an aarch64 CPU with fast PMULL (multiple PMULL execution units) without EOR3 (crypto extensions) would...
I asked an AI (I know, sometimes unreliable) and it couldn't find any. But if Dead2's RPI5 with Arm Cortex-A76 (missing crypto extensions) doesn't perform well, I am not sure...