dot product disabled on aarch32
The change in https://github.com/pytorch/cpuinfo/pull/300 causes a performance regression on Arm.
src/arm/linux/aarch32-isa.c ====
152,153d151
<     } else if (chipset->vendor == cpuinfo_arm_chipset_vendor_unknown) {
<         cpuinfo_log_warning("VDOT instructions disabled: unknown chipset");
chipset->vendor is unknown on many, if not most, new SoCs, and on all of their CPU cores. This includes Qualcomm, Google Tensor, and Exynos.
The problem the code attempts to address is that some Unisoc/MediaTek SoCs ship a faulty Linux kernel that marks udot instructions as illegal on Cortex-A55. sdot is okay. aarch64 udot is okay. As far as I know, sudot, usdot and i8mm are okay. Other vendors either used a newer Linux kernel or patched the bug.
To avoid the issue, XNNPACK doesn't use udot on either aarch32 or aarch64. But there is only one flag to detect both, and that is isa->dot.
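The policy argued for here can be sketched as follows. This is only an illustration under stated assumptions, not cpuinfo's actual code: the enums and the functions kernel_may_trap_udot and report_dot are hypothetical, simplified stand-ins. The idea is that dot product stays enabled when the vendor is unknown and is turned off only for the vendor and core combination described above.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT cpuinfo's real types. */
enum vendor { VENDOR_UNKNOWN, VENDOR_QUALCOMM, VENDOR_MEDIATEK, VENDOR_UNISOC };
enum uarch { UARCH_CORTEX_A55, UARCH_CORTEX_A715, UARCH_CORTEX_X3 };

/* Assumption: only Unisoc/MediaTek kernels on Cortex-A55 are affected,
 * as described in the text above. */
static bool kernel_may_trap_udot(enum vendor v, enum uarch u) {
    return u == UARCH_CORTEX_A55 &&
           (v == VENDOR_MEDIATEK || v == VENDOR_UNISOC);
}

/* hw_has_dot: whether the hardware advertises the dot product extension.
 * An unknown vendor does not disable the feature by itself. */
static bool report_dot(bool hw_has_dot, enum vendor v, enum uarch u) {
    return hw_has_dot && !kernel_may_trap_udot(v, u);
}
```

With this sketch, report_dot(true, VENDOR_UNKNOWN, UARCH_CORTEX_A715) is true, so a phone whose SoC name is reported as Unknown keeps the dot product path, while report_dot(true, VENDOR_MEDIATEK, UARCH_CORTEX_A55) is false.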
The impact is a performance regression of roughly 3x on a Samsung S23 (Qualcomm) medium core (A720):
Was: QS8MobileNetV2/real_time 10695 us
Now: QS8MobileNetV2/process_time/real_time 32577 us
Samsung S23 Qualcomm
SoC name: Unknown
Microarchitectures:
    1x Cortex-X3
    4x Cortex-A715
    3x Cortex-A510
Cores:
    0: 1 processor (0), ARM Cortex-X3
    1: 1 processor (1), ARM Cortex-A715
    2: 1 processor (2), ARM Cortex-A715
    3: 1 processor (3), ARM Cortex-A715
    4: 1 processor (4), ARM Cortex-A715
    5: 1 processor (5), ARM Cortex-A510
    6: 1 processor (6), ARM Cortex-A510
    7: 1 processor (7), ARM Cortex-A510
Samsung S22 Exynos
SoC name: Unknown
Microarchitectures:
    1x Cortex-X2
    3x Cortex-A710
    4x Cortex-A510
Cores:
    0: 1 processor (0), ARM Cortex-X2
    1: 1 processor (1), ARM Cortex-A710
    2: 1 processor (2), ARM Cortex-A710
    3: 1 processor (3), ARM Cortex-A710
    4: 1 processor (4), ARM Cortex-A510
    5: 1 processor (5), ARM Cortex-A510
    6: 1 processor (6), ARM Cortex-A510
    7: 1 processor (7), ARM Cortex-A510