XNNPACK
Enable AVX_VNNI_INT8 instructions for qd8-f32-qc8w-gemm/igemm
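For context, here is a minimal scalar sketch of why a signed-by-signed dot product can help. It assumes the usual instruction semantics: AVX-VNNI's VPDPBUSD multiplies unsigned bytes by signed bytes, while AVX-VNNI-INT8 adds VPDPBSSD, which multiplies signed by signed. The function names below are hypothetical and model a single 32-bit lane; this is not the XNNPACK kernel code, and the bias-and-compensate step is only an assumption about how an unsigned-by-signed path might handle signed activations.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of an
   unsigned x signed 4-way byte dot product (VPDPBUSD-style). */
static int32_t dot4_u8s8(int32_t acc, const uint8_t a[4], const int8_t b[4]) {
  for (int i = 0; i < 4; i++) {
    acc += (int32_t) a[i] * (int32_t) b[i];
  }
  return acc;
}

/* Hypothetical scalar model of one 32-bit lane of a
   signed x signed 4-way byte dot product (VPDPBSSD-style). */
static int32_t dot4_s8s8(int32_t acc, const int8_t a[4], const int8_t b[4]) {
  for (int i = 0; i < 4; i++) {
    acc += (int32_t) a[i] * (int32_t) b[i];
  }
  return acc;
}

/* Assumed workaround when only unsigned x signed is available:
   flip the sign bit of each signed input (equivalent to adding 128),
   then subtract 128 * sum(b) to undo the bias. */
static int32_t dot4_s8s8_via_u8s8(int32_t acc, const int8_t a[4], const int8_t b[4]) {
  uint8_t a_biased[4];
  int32_t sum_b = 0;
  for (int i = 0; i < 4; i++) {
    a_biased[i] = (uint8_t) ((uint8_t) a[i] ^ 0x80);
    sum_b += (int32_t) b[i];
  }
  return dot4_u8s8(acc, a_biased, b) - 128 * sum_b;
}
```

Both paths give the same accumulator value. The signed-by-signed form skips the per-input sign flip, which is one plausible source of the gain reported below.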
$ qd8_f32_qc8w_gemm_bench.exe
qd8_f32_qc8w_gemm_minmax_ukernel_1x8c8:
..._1x8c8__avxvnni_prfm/mobilenet_v1/M:12544/N:32/K:27/real_time 142256 ns 143173 ns 4911 OPS=152.373G/s <-- base
..._1x8c8__avxvnniint8_prfm/mobilenet_v1/M:12544/N:32/K:27/real_time 139960 ns 126325 ns 5566 OPS=154.873G/s <-- opt
qd8_f32_qc8w_gemm_minmax_ukernel_5x8c8:
..._5x8c8__avxvnni_prfm/mobilenet_v1/M:12544/N:32/K:27/real_time 109927 ns 109497 ns 6136 OPS=197.186G/s <-- base
..._5x8c8__avxvnniint8_prfm/mobilenet_v1/M:12544/N:32/K:27/real_time 99856 ns 103637 ns 7086 OPS=217.073G/s <-- opt
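The numbers above can be cross-checked with a small sketch. It assumes the OPS counter is 2*M*N*K operations divided by the reported real time, an assumption that reproduces the logged values, and it expresses the gain as a percentage of the base throughput. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed OPS definition: 2*M*N*K operations per GEMM call, divided by
   the real time in nanoseconds, which yields giga-operations per second. */
static double gemm_gops(uint64_t m, uint64_t n, uint64_t k, double real_time_ns) {
  return (double) (2 * m * n * k) / real_time_ns;
}

/* Relative throughput gain of opt over base, in percent. */
static double speedup_percent(double base_gops, double opt_gops) {
  return (opt_gops / base_gops - 1.0) * 100.0;
}
```

Under that assumption, `gemm_gops(12544, 32, 27, 142256.0)` gives about 152.37, matching the 1x8c8 base line. `speedup_percent(152.373, 154.873)` is about 1.6 for 1x8c8 and `speedup_percent(197.186, 217.073)` is about 10.1 for 5x8c8.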