XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Internal script change
F16-RMAX microkernel using AVX512 FP16 arithmetic - Add build support for avx512fp16
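For context, an RMAX microkernel computes the maximum over a vector of elements, e.g. as the first pass of a numerically stable softmax. Below is a minimal scalar sketch of that contract, assuming a toolchain with the `_Float16` type; it is illustrative only, and XNNPACK's actual microkernel uses AVX512 FP16 vector intrinsics (e.g. `_mm512_max_ph`) and a different signature:

```c
#include <stddef.h>

// Illustrative scalar sketch of an f16 rmax reduction: writes the maximum
// of `batch` half-precision elements to `*output`. An AVX512 FP16 variant
// performs the same reduction 32 lanes at a time with _mm512_max_ph.
static void f16_rmax_scalar(size_t batch, const _Float16* input, _Float16* output) {
  _Float16 vmax = input[0];
  for (size_t i = 1; i < batch; i++) {
    if (input[i] > vmax) {
      vmax = input[i];
    }
  }
  *output = vmax;
}
```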
Add the default condition missing from `xnnpack_aggregate_library`, which caused the TensorFlow build to fail on s390x
`x8-packw-x16c4` calls `x32-packw-x16`
Switch to the new `rational_9_6` microkernels for `f32-vtanh` on `x86` and `x86_64`.
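The `rational_9_6` name refers to approximating tanh with a rational function, a degree-9 polynomial divided by a degree-6 polynomial, evaluated with a handful of multiply-adds instead of calling `tanhf`. The sketch below uses the classic [7/6] Padé coefficients from the continued-fraction expansion of tanh, not XNNPACK's actual minimax-fitted coefficients, so treat it as an illustration of the technique only:

```c
// Sketch: tanh(x) ~= p(x) / q(x) with an odd numerator and even denominator.
// Coefficients are the [7/6] Pade approximant of tanh; real kernels use
// higher-degree minimax coefficients and first clamp |x| to the range
// where the approximation saturates to +/-1 within float precision.
static float tanh_rational_sketch(float x) {
  const float x2 = x * x;
  // p(x) = x * (135135 + 17325*x^2 + 378*x^4 + x^6)
  const float p = x * (135135.0f + x2 * (17325.0f + x2 * (378.0f + x2)));
  // q(x) = 135135 + 62370*x^2 + 3150*x^4 + 28*x^6
  const float q = 135135.0f + x2 * (62370.0f + x2 * (3150.0f + 28.0f * x2));
  return p / q;
}
```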
Add partial support for building, testing, and benchmarking XNNPACK on Hexagon. Additional work is needed to get this fully working in the Bazel build (notably, connecting to a Qualcomm SDK)...
Enable subconv path for DQ TransposeConv
AVX512FP16 - Add compiler flag guard around FP16 code
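A common way to apply such a guard (a sketch; XNNPACK's actual guard macros may differ): GCC and Clang define `__AVX512FP16__` only when the target enables the ISA (e.g. via `-mavx512fp16`), so the intrinsics path can be compiled conditionally with a scalar fallback. The example assumes a toolchain with `_Float16` and, for brevity, a length that is a multiple of 32:

```c
#include <stddef.h>
#if defined(__AVX512FP16__)
  #include <immintrin.h>
#endif

// Sketch of a compiler-flag guard: the AVX512 FP16 intrinsics are only
// compiled when __AVX512FP16__ is defined, so the file still builds for
// targets without the ISA.
static void scale_f16(const _Float16* input, _Float16* output, size_t n, _Float16 c) {
#if defined(__AVX512FP16__)
  const __m512h vc = _mm512_set1_ph(c);
  for (size_t i = 0; i < n; i += 32) {
    _mm512_storeu_ph(output + i, _mm512_mul_ph(_mm512_loadu_ph(input + i), vc));
  }
#else
  for (size_t i = 0; i < n; i++) {
    output[i] = input[i] * c;  // Scalar fallback path.
  }
#endif
}
```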
Add generic F16 and F32 Mean operator and subgraph support
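A mean reduction averages elements along given axes. Below is a minimal sketch of the f32 case over the innermost axis; the shapes and names are illustrative only, and XNNPACK's operator/subgraph API accepts arbitrary reduction axes:

```c
#include <stddef.h>

// Illustrative sketch: reduce a [rows, cols] f32 tensor to [rows] by
// averaging along the innermost axis. An f16 variant would typically
// convert to f32, accumulate, then convert back (an assumption here,
// but a common practice for half-precision reductions).
static void mean_f32_innermost(const float* input, float* output,
                               size_t rows, size_t cols) {
  for (size_t r = 0; r < rows; r++) {
    float acc = 0.0f;
    for (size_t c = 0; c < cols; c++) {
      acc += input[r * cols + c];
    }
    output[r] = acc / (float) cols;
  }
}
```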