XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
This PR is follow-up work for https://github.com/google/XNNPACK/pull/6771. Performance data for reference: https://github.com/google/XNNPACK/pull/6739
Copybara import of the project:
-- 423f8fe98268e10270517c51a26f7dab74a0946a by kaustubh-raste: Resolved conflicts
FUTURE_COPYBARA_INTEGRATE_REVIEW=https://github.com/google/XNNPACK/pull/6533 from imaginationtech:img_patch11 9352759cdd40ef60502de3bf38d1e4db0cb458e0
Use the header-driven approach to drive the packw/packb/packx/zerob tests and benchmarks; remove the yaml files, generation scripts, and "do not edit" comments as appropriate.
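For context, the "header-driven approach" described here is essentially an X-macro scheme: a single header lists each micro-kernel once, and every consumer (test runner, benchmark) defines the listing macro differently before expanding it, so the separate YAML files and generation scripts are no longer needed. The sketch below is a minimal, self-contained illustration of that pattern only; the macro name, kernel names, and parameters are hypothetical and do not mirror XNNPACK's actual headers.

```c
#include <stdio.h>

// Hypothetical kernel list. In a real header-driven setup this listing would
// live in its own .h file and be #include'd by both the tests and the
// benchmarks, each supplying its own definition of the X(...) macro.
#define HYPOTHETICAL_PACKW_KERNELS(X)                        \
  X(xnn_x32_packw_gemm_goi_ukernel_x2__scalar, /*nr=*/2)     \
  X(xnn_x32_packw_gemm_goi_ukernel_x4__scalar, /*nr=*/4)

// Expand the list once to declare one "test" per kernel. Here the test body
// just prints what would be instantiated; a real test would call the kernel.
#define DECLARE_TEST(ukernel, nr) \
  static void test_##ukernel(void) { printf("test %s (nr=%d)\n", #ukernel, nr); }
HYPOTHETICAL_PACKW_KERNELS(DECLARE_TEST)
#undef DECLARE_TEST

int main(void) {
  // Expand the same list again to run every generated test.
  #define RUN_TEST(ukernel, nr) test_##ukernel();
  HYPOTHETICAL_PACKW_KERNELS(RUN_TEST)
  #undef RUN_TEST
  return 0;
}
```

Adding a new packw/packb/packx/zerob variant then means adding one line to the kernel-list header, and the tests and benchmarks pick it up automatically.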
More accurate set of coefficients for `f32-vtanh`. The new kernels are 5-10% slower, but are now accurate to within 4 ulp, without exception.
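As a reminder of what "accurate to within 4 ulp" means in practice, the sketch below shows one common way such a bound is checked: compare the single-precision approximation against a double-precision reference and express the difference in units of float spacing at the reference value. `approx_tanhf` is only a stand-in that wraps `tanhf`, not the `f32-vtanh` kernel, and the sweep range and ulp convention are assumptions.

```c
#include <math.h>
#include <stdio.h>

// Stand-in for the kernel under test; a real harness would call the
// f32-vtanh micro-kernel here instead.
static float approx_tanhf(float x) {
  return tanhf(x);
}

// Error in ulp: absolute difference from the double-precision reference,
// divided by the spacing between adjacent floats at the reference value.
static double ulp_error(float result, double reference) {
  const float ref = (float) reference;
  const double ulp = nextafterf(ref, INFINITY) - ref;
  return fabs((double) result - reference) / ulp;
}

int main(void) {
  double max_err = 0.0;
  // Sweep a range where tanh transitions from -1 to +1; step is arbitrary.
  for (float x = -9.0f; x <= 9.0f; x += 0x1.0p-12f) {
    const double err = ulp_error(approx_tanhf(x), tanh((double) x));
    if (err > max_err) max_err = err;
  }
  printf("max error: %.3f ulp\n", max_err);
  return 0;
}
```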
Copybara import of the project:
-- 28f3b893272b739b9283e195b4448c305bfa978f by Chenyu Yang: add qs8 qc8w gemm avxvnni prfm e2e bench
FUTURE_COPYBARA_INTEGRATE_REVIEW=https://github.com/google/XNNPACK/pull/6898 from Ch3nYuY:qs8-gemm-avxvnni-prfm-e2e-bench 28f3b893272b739b9283e195b4448c305bfa978f
Fix a dropped benchmark.