XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

342 XNNPACK issues

Refactor `xnn_tensor_get_size` into a helper function `xnn_datatype_get_size_bits`. This also makes things a bit more generic (if we add 2-bit datatypes, it should just work, assuming they align the same way...
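To illustrate why a per-datatype "size in bits" helper generalizes to sub-byte types, here is a minimal sketch. It is not XNNPACK's code: the enum, the `example_` function names, and the dense-packing assumption (elements packed back to back, total rounded up to a whole byte) are all hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical datatype enum for illustration; not XNNPACK's actual enum. */
enum example_datatype {
  example_datatype_fp32,
  example_datatype_fp16,
  example_datatype_int8,
  example_datatype_int4,
  example_datatype_int2  /* hypothetical 2-bit type */
};

/* Size of one element in bits, or 0 for an unknown datatype. */
static size_t example_datatype_size_bits(enum example_datatype t) {
  switch (t) {
    case example_datatype_fp32: return 32;
    case example_datatype_fp16: return 16;
    case example_datatype_int8: return 8;
    case example_datatype_int4: return 4;
    case example_datatype_int2: return 2;
  }
  return 0;
}

/* Bytes needed for num_elements densely packed elements (assumption:
 * no per-row padding), rounded up to a whole byte. */
static size_t example_tensor_size_bytes(enum example_datatype t,
                                        size_t num_elements) {
  const size_t total_bits = example_datatype_size_bits(t) * num_elements;
  return (total_bits + 7) / 8;
}
```

Because the byte count is derived from the bit count, supporting a new sub-byte type in this sketch only requires one more `case` in the bits helper; the size computation itself does not change.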

Add a script to automate generation of the `vunary` benchmarks from the microkernels' test specification. This effectively adds benchmarks for `f32-vabs`, `f32-vneg`, and `f32-vsqr`.

QS8/QD8 GEMM/IGEMM on AVX2: use 2x8c8 instead of 3x8c8. The 3x8 variant spills registers and is slower than 2x8.

Add iterative and non-iterative `vrsqrt` microkernels for SSE.
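The summary does not say how the two variants are built. One plausible reading, sketched below, is that the non-iterative form computes an exact square root followed by a division, while the iterative form starts from the hardware approximation `_mm_rsqrt_ps` and refines it with one Newton-Raphson step. The function names are hypothetical and this is not the actual XNNPACK microkernel code; only the SSE intrinsics are real.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Non-iterative variant (assumed): 1 / sqrt(x) via sqrt and divide. */
static __m128 example_rsqrt_sqrt_div_ps(__m128 vx) {
  return _mm_div_ps(_mm_set1_ps(1.0f), _mm_sqrt_ps(vx));
}

/* Iterative variant (assumed): approximate reciprocal square root,
 * then one Newton-Raphson step: y1 = y0 * (1.5 - 0.5 * x * y0 * y0). */
static __m128 example_rsqrt_newton_ps(__m128 vx) {
  const __m128 vy0 = _mm_rsqrt_ps(vx);  /* roughly 12 bits of precision */
  const __m128 vhalf_x = _mm_mul_ps(_mm_set1_ps(0.5f), vx);
  const __m128 vt = _mm_mul_ps(vhalf_x, _mm_mul_ps(vy0, vy0));
  return _mm_mul_ps(vy0, _mm_sub_ps(_mm_set1_ps(1.5f), vt));
}

/* Apply either variant to n floats; use_newton selects the iterative form. */
static void example_vrsqrt(size_t n, const float* x, float* y, int use_newton) {
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    const __m128 vx = _mm_loadu_ps(x + i);
    const __m128 vy = use_newton ? example_rsqrt_newton_ps(vx)
                                 : example_rsqrt_sqrt_div_ps(vx);
    _mm_storeu_ps(y + i, vy);
  }
  /* Remainder: compute on lane 0 only and store a single float.
   * The upper lanes hold zeros and their results are discarded. */
  for (; i < n; i++) {
    const __m128 vx = _mm_set_ss(x[i]);
    const __m128 vy = use_newton ? example_rsqrt_newton_ps(vx)
                                 : example_rsqrt_sqrt_div_ps(vx);
    _mm_store_ss(y + i, vy);
  }
}
```

Calling `example_vrsqrt(n, x, y, 0)` and `example_vrsqrt(n, x, y, 1)` on the same positive inputs should give results that agree to within a small relative error. Which variant is faster depends on the relative cost of `sqrtps`/`divps` versus `rsqrtps` plus the extra multiplies on the target CPU.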

Add an XNNPACK delegate for the `Rsqrt` node in TFLite.