XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

Results 342 XNNPACK issues

Implements count leading zeros (clz). Arm has direct instructions. x86/x64 has no direct instruction, so the value is converted to double and the exponent is extracted to get clz, with an added condition...
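The exponent-extraction approach described in this entry can be sketched in scalar C. This is only an illustration under stated assumptions, not the code from the change itself: the helper name `clz32_via_double` is hypothetical, and the zero check stands in for the "added condition", which the truncated description does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not XNNPACK's code): count the leading zeros of a
   32-bit value by converting it to double and reading the exponent field. */
static inline uint32_t clz32_via_double(uint32_t x) {
  assert(sizeof(double) == sizeof(uint64_t));
  if (x == 0) {
    return 32;  /* no set bit, so the exponent trick does not apply */
  }
  const double d = (double) x;  /* exact: any uint32_t fits in 53 mantissa bits */
  uint64_t bits;
  memcpy(&bits, &d, sizeof(bits));  /* well-defined way to read the encoding */
  /* The biased exponent sits in bits 52..62; removing the bias of 1023
     gives floor(log2(x)), the index of the highest set bit. */
  const uint32_t exponent = (uint32_t) ((bits >> 52) & 0x7FF) - 1023;
  return 31 - exponent;
}
```

Because the conversion from a 32-bit integer to double is exact, no rounding can push the exponent up by one, which would be a concern if single precision were used instead.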

Convert packing.c to C++ packing.cc. We'd like to re-express a lot of these operations as compositions of helper functions, which is a lot easier to do if we use C++...

Prefetched the weights into the L1 cache in `xnn_f32_gemm_minmax_ukernel_5x16__fma3_broadcast`, resulting in an average performance improvement of over 3% across the MobileNet V1/V2/V3_Large/V3_Small models. Benchmark output (columns: Benchmark, Time, CPU, Iterations)...
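As a rough illustration of weight prefetching in an inner loop, the sketch below uses the GCC/Clang builtin `__builtin_prefetch` in a plain dot product. It is not the FMA3 microkernel named above; the function name, the prefetch distance, and the assumption of a 64-byte cache line are all hypothetical and would need tuning on real hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in floats; a real kernel would tune this. */
#define PREFETCH_AHEAD 64

/* Dot product that requests upcoming weights ahead of their use. */
float dot_with_weight_prefetch(const float* w, const float* x, size_t n) {
  assert(w != NULL && x != NULL);
  float acc = 0.0f;
  for (size_t k = 0; k < n; k++) {
    /* Issue one prefetch per 16 floats (64 bytes, assumed one cache line),
       and only for addresses that are still inside the weight array. */
    if (k % 16 == 0 && k + PREFETCH_AHEAD < n) {
      /* Arguments: address, 0 = read access, 3 = keep in all cache levels. */
      __builtin_prefetch(&w[k + PREFETCH_AHEAD], 0, 3);
    }
    acc += w[k] * x[k];
  }
  return acc;
}
```

The prefetch is only a hint, so the result is the same with or without it; any benefit depends on the access pattern and on how far ahead the hint is issued.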

Convert the maxpool and avgpool tests to C++ parameterized tests; remove the no-longer-needed yaml and generation scripts.

The new `AVX_VNNI_INT8` instructions can avoid the XOR operation in the qs8-qc8w gemm/igemm kernels, resulting in a ~3% performance improvement on MobileNet V1 and V2 int8 models. Benchmark output (columns: Benchmark, Time, CPU, Iterations)...
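For context on why an XOR can appear at all: the AVX-VNNI dot-product instruction `vpdpbusd` multiplies unsigned bytes by signed bytes, so signed activations have to be biased into the unsigned range first, while AVX-VNNI-INT8 adds a signed-by-signed form (`vpdpbssd`). The scalar sketch below shows the arithmetic identity only. It is not the kernel code, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: signed x signed dot product, the form a signed-by-signed
   instruction can accumulate directly. */
int32_t dot_s8s8(const int8_t* a, const int8_t* w, size_t n) {
  assert(a != NULL && w != NULL);
  int32_t acc = 0;
  for (size_t i = 0; i < n; i++) {
    acc += (int32_t) a[i] * (int32_t) w[i];
  }
  return acc;
}

/* Unsigned x signed path: flipping the sign bit of each activation with
   XOR 0x80 maps a to a + 128 in [0, 255]; the extra 128 * sum(w) is then
   subtracted to recover the signed result. */
int32_t dot_u8s8_with_xor(const int8_t* a, const int8_t* w, size_t n) {
  assert(a != NULL && w != NULL);
  int32_t acc = 0;
  int32_t w_sum = 0;
  for (size_t i = 0; i < n; i++) {
    const uint8_t a_biased = (uint8_t) a[i] ^ 0x80;
    acc += (int32_t) a_biased * (int32_t) w[i];
    w_sum += w[i];
  }
  return acc - 128 * w_sum;
}
```

Both functions return the same value for any input; the second simply does extra work per element, which is the overhead a signed-by-signed instruction removes.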

This change extends the xnn_define_blockwise_quantized_tensor_value API to accept flags to control block scale format, though only bf16 is currently supported. The intent of this change is to allow for other...
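To make the bf16 scale format concrete: a bf16 value is the upper 16 bits of an IEEE 754 single-precision float, so decoding it is a 16-bit shift. The sketch below decodes one such scale and applies it to a block of values. Both helpers are hypothetical and are not part of the XNNPACK API, and `int8_t` values are used purely to keep the example simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode bf16: place the 16 stored bits in the top half of a float. */
static inline float bf16_to_f32(uint16_t h) {
  const uint32_t bits = (uint32_t) h << 16;
  float f;
  memcpy(&f, &bits, sizeof(f));
  return f;
}

/* Hypothetical blockwise dequantization: one bf16 scale shared by a
   block of quantized values. */
void dequantize_block(const int8_t* q, uint16_t scale_bf16, float* out,
                      size_t block_size) {
  assert(q != NULL && out != NULL);
  const float scale = bf16_to_f32(scale_bf16);
  for (size_t i = 0; i < block_size; i++) {
    out[i] = (float) q[i] * scale;
  }
}
```

Storing scales in 16 bits halves their footprint relative to float32 while keeping the same exponent range, at the cost of mantissa precision.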

The type of xnn_table_exp2minus_k_over_16 varies between declarations. It's defined as `uint32_t[64]` [here](https://github.com/google/XNNPACK/blob/4f09c54b7445c3c23e84dfe6fc9cf1a56392f604/src/tables/exp2minus-k-over-64.c#L12) but other declarations have various different types. For example, [here](https://github.com/google/XNNPACK/blob/4f09c54b7445c3c23e84dfe6fc9cf1a56392f604/src/amalgam/gen/sse2.c#L2434) it's declared as `float[64]`. This is illegal in...
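One way to avoid this kind of mismatch, sketched here under assumptions rather than taken from the repository, is to give the table a single integer type everywhere and reinterpret entries as floats at the point of use. In C, declaring the same object with incompatible types in different translation units is undefined behavior, whereas copying the bytes with `memcpy` is well defined. The table contents and helper name below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table: the definition and every declaration use uint32_t. */
static const uint32_t example_table[4] = {
  0x3F800000u, /* bit pattern of 1.0f */
  0x3F000000u, /* bit pattern of 0.5f */
  0x3E800000u, /* bit pattern of 0.25f */
  0x3E000000u, /* bit pattern of 0.125f */
};

/* Read one entry as a float without redeclaring the table's type. */
static inline float table_entry_as_float(uint32_t index) {
  assert(index < 4);
  float value;
  memcpy(&value, &example_table[index], sizeof(value));
  return value;
}
```

Compilers typically lower the `memcpy` to a plain load, so the helper should cost nothing at run time.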

s32 minimum op support

Bitwise ops implementations: AND, OR, XOR