XNNPACK issues

Count leading zeros

- Implements count leading zeros
- Arm: has direct instructions
- x86/x64: no direct instruction; converted to double and extracted the exponent to get clz, and added condition...

umadevimcw
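The exponent-extraction approach described for x86/x64 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `clz32_via_double` is hypothetical, and the explicit zero check is an assumption about what the truncated "added condition" covers. It also assumes IEEE 754 binary64 doubles.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not an XNNPACK function): counts leading zeros of a
// 32-bit value without a dedicated instruction. Every uint32_t converts to
// double exactly, so the biased exponent of the converted value gives the
// index of the highest set bit.
static uint32_t clz32_via_double(uint32_t x) {
  if (x == 0) {
    return 32;  // Assumed special case: zero has no set bit.
  }
  const double d = (double) x;
  uint64_t bits;
  memcpy(&bits, &d, sizeof(bits));  // Bit-cast without aliasing violations.
  const uint32_t biased_exponent = (uint32_t) ((bits >> 52) & 0x7FF);
  const uint32_t highest_bit = biased_exponent - 1023;  // floor(log2(x))
  return 31 - highest_bit;
}
```

Using double rather than float matters here: a float has only 24 bits of precision, so large inputs could round up to the next power of two and yield an exponent that is off by one.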

Convert packing.c to C++ packing.cc

We'd like to re-express a lot of these operations as compositions of helper functions, which is a lot easier to do if we use C++...

copybara-service[bot]

add f32-gemm-5x16-minmax-fma3-broadcast-prfm microkernel

2

Prefetching the weights into the L1 cache in `xnn_f32_gemm_minmax_ukernel_5x16__fma3_broadcast` yields an average performance improvement of over 3% across the MobileNet V1, V2, V3_Large, and V3_Small models.

```
Benchmark    Time    CPU    Iterations
...
```

Ch3nYuY
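As a rough illustration of weight prefetching in a broadcast-style GEMM loop, the scalar sketch below issues a prefetch for a weight row a few iterations ahead of the one being consumed. It is not the microkernel from the PR: the function name, the `PREFETCH_ROWS` distance, and the use of the GCC/Clang `__builtin_prefetch` builtin are all assumptions made for the example.

```c
#include <stddef.h>

// Hypothetical prefetch distance, in weight rows. The offset used by the
// real microkernel is not stated in the source.
#define PREFETCH_ROWS 4

// Hypothetical scalar sketch: acc[n] = sum over k of a[k] * w[k * nr + n].
// Each a[k] is "broadcast" across one row of nr weights.
static void gemm_row_with_prefetch(size_t kc, size_t nr, const float* a,
                                   const float* w, float* acc) {
  for (size_t n = 0; n < nr; n++) {
    acc[n] = 0.0f;
  }
  for (size_t k = 0; k < kc; k++) {
    const float* w_row = w + k * nr;
    if (k + PREFETCH_ROWS < kc) {
      // Read prefetch (rw = 0) with high temporal locality (locality = 3).
      // The bounds check keeps the address inside the weights array.
      __builtin_prefetch(w_row + PREFETCH_ROWS * nr, 0, 3);
    }
    const float a_k = a[k];
    for (size_t n = 0; n < nr; n++) {
      acc[n] += a_k * w_row[n];
    }
  }
}
```

The prefetch is only a hint, so the result is identical with or without it. Whether it helps depends on the prefetch distance and the target CPU, which is why the PR reports measured benchmark numbers.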

Convert the maxpool and avgpool tests to C++ parameterized tests; remove the no-longer-needed yaml and generation scripts.

copybara-service[bot]

enable AVX_VNNI_INT8 instruction for qs8-qc8w-gemm/igemm

1

The new `AVX_VNNI_INT8` instructions avoid the XOR operation in the qs8-qc8w gemm/igemm kernels, resulting in a ~3% performance improvement on the MobileNet V1 and V2 int8 models.

```bash
Benchmark    Time    CPU    Iterations
...
```

xujuntwt95329
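For context on the XOR mentioned above: the earlier VNNI dot-product instruction (`VPDPBUSD`) multiplies unsigned bytes by signed bytes, so signed activations have to be shifted into the unsigned range first, while AVX-VNNI-INT8 adds a signed-by-signed form (`VPDPBSSD`). The scalar sketch below shows the arithmetic identity involved. The function names are hypothetical, and compensating inside the same function is a simplification: a real kernel would more likely fold that term into the bias ahead of time.

```c
#include <stddef.h>
#include <stdint.h>

// Reference: signed x signed dot product, which a signed-by-signed
// instruction can compute directly.
static int32_t dot_s8s8(const int8_t* a, const int8_t* w, size_t n) {
  int32_t acc = 0;
  for (size_t i = 0; i < n; i++) {
    acc += (int32_t) a[i] * (int32_t) w[i];
  }
  return acc;
}

// Emulation using an unsigned x signed multiply: flipping the sign bit of
// each activation (a ^ 0x80) equals a + 128 as an unsigned byte, so the
// extra 128 * sum(w) must be subtracted afterwards.
static int32_t dot_u8s8_with_xor(const int8_t* a, const int8_t* w, size_t n) {
  int32_t acc = 0;
  int32_t w_sum = 0;
  for (size_t i = 0; i < n; i++) {
    const uint8_t a_u8 = (uint8_t) ((uint8_t) a[i] ^ 0x80);
    acc += (int32_t) a_u8 * (int32_t) w[i];
    w_sum += (int32_t) w[i];
  }
  return acc - 128 * w_sum;
}
```

Both functions return the same value for any input. The signed-by-signed path simply skips the per-element XOR.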

Introduce flags for qb4 scale format in xnn_define_blockwise_quantized_tensor_value

This change extends the xnn_define_blockwise_quantized_tensor_value API to accept flags to control block scale format, though only bf16 is currently supported. The intent of this change is to allow for other...

GregoryComer

Xnn f32 vrem

umadevimcw

xnn_table_exp2minus_k_over_16 has inconsistent types

7

The type of xnn_table_exp2minus_k_over_16 varies between declarations. It's defined as `uint32_t[64]` [here](https://github.com/google/XNNPACK/blob/4f09c54b7445c3c23e84dfe6fc9cf1a56392f604/src/tables/exp2minus-k-over-64.c#L12) but other declarations have various different types. For example, [here](https://github.com/google/XNNPACK/blob/4f09c54b7445c3c23e84dfe6fc9cf1a56392f604/src/amalgam/gen/sse2.c#L2434) it's declared as `float[64]`. This is illegal in...

aiusepsi
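One way to avoid mismatched declarations of this kind is to give the table a single type everywhere and convert at the point of use. The sketch below is only an illustration of that idea, not the fix adopted in XNNPACK: `example_table` and `load_table_entry_as_float` are hypothetical names, the two entries are made-up bit patterns (1.0f and 0.5f), and it assumes `float` is IEEE 754 binary32.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical two-entry table holding float bit patterns: 1.0f and 0.5f.
// Declared with one type only, so every translation unit agrees on it.
static const uint32_t example_table[2] = {
  UINT32_C(0x3F800000),
  UINT32_C(0x3F000000),
};

// Reads one entry as a float. memcpy performs the bit-cast without
// redeclaring the table under a different type.
static float load_table_entry_as_float(const uint32_t* table, size_t index) {
  float value;
  memcpy(&value, &table[index], sizeof(value));
  return value;
}
```

Compilers typically lower the `memcpy` to a plain load, so the conversion should not add runtime cost.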

Xnn s32 vminimum

s32 minimum op support

umadevimcw

Xnn bitwise ops

Bitwise ops implementations:

- AND
- OR
- XOR

umadevimcw
