XNNPACK issues

Fix crash in Android Emulator with armeabi-v7a.

Fix a TensorFlow Lite crash in `xnn_create_convert_nc_qs8` caused by a null pointer dereference in the Android Emulator on armeabi-v7a.

API92

QB4W MLAL GEMM Kernels

1

This pull request adds blockwise 4-bit (qb4w) GEMM microkernels targeting ARM Neon via the MLAL instruction family. Note: This PR includes one commit from https://github.com/google/XNNPACK/pull/6557 (Test generation update for qb4w)....

GregoryComer

QB4W SSE2/SSE41 GEMM Kernels

1

This pull request adds blockwise 4-bit (qb4w) GEMM microkernels targeting the x86 SSE2 and SSE4.1 instruction families. Note: This PR includes one commit from https://github.com/google/XNNPACK/pull/6557 (Test generation update for qb4w). I'm...

mcr229

Add f32 rsum RVV implementation microkernels, tests and config changes for LMUL 2, 4 and 8

8

KaustubhIMG

Enable RVV GEMM/IGEMM 7 x m4 in operator config

3

This PR enables RVV GEMM/IGEMM/X32-PACKW in the GEMM config, which in turn enables the RVV implementation in the operator API.

bhbruce

QB4W AVX GEMM Kernels

This pull request adds blockwise 4-bit (qb4w) GEMM microkernels targeting the x86 AVX instruction family. Note: Since AVX1 microkernels share the same meta kernels as the SSE2/4.1 kernels, this PR sits on top...

mcr229

QB4W AVX2 GEMM Kernels

2

This pull request adds blockwise 4-bit (qb4w) GEMM microkernels targeting x86 via the AVX2 instruction family. Note: This PR includes one commit from https://github.com/google/XNNPACK/pull/6557 (Test generation update for qb4w). I'm...

GregoryComer

4x16s4 fp32-gemm kernel has better performance than the default (5x16) kernel on Meteor Lake

1

XNNPACK by default uses the 5x16 fp32-gemm kernel for `x86_fma3`, but we found that the 4x16s4 kernel performs better on a `meteor lake` CPU (`Intel(R) Core(TM) Ultra 7 155H`) | benchmark |...

xujuntwt95329

Failed to compile XNNPACK on a WoA (Windows on ARM) device.

5

It seems part of the code hasn't been compiled. Any idea on how to fix it? Thanks in advance! ``` FAILED: subgraph-size-test.exe C:\windows\system32\cmd.exe /C "cd . && C:\Programs\Python\Python311-arm64\Lib\site-packages\cmake\data\bin\cmake.exe -E vs_link_exe...

zhanweiw

Added standalone rsum HVX

copybara-service[bot]
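
For readers unfamiliar with the "blockwise 4-bit (qb4w) GEMM" mentioned in several entries above, here is a minimal scalar sketch of the general idea: weights are stored as 4-bit values, and each block of consecutive values along the K dimension shares one scale. Everything in it is an assumption made for illustration, not XNNPACK's actual packed layout or API: the function name `qb4w_gemm_ref`, float activations, two nibbles per byte with the low nibble first, a fixed zero point of 8, and float scales. It also assumes K is even and a multiple of `block_size`.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical reference: c[m][n] = sum over k of a[m][k] * dequant(w[n][k]).
// Assumed layout (not XNNPACK's packed format): each weight row holds K
// 4-bit values, two per byte, low nibble first; the zero point is 8; there
// is one float scale per block of `block_size` consecutive k values.
static void qb4w_gemm_ref(size_t M, size_t N, size_t K, size_t block_size,
                          const float* a,       // M x K, row-major
                          const uint8_t* w,     // N x (K / 2) packed nibbles
                          const float* scales,  // N x (K / block_size)
                          float* c)             // M x N, row-major
{
  const size_t blocks_per_row = K / block_size;
  for (size_t m = 0; m < M; m++) {
    for (size_t n = 0; n < N; n++) {
      float acc = 0.0f;
      for (size_t b = 0; b < blocks_per_row; b++) {
        // Accumulate the unscaled products for one block, then apply the
        // block's scale once.
        float block_acc = 0.0f;
        for (size_t i = 0; i < block_size; i++) {
          const size_t k = b * block_size + i;
          const uint8_t byte = w[n * (K / 2) + k / 2];
          const int nibble = (k % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
          block_acc += a[m * K + k] * (float)(nibble - 8);
        }
        acc += block_acc * scales[n * blocks_per_row + b];
      }
      c[m * N + n] = acc;
    }
  }
}
```

Applying the scale once per block rather than once per element is what makes the blockwise scheme cheap. A vectorized microkernel would presumably keep the same structure and replace the inner loop with SIMD multiply-accumulate instructions.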
