fbarchard comments

Results: 83 comments of fbarchard

Can RVV kernels be enabled by default?

To configure microkernels we need to run benchmarks on RVV hardware. The kernels support m1, m2, m4 and m8 and normally we'd run the benchmark, select the fastest and plug...
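The selection step described above can be sketched in a few lines of C. This is only an illustration: `pick_fastest_lmul` and `kLmulChoices` are hypothetical names, not part of XNNPACK, and the timings are assumed to come from whatever benchmark was run on the RVV hardware.

```c
#include <assert.h>
#include <stddef.h>

/* The four LMUL settings named in the comment: m1, m2, m4, m8. */
const int kLmulChoices[4] = {1, 2, 4, 8};

/* Hypothetical helper: returns the LMUL with the lowest measured time.
 * ns_per_op[i] is the benchmark result for kLmulChoices[i].
 * On a tie, the smaller LMUL wins because only a strictly lower time
 * replaces the current best. */
int pick_fastest_lmul(const double ns_per_op[4]) {
  size_t best = 0;
  for (size_t i = 1; i < 4; i++) {
    if (ns_per_op[i] < ns_per_op[best]) {
      best = i;
    }
  }
  return kLmulChoices[best];
}
```

The result would then be used to choose which kernel variant to plug into the configuration; how that is wired up is not shown in the comment and is left out here.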

Will you plan to support int8 per-channel quantization for the linear op?

By linear do you mean QD8 GEMM with linear instead of minmax?

RISC-V cpuinfo build error

I ran into the same issue on Intel I think, and hacked a solution by doing the syscall in assembly. ``` #if XNN_ARCH_X86_64 && defined(__linux__) ssize_t xnn_syscall(size_t rax, size_t rdi,...

external/XNNPACK: optimization

I am, but the time frame is roughly the end of the year. I plan to focus on a full set of fp32 microkernels first.

Load-from-misaligned-address failures on Hexagon simulator

8-bit (or 4-bit) weights can cause an alignment issue for bias and scale, which are 32-bit elements and usually vectors. dwconv is an igemm. igemm is a...
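To illustrate how byte-sized weights can push a following 32-bit bias off alignment, here is a minimal sketch in C. The layout (an int32 bias per channel followed by int8 weights) and all function names are assumptions made for the example, not XNNPACK's actual packing format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round n up to a multiple of q (q is a power of two). */
size_t round_up_pow2(size_t n, size_t q) {
  return (n + q - 1) & ~(q - 1);
}

/* Assumed layout for one packed group: `channels` int32 biases followed by
 * `channels * weights_per_channel` int8 weights. Returns the byte size of
 * the group, which is also the offset of the next group's bias. */
size_t unpadded_group_size(size_t channels, size_t weights_per_channel) {
  return channels * sizeof(int32_t) + channels * weights_per_channel * sizeof(int8_t);
}

/* Same group, padded so the next group's bias starts on a 4-byte boundary. */
size_t padded_group_size(size_t channels, size_t weights_per_channel) {
  return round_up_pow2(unpadded_group_size(channels, weights_per_channel),
                       sizeof(int32_t));
}

/* Alignment-safe load: memcpy avoids a direct misaligned int32 access. */
int32_t load_i32_unaligned(const void* p) {
  int32_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}
```

With one channel and three weights the unpadded group is 7 bytes, so the next bias would sit at an odd offset; padding the group to 8 bytes, or loading through `load_i32_unaligned`, are two ways to sidestep the fault.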

Load-from-misaligned-address failures on Hexagon simulator

If the multipass kernel specifically has the issue but single pass works, it's likely that the temporary accumulation buffer is not int32 aligned.
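One way to test that hypothesis is to check the buffer pointer before the kernel runs. The sketch below assumes a byte workspace from which an int32 accumulation buffer is carved; `is_aligned`, `align_up`, `carve_accumulators`, and `demo_workspace` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if p is a multiple of `alignment` (a power of two). */
int is_aligned(const void* p, size_t alignment) {
  return ((uintptr_t) p & (uintptr_t) (alignment - 1)) == 0;
}

/* Rounds p up to the next multiple of `alignment` (a power of two). */
void* align_up(void* p, size_t alignment) {
  uintptr_t addr = (uintptr_t) p;
  addr = (addr + alignment - 1) & ~(uintptr_t) (alignment - 1);
  return (void*) addr;
}

/* Carves an int32-aligned accumulation buffer out of a byte workspace.
 * The caller must reserve up to sizeof(int32_t) - 1 spare bytes. */
int32_t* carve_accumulators(uint8_t* workspace) {
  return (int32_t*) align_up(workspace, sizeof(int32_t));
}

/* Demo storage; the union guarantees the bytes start int32-aligned. */
union {
  int32_t words[16];
  uint8_t bytes[64];
} demo_workspace;
```

Asserting `is_aligned(buffer, sizeof(int32_t))` on the multipass buffer would confirm or rule out the misalignment before looking elsewhere.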

ARMv7 (with NEON) can not support on Linux but only support ARMv7 (with NEON) on Android

armsimd32 is ARMv6 style simd - 4 bytes. It provides optimization on cpus without NEON. In bazel there is a section with the build options applied: ``` xnnpack_cc_library( name =...

ARMv7 (with NEON) can not support on Linux but only support ARMv7 (with NEON) on Android

There is an armv7 script for android. When I tried it with NDK 21 it had a build error against I8MM due to an old version of clang being used,...

How can I parallelize the execution of this benchmark? (https://github.com/google/XNNPACK/blob/master/bench/spmm-benchmark.h)

The end2end_bench shows spmm on arm using threads.

Xnnpack still builds with `+dotprod` and `+fp16` with `-DXNNPACK_ENABLE_ARM_DOTPROD=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_FP16_VECTOR=OFF`

The build system determines which kernels to build. The macros reflect what was enabled, and disabled kernels won't be tested or used. With bazel there are flags to control each instruction set:...