fbarchard

83 comments by fbarchard

This could be a couple of things. i8mm is relatively new and your compiler version may be too old. With cmake the compiler version is checked and i8mm is disabled if it...

Or try not specifying 8.2: "-march=armv8-a+dotprod+i8mm". i8mm is an option of armv8.5, and gcc may be picky and not allow you to specify it as an option for armv8.2 there...
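A quick way to see whether the compiler actually accepted the i8mm option is to test the ACLE feature macro `__ARM_FEATURE_MATMUL_INT8`, which compilers define when the 8-bit matrix multiply extension is enabled for the translation unit. This is only a sketch for diagnosing a build; the helper name `built_with_i8mm` is made up for illustration and is not part of XNNPACK.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether this file was compiled with
 * i8mm enabled (e.g. via a -march=...+i8mm option).
 * __ARM_FEATURE_MATMUL_INT8 is the ACLE macro for that extension. */
bool built_with_i8mm(void) {
#if defined(__ARM_FEATURE_MATMUL_INT8)
  return true;
#else
  return false;
#endif
}
```

If this returns false in a build that was supposed to enable i8mm, the flag was probably dropped or rejected somewhere in the toolchain. It says nothing about whether the CPU the code runs on supports the instructions.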

A build produces a collection of libraries:
libaarch64_prod_microkernels_private.a
libaarch64_prod_microkernels_private.pic.a
libfp16arith_prod_microkernels_private.a
libfp16arith_prod_microkernels_private.pic.a
libneon_aarch64_prod_microkernels_private.a
libneon_aarch64_prod_microkernels_private.pic.a
libneonbf16_prod_microkernels_private.a
libneonbf16_prod_microkernels_private.pic.a
libneondot_aarch64_prod_microkernels_private.a
libneondot_aarch64_prod_microkernels_private.pic.a
libneondotfp16arith_prod_microkernels_private.a
libneondotfp16arith_prod_microkernels_private.pic.a
libneondot_prod_microkernels_private.a
libneondot_prod_microkernels_private.pic.a
libneonfma_aarch64_prod_microkernels_private.a
libneonfma_aarch64_prod_microkernels_private.pic.a
libneonfma_prod_microkernels_private.a
libneonfma_prod_microkernels_private.pic.a
libneonfp16arith_aarch64_prod_microkernels_private.a
libneonfp16arith_aarch64_prod_microkernels_private.pic.a
libneonfp16arith_prod_microkernels_private.a
libneonfp16arith_prod_microkernels_private.pic.a
libneonfp16_prod_microkernels_private.a...

Our expectation at the moment is that a .S file is handled by clang or gcc, which internally will use gas. The code is AT&T syntax and has a few directives...

Also note that in Armv9 with SVE, i8mm is optional and technically needs to be detected. i8mm requires a fairly new Arm CPU. It's in the Pixel 8, which...
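As a rough illustration of what runtime detection can look like, here is a minimal sketch for Linux/Android on AArch64, assuming the kernel reports i8mm through bit 13 of `AT_HWCAP2` (the value of `HWCAP2_I8MM` in the arm64 kernel headers). The function names `hwcap2_has_i8mm` and `cpu_has_i8mm` are hypothetical, and this is not necessarily how XNNPACK or cpuinfo performs the check.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h> /* getauxval, AT_HWCAP2 */
#endif

/* Assumption: matches HWCAP2_I8MM from the arm64 Linux headers. */
#define I8MM_HWCAP2_BIT (1UL << 13)

/* Pure helper so the bit test can be exercised on any host. */
bool hwcap2_has_i8mm(unsigned long hwcap2) {
  return (hwcap2 & I8MM_HWCAP2_BIT) != 0;
}

/* Returns true only when the running CPU and kernel report i8mm.
 * On other platforms this conservatively returns false. */
bool cpu_has_i8mm(void) {
#if defined(__linux__) && defined(__aarch64__)
  return hwcap2_has_i8mm(getauxval(AT_HWCAP2));
#else
  return false;
#endif
}
```

A caller would pick the i8mm kernel only when `cpu_has_i8mm()` is true and fall back to a dotprod or plain NEON kernel otherwise.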

https://github.com/pytorch/cpuinfo/issues/238 is for nnpack? Did you mean xnnpack? The change looks okay. I've seen similar issues on Arm where cores are unavailable, but for different reasons on low-end devices...

This PR caused crashes on Android and Linux platforms: https://github.com/pytorch/cpuinfo/issues/339. So I've prepared a PR to roll back to the previous version: https://github.com/pytorch/cpuinfo/pull/340

A quick test of armv7 builds okay locally for me. I did a clang build with bazel. Are you using cmake or a different compiler? int8x4 and armsimd is a...

In general, a kernel with 'u4v' means 'm4' for the source. Kernels such as float binary ops can implement all 4 variations: m1, m2, m4, m8. In the src/configs/gemm-config.c...

Thanks for the report. Is it possible to make a godbolt.org reproduction? Any suggestion on a fix? The params->scalar.min is fp16, but in avx2 we want to use vpbroadcastw from...