ComputeLibrary
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
Hi, I'm extremely interested in speeding up int8 `MatMul` inference with an Arm Compute Library kernel. My model is: ```mermaid graph TD; Input1["Input out: fp32"] Quantise1["NEQuantizationLayer out: signed int8"] Input2["Input...
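For context on what an int8 `MatMul` computes, here is a minimal scalar reference for asymmetric-quantized matrix multiplication (the arithmetic that ACL's low-precision GEMM kernels accelerate). This is a sketch, not ACL code: the function name and signature are illustrative, and it uses the standard affine quantization convention `real = scale * (q - zero_point)` with int32 accumulation followed by requantization.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Reference int8 matmul with asymmetric quantization (QASYMM8_SIGNED-style):
// real = scale * (q - zero_point). Products are accumulated in int32, then
// the accumulator is requantized to the destination scale/zero-point.
std::vector<int8_t> matmul_s8(const std::vector<int8_t>& a, int32_t a_zp, float a_scale,
                              const std::vector<int8_t>& b, int32_t b_zp, float b_scale,
                              int M, int K, int N, int32_t d_zp, float d_scale)
{
    std::vector<int8_t> dst(M * N);
    const float requant = (a_scale * b_scale) / d_scale; // combined rescale factor
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n)
        {
            int32_t acc = 0; // int32 accumulator avoids int8 overflow
            for (int k = 0; k < K; ++k)
                acc += (int32_t(a[m * K + k]) - a_zp) * (int32_t(b[k * N + n]) - b_zp);
            const int32_t q = int32_t(std::lround(acc * requant)) + d_zp;
            dst[m * N + n] = int8_t(std::clamp<int32_t>(q, -128, 127)); // saturate to int8
        }
    return dst;
}
```

A fast kernel performs the same computation but hoists the zero-point corrections out of the inner loop and vectorises the int8 dot products.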
**Output of 'strings libarm_compute.so | grep arm_compute_version':** arm_compute_version=v24.04 Build options: {'Werror': '1', 'build_dir': '//acl/build', 'debug': '0', 'neon': '1', 'opencl': '0', 'os': 'linux', 'openmp': '1', 'cppthreads': '0', 'arch': 'armv8.2-a', 'multi_isa': '1',...
With GCC 15, the build fails due to a missing `cstdint` include. GCC 15 removed some transitive includes from the standard headers, which causes the issue. See https://gcc.gnu.org/gcc-15/porting_to.html for more information....
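The failure mode and fix are easy to demonstrate: code that uses fixed-width integer types but relies on another standard header to pull in `<cstdint>` transitively stops compiling under GCC 15. A minimal sketch (the function is illustrative, not from the library):

```cpp
// Before GCC 15, headers like <string> often pulled in <cstdint>
// transitively, so uint8_t/int64_t "just worked". GCC 15 trimmed those
// transitive includes; the fix is to include <cstdint> explicitly in every
// translation unit that uses fixed-width types.
#include <cstdint>

uint8_t clamp_to_u8(int64_t v)
{
    if (v < 0)
        return 0;
    if (v > UINT8_MAX)
        return UINT8_MAX;
    return static_cast<uint8_t>(v);
}
```

The same explicit-include rule applies to `UINT8_MAX` and friends, which also live in `<cstdint>`.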
**Output of 'strings libarm_compute.so | grep arm_compute_version':** **Platform: Raspberry Pi 5** **Operating System: Raspberry Pi Bookworm OS** **GCC version:** g++ -v Using built-in specs. COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/12/lto-wrapper Target: arm-linux-gnueabihf Configured with:...
Hi, I am a bit of a beginner here and I need some help with this. **Output of 'strings libarm_compute.so | grep arm_compute_version':** **Platform:** Raspberry Pi 5 (ARM Cortex-A76) **Operating...
Hello, I am considering using the SVE instruction set to optimize GEMM operators. I found that although the repository has the relevant code, there is no example showing me how to...
Hello, I'm trying to make sense of the code, and stumbled onto [this piece of code](https://github.com/ARM-software/ComputeLibrary/blob/de7288cb71e6b9190f52e50a44ed68c309e4a041/src/cpu/kernels/CpuIm2ColKernel.cpp#L309): ```c++ _convolved_dims = scaled_dimensions(src->dimension(width_idx), dst->dimension(height_idx), _kernel_width, _kernel_height, _conv_info, _dilation); ``` Shouldn't it be `src->dimension(height_idx)`?...
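For reference, `scaled_dimensions` computes the standard convolution output-size formula per spatial axis, so both spatial arguments should describe the source tensor. A self-contained sketch of that formula (the helper below is illustrative, not the ACL function itself, and assumes symmetric padding and FLOOR rounding):

```cpp
#include <utility>

// Standard convolution output-size formula (FLOOR rounding) applied to each
// spatial axis:
//   out = (in + 2 * pad - dilated_kernel) / stride + 1
// where dilated_kernel = (kernel - 1) * dilation + 1.
std::pair<int, int> conv_out_dims(int in_w, int in_h, int kernel_w, int kernel_h,
                                  int stride_w, int stride_h, int pad_w, int pad_h,
                                  int dil_w = 1, int dil_h = 1)
{
    const int eff_kw = (kernel_w - 1) * dil_w + 1; // effective (dilated) kernel width
    const int eff_kh = (kernel_h - 1) * dil_h + 1; // effective (dilated) kernel height
    const int out_w  = (in_w + 2 * pad_w - eff_kw) / stride_w + 1;
    const int out_h  = (in_h + 2 * pad_h - eff_kh) / stride_h + 1;
    return {out_w, out_h};
}
```

Since the height term in this formula must be the input's spatial height, passing a `dst` dimension where an `src` dimension is expected would indeed be suspect, which is the point of the question above.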
Version: main, v24.08 **Platform: armv7a** **Operating System: Android** **Problem description:** ComputeLibrary won't build with neon=1 arch=armv7a os=android using Android NDK r27b (LTS, the latest at the time of writing) or r26d (previous LTS). Build...
I can't find the mobilenet_v2_1.0_224.tgz to get the network description, weights etc.
Is it possible to support F32 dequantized output for `QASYMM8` / `QASYMM8_SIGNED` inputs in NEConvolutionLayer / NEGEMMConvolutionLayer?