Raspberry Pi 3 (Raspbian) compile fails
```
uname -a
Linux raspberrypi 5.15.84-v7+ #1613 SMP Thu Jan 5 11:59:48 GMT 2023 armv7l GNU/Linux

lscpu
Architecture:                    armv7l
Byte Order:                      Little Endian
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
Vendor ID:                       ARM
Model:                           4
Model name:                      Cortex-A53
Stepping:                        r0p4
CPU max MHz:                     1400.0000
CPU min MHz:                     600.0000
BogoMIPS:                        38.40
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
```
```
make
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  armv7l
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread
I LDFLAGS:
I CC:       cc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110
I CXX:      g++ (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110
cc -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations -c ggml.c -o ggml.o
ggml.c: In function ‘quantize_row_q8_1’:
ggml.c:1535:36: warning: implicit declaration of function ‘vcvtnq_s32_f32’; did you mean ‘vcvtq_s32_f32’? [-Wimplicit-function-declaration]
 1535 |         const int32x4_t vi = vcvtnq_s32_f32(v);
ggml.c:1535:36: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int’
ggml.c:1548:36: error: incompatible types when initializing type ‘int32x4_t’ using type ‘int’
 1548 |         const int32x4_t vi = vcvtnq_s32_f32(v);
ggml.c: In function ‘ggml_vec_dot_q4_0_q8_0’:
ggml.c:2756:34: warning: implicit declaration of function ‘vuzp1q_s8’; did you mean ‘vuzpq_s8’? [-Wimplicit-function-declaration]
 2756 |         const int8x16_t v1_0ls = vuzp1q_s8(v1_0l, v1_0h);
ggml.c:2756:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2757:34: warning: implicit declaration of function ‘vuzp2q_s8’; did you mean ‘vuzpq_s8’? [-Wimplicit-function-declaration]
 2757 |         const int8x16_t v1_0hs = vuzp2q_s8(v1_0l, v1_0h);
ggml.c:2757:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2758:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 2758 |         const int8x16_t v1_1ls = vuzp1q_s8(v1_1l, v1_1h);
ggml.c:2759:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 2759 |         const int8x16_t v1_1hs = vuzp2q_s8(v1_1l, v1_1h);
ggml.c: In function ‘ggml_vec_dot_q4_1_q8_1’:
ggml.c:2917:34: warning: implicit declaration of function ‘vzip1q_s8’; did you mean ‘vzip1_s8’? [-Wimplicit-function-declaration]
 2917 |         const int8x16_t v0_0lz = vzip1q_s8(v0_0l, v0_0h);
ggml.c:2917:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2918:34: warning: implicit declaration of function ‘vzip2q_s8’; did you mean ‘vzip2_s8’? [-Wimplicit-function-declaration]
 2918 |         const int8x16_t v0_0hz = vzip2q_s8(v0_0l, v0_0h);
ggml.c:2918:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
ggml.c:2919:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 2919 |         const int8x16_t v0_1lz = vzip1q_s8(v0_1l, v0_1h);
ggml.c:2920:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 2920 |         const int8x16_t v0_1hz = vzip2q_s8(v0_1l, v0_1h);
ggml.c: In function ‘ggml_vec_dot_q4_2_q8_0’:
ggml.c:3057:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3057 |         const int8x16_t v0_0lz = vzip1q_s8(v0_0ls, v0_0hs);
ggml.c:3058:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3058 |         const int8x16_t v0_0hz = vzip2q_s8(v0_0ls, v0_0hs);
ggml.c:3059:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3059 |         const int8x16_t v0_1lz = vzip1q_s8(v0_1ls, v0_1hs);
ggml.c:3060:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3060 |         const int8x16_t v0_1hz = vzip2q_s8(v0_1ls, v0_1hs);
ggml.c: In function ‘ggml_vec_dot_q4_3_q8_1’:
ggml.c:3205:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3205 |         const int8x16_t v0_0lz = vzip1q_s8(v0_0l, v0_0h);
ggml.c:3206:34: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3206 |         const int8x16_t v0_0hz = vzip2q_s8(v0_0l, v0_0h);
ggml.c: In function ‘ggml_vec_dot_q5_0_q8_0’:
ggml.c:3343:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3343 |         const int8x16_t v0lz = vzip1q_s8(v0l, v0h);
ggml.c:3344:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3344 |         const int8x16_t v0hz = vzip2q_s8(v0l, v0h);
ggml.c: In function ‘ggml_vec_dot_q5_1_q8_1’:
ggml.c:3474:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3474 |         const int8x16_t v0lz = vzip1q_s8(v0l, v0h);
ggml.c:3475:32: error: incompatible types when initializing type ‘int8x16_t’ using type ‘int’
 3475 |         const int8x16_t v0hz = vzip2q_s8(v0l, v0h);
make: *** [Makefile:161: ggml.o] Error 1
```
Looks like GCC is missing implementations for various ARM-specific SIMD intrinsics.
See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95399
I found that they can be polyfilled the way XNNPACK does it: https://github.com/google/XNNPACK/issues/1924#issuecomment-930139286
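For reference, the intrinsics the compiler rejects here are AArch64-only in GCC 10's armv7 headers, but their semantics are simple. Below is a hypothetical plain-C sketch (not XNNPACK's actual polyfill, which stays in NEON registers) of what `vzip1q_s8` and `vuzp1q_s8` compute on two 16-byte vectors; the `_ref` names are made up for illustration:

```c
#include <stdint.h>

/* Scalar reference for vzip1q_s8(a, b): interleave the low 8 lanes
 * of a and b, producing { a0, b0, a1, b1, ..., a7, b7 }. */
static void zip1q_s8_ref(const int8_t a[16], const int8_t b[16], int8_t out[16]) {
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Scalar reference for vuzp1q_s8(a, b): gather the even-indexed lanes
 * of a, then the even-indexed lanes of b. */
static void uzp1q_s8_ref(const int8_t a[16], const int8_t b[16], int8_t out[16]) {
    for (int i = 0; i < 8; i++) {
        out[i]     = a[2 * i];
        out[8 + i] = b[2 * i];
    }
}
```

`vzip2q_s8`/`vuzp2q_s8` are the analogous high-half variants, and `vcvtnq_s32_f32` is a round-to-nearest-even float-to-int conversion, so all of them can be emulated with armv7-era intrinsics or scalar code.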
As for you @jtang613: you can try compiling with clang instead; it should have these functions defined.
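For example (a hypothetical invocation; `make` lets you override the compiler variables on the command line, which takes precedence over the Makefile's defaults):

```shell
# assumes clang is installed, e.g. via: sudo apt install clang
make clean
make CC=clang CXX=clang++
```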
Can you try again after a `git pull` and post the output? There are new commits https://github.com/ggerganov/llama.cpp/commit/e8c051611abfc9a7f37fd4bba48217180893bd68 and https://github.com/ggerganov/llama.cpp/commit/c3ca7a5f0546c561eb278be3f2fe335795679e01 which fix many of the issues you mentioned.
Thanks @prusnak, the latest merge builds using `make`.
FWIW, cmake still complains (see below), but as long as one method works, I'm happy.
```
~/llama.cpp/build# cmake ..
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: armv7l
-- ARM detected
-- Configuring done
-- Generating done
-- Build files have been written to: /root/llama.cpp/build
root@raspberrypi:~/llama.cpp/build# cmake --build . --config Release
Scanning dependencies of target ggml
[  3%] Building C object CMakeFiles/ggml.dir/ggml.c.o
/root/llama.cpp/ggml.c:191:10: fatal error: immintrin.h: No such file or directory
  191 | #include <immintrin.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
gmake[2]: *** [CMakeFiles/ggml.dir/build.make:82: CMakeFiles/ggml.dir/ggml.c.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:797: CMakeFiles/ggml.dir/all] Error 2
gmake: *** [Makefile:114: all] Error 2
```
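For context on the error: `immintrin.h` is an x86-only header (SSE/AVX); the ARM equivalent is `arm_neon.h`. Because the CMake build passed no ARM-specific flags, the preprocessor apparently selected the wrong SIMD path. A minimal hypothetical sketch of this kind of architecture guard (not ggml.c's exact conditions) looks like:

```c
/* Pick the SIMD header by target architecture: immintrin.h exists only
 * on x86, so an armv7 build must take the NEON (or scalar) path. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
    #include <immintrin.h>   /* x86 SSE/AVX intrinsics */
    #define SIMD_TARGET "x86"
#elif defined(__ARM_NEON)
    #include <arm_neon.h>    /* ARM NEON intrinsics; on armv7 this needs
                                -mfpu=neon... to be defined at all */
    #define SIMD_TARGET "arm-neon"
#else
    #define SIMD_TARGET "scalar"  /* no SIMD flags detected: plain C fallback */
#endif

static const char *simd_target(void) { return SIMD_TARGET; }
```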
The CMake file is not complete for 32-bit ARM:

```cmake
if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "arm" OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
    message(STATUS "ARM detected")
    if (MSVC)
        # TODO: arm msvc?
    else()
        if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
            add_compile_options(-mcpu=native)
        endif()
        # TODO: armv6,7,8 version specific flags
    endif()
endif()
```
But it's possible to add the same flags manually at configure time:

```shell
cmake . -DCMAKE_C_FLAGS="-mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations"
```
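For illustration, the 32-bit branch of `CMakeLists.txt` could mirror the Makefile's armv7 flags along these lines; this is only a sketch of one possible change, not necessarily what the PR below does:

```cmake
if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
    add_compile_options(-mcpu=native)
elseif (${CMAKE_SYSTEM_PROCESSOR} MATCHES "armv7")
    # 32-bit Raspberry Pi OS reports armv7l; reuse the Makefile's flags
    add_compile_options(-mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations)
endif()
```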
Added fix in https://github.com/ggerganov/llama.cpp/pull/1251
@jtang613, could you please test whether that PR fixes the issue for you when building with cmake?
Please, can anyone confirm whether the build works fine on a Raspberry Pi 3 under 32-bit Raspbian?
Note: I want to know whether I can run AI on low-cost hardware (using a lot of Raspberry Pis in a cluster setup)...