llama.cpp
ggml : Load data into int8x16x4_t using vld4q_s8 on arm64
Checked out the latest 5c64a09@master; compilation reports:
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c:1262:43: error: invalid initializer
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c:1263:43: error: invalid initializer
const int8x16x4_t q8bytes_2 = vld1q_s8_x4(q8); q8 += 64;
^~~~~~~~~~~
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c: In function ‘ggml_vec_dot_q5_K_q8_K’:
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c:1791:41: error: invalid initializer
const int8x16x4_t q8bytes = vld1q_s8_x4(q8); q8 += 64;
^~~~~~~~~~~
Everything is OK on x86_64, though, so maybe the arm64 toolchain does not support this intrinsic?
Using vld4q_s8 instead of vld1q_s8_x4 seems to work on both x86_64 and arm64.
However, not all tests passed, due to issue #1736:
$ make test
Running tests...
Test project /home/wesley/Work/projects/chatgpt/llama.cpp/build
Start 1: test-quantize-fns
1/4 Test #1: test-quantize-fns ................ Passed 0.01 sec
Start 2: test-quantize-perf
2/4 Test #2: test-quantize-perf ............... Passed 0.01 sec
Start 3: test-sampling
3/4 Test #3: test-sampling ....................***Exception: Child aborted  0.05 sec
Start 4: test-tokenizer-0
4/4 Test #4: test-tokenizer-0 ................. Passed 0.02 sec
75% tests passed, 1 tests failed out of 4
Total Test time (real) = 0.09 sec
The following tests FAILED:
3 - test-sampling (Child aborted)
Errors while running CTest
make: *** [Makefile:84: test] Error 8