llama.cpp
ggml : Load data into int8x16x4_t using vld4q_s8 on arm64
Checked out the latest 5c64a09@master; compilation reports:
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c:1262:43: error: invalid initializer
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c:1263:43: error: invalid initializer
const int8x16x4_t q8bytes_2 = vld1q_s8_x4(q8); q8 += 64;
^~~~~~~~~~~
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c: In function ‘ggml_vec_dot_q5_K_q8_K’:
/home/wesley/Work/projects/chatgpt/llama.cpp/k_quants.c:1791:41: error: invalid initializer
const int8x16x4_t q8bytes = vld1q_s8_x4(q8); q8 += 64;
^~~~~~~~~~~
Everything is OK on x86_64, though, so maybe the arm64 toolchain does not support this intrinsic?
Using vld4q_s8 instead of vld1q_s8_x4 seems to work on both x86_64 and arm64.
However, not all tests passed, due to issue #1736:
$ make test
Running tests...
Test project /home/wesley/Work/projects/chatgpt/llama.cpp/build
Start 1: test-quantize-fns
1/4 Test #1: test-quantize-fns ................ Passed 0.01 sec
Start 2: test-quantize-perf
2/4 Test #2: test-quantize-perf ............... Passed 0.01 sec
Start 3: test-sampling
3/4 Test #3: test-sampling ....................***Exception: Child aborted  0.05 sec
Start 4: test-tokenizer-0
4/4 Test #4: test-tokenizer-0 ................. Passed 0.02 sec
75% tests passed, 1 tests failed out of 4
Total Test time (real) = 0.09 sec
The following tests FAILED:
3 - test-sampling (Child aborted)
Errors while running CTest
make: *** [Makefile:84: test] Error 8