llama.cpp
accuracy: Q4 matrix multiply error is very bad for small K
In benchmark/benchmark-q4_0-matmult.c, set sizey = sizez = N and sizex = K.
For K=128, N=2, the deviation is: expected 1020.00, got 1280.00
For K=128, N=32, the deviation is: expected 262144.00, got 508160.03
For K=64, N=32, the deviation is: expected 131072.00, got 476928.03
For K=32, N=32, the deviation is: expected 65536.00, got 418560.03
This is rather worrying, although it does not practically affect any known use-case.
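To make the trend easier to see, here is a minimal sketch that turns the four expected/got pairs quoted above into relative errors. It is not part of the benchmark: the `q4_run` struct and the `relative_error` and `print_relative_errors` helpers are hypothetical names introduced only for this illustration, and the numbers are copied verbatim from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One reported run: sizex = K, sizey = sizez = N (hypothetical struct). */
struct q4_run {
    int k;
    int n;
    double expected; /* "expected" value quoted in the report */
    double got;      /* "got" value quoted in the report */
};

/* Numbers copied verbatim from the report above. */
static const struct q4_run q4_runs[] = {
    {128,  2,   1020.00,   1280.00},
    {128, 32, 262144.00, 508160.03},
    { 64, 32, 131072.00, 476928.03},
    { 32, 32,  65536.00, 418560.03},
};

/* Relative error |got - expected| / expected.
 * Hypothetical helper, not a function from the benchmark. */
double relative_error(double expected, double got)
{
    assert(expected != 0.0);
    double diff = got - expected;
    if (diff < 0.0) {
        diff = -diff;
    }
    return diff / expected;
}

/* Print one line per reported run. */
void print_relative_errors(void)
{
    size_t count = sizeof(q4_runs) / sizeof(q4_runs[0]);
    for (size_t i = 0; i < count; i++) {
        const struct q4_run *r = &q4_runs[i];
        printf("K=%d N=%d relative error %.2f\n",
               r->k, r->n, relative_error(r->expected, r->got));
    }
}
```

Calling `print_relative_errors()` from a `main` function prints one line per configuration. With N=32 fixed, the relative error rises from roughly 0.94 at K=128 to roughly 2.64 at K=64 and roughly 5.39 at K=32, which is the small-K degradation described in the title.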