neural-speed

Running Q4_K_M gguf models: unrecognized tensor type 12

Open shg8 opened this issue 1 year ago • 1 comment

Welcome to use the llama on the ITREX! 
AVX:1 AVX2:1 AVX512F:0 AVX_VNNI:1 AVX512_VNNI:0 AMX_INT8:0 AMX_BF16:0 AVX512_BF16:0 AVX512_FP16:0
Loading the bin file with GGUF format...
main: seed  = 1712361979
model.cpp: loading model from /models/llama-2-7b.Q4_K_S.gguf
error loading model: unrecognized tensor type 12

model_init_from_file: failed to load model

I got this error when trying to load the Q4_K_M and Q4_K_S quantized models for Llama-2-7B-GGUF. I would appreciate it if support could be added.

shg8 avatar Apr 06 '24 00:04 shg8

@shg8 Thanks for using Neural Speed.

We don't currently support Qx_K_M and Qx_K_S. Sorry about that. We will discuss and evaluate this task.

Thanks again.

Zhenzhong1 avatar Apr 15 '24 08:04 Zhenzhong1
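
For readers who hit the same message: the number in "unrecognized tensor type 12" appears to be a ggml tensor type ID. The sketch below decodes such IDs using the `ggml_type` enum values as they appear in upstream ggml's `ggml.h`, to the best of my knowledge. It is an assumption that Neural Speed uses the same numbering, and the table and the helper `describe_tensor_type` are hypothetical names written for this illustration, not part of Neural Speed.

```python
# Assumed mapping, taken from upstream ggml's ggml_type enum (not from Neural Speed).
# IDs 4 and 5 are unused upstream, so they are left out here.
GGML_TYPE_NAMES = {
    0: "F32", 1: "F16",
    2: "Q4_0", 3: "Q4_1",
    6: "Q5_0", 7: "Q5_1",
    8: "Q8_0", 9: "Q8_1",
    10: "Q2_K", 11: "Q3_K", 12: "Q4_K",
    13: "Q5_K", 14: "Q6_K", 15: "Q8_K",
}


def describe_tensor_type(type_id: int) -> str:
    """Hypothetical helper: turn a numeric tensor type ID into a readable label."""
    name = GGML_TYPE_NAMES.get(type_id)
    if name is None:
        return f"type {type_id}: not in this table"
    # K-quant formats end in "_K"; the reply above says these are not supported yet.
    family = "k-quant" if name.endswith("_K") else "non-k"
    return f"type {type_id}: {name} ({family})"


if __name__ == "__main__":
    print(describe_tensor_type(12))
```

Under that assumption, type 12 corresponds to Q4_K, the block format used for most tensors in Q4_K_S and Q4_K_M files, which is consistent with the maintainer's reply that the K-quant variants are not supported.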