BrickBee
> But when attempting to run an imatrix calculation

Same for me with some DeepSeek-based models, which Gorilla is based on. Inference for FP16 and Q8 works, but imatrix calculation...
> error loading model: llama.cpp: tensor 'layers.0.feed_forward.w1.weight' has wrong shape; expected 3200 x 8704, got 3200 x 8640

Same for me. It is also broken in the original commit (ffb06a345e3a9e30d39aaa5b46a23201a74be6de),...
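For anyone wondering where the 8704 in that error comes from: a likely explanation (a sketch, not a confirmed trace of the loader) is that older llama.cpp code did not read the feed-forward size from the file but inferred it from `n_embd`, rounding 2/3 · 4 · n_embd up to a multiple of 256 as in the original LLaMA sizes. OpenLLaMA 3B actually uses n_ff = 8640, so the inferred value never matches. The helper name below is mine, not from llama.cpp:

```python
def inferred_n_ff(n_embd: int, multiple_of: int = 256) -> int:
    """Reconstruct the feed-forward size the way older llama.cpp
    loaders guessed it: 2/3 of 4*n_embd, rounded up to a multiple
    of `multiple_of` (256 for the original LLaMA family)."""
    raw = 2 * (4 * n_embd) // 3
    return ((raw + multiple_of - 1) // multiple_of) * multiple_of

print(inferred_n_ff(3200))  # -> 8704, but the 3b checkpoint ships 3200 x 8640 tensors
```

That mismatch (8704 expected vs. 8640 on disk) is exactly the shape error quoted above, which is why the fix is to take n_ff from the model config instead of recomputing it.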
I can confirm that the quantized files you've linked work fine with the release version you've linked. My quantized versions that I've created at the time of...
Conversion and fp16 inference work after applying this [diff](https://huggingface.co/SlyEcho/open_llama_3b_ggml/blob/main/convert.py.diff). This was, by the way, the original point of this issue: the 3b model can't be used with the current code...