BrickBee
> But when attempting to run an imatrix calculation

Same for me with some DeepSeek-based models, which Gorilla is based on. Inference for FP16 and Q8 works, but imatrix calculation...
> error loading model: llama.cpp: tensor 'layers.0.feed_forward.w1.weight' has wrong shape; expected 3200 x 8704, got 3200 x 8640

Same for me. It is also broken in the original commit (ffb06a345e3a9e30d39aaa5b46a23201a74be6de),...
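For anyone wondering where the 8704 in that error comes from: a likely explanation (a sketch, not a confirmed trace of the loader) is that older llama.cpp code did not read the feed-forward size from the file but inferred it from `n_embd`, rounding 2/3 · 4 · n_embd up to a multiple of 256 as in the original LLaMA sizes. OpenLLaMA 3B actually uses n_ff = 8640, so the inferred value never matches. The helper name below is mine, not from llama.cpp:

```python
def inferred_n_ff(n_embd: int, multiple_of: int = 256) -> int:
    """Reconstruct the feed-forward size the way older llama.cpp
    loaders guessed it: 2/3 of 4*n_embd, rounded up to a multiple
    of `multiple_of` (256 for the original LLaMA family)."""
    raw = 2 * (4 * n_embd) // 3
    return ((raw + multiple_of - 1) // multiple_of) * multiple_of

print(inferred_n_ff(3200))  # -> 8704, but the 3b checkpoint ships 3200 x 8640 tensors
```

That mismatch (8704 expected vs. 8640 on disk) is exactly the shape error quoted above, which is why the fix is to take n_ff from the model config instead of recomputing it.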
I can confirm that the quantized files you've linked work fine with the release version you've linked. My quantized versions that I've created at the time of...
Conversion and fp16 inference work after applying this [diff](https://huggingface.co/SlyEcho/open_llama_3b_ggml/blob/main/convert.py.diff). This was, by the way, the original point of this issue: the 3b model can't be used with the current code...