llama.cpp
GGML to GGUF FAIL: Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)
I'm trying to convert this GGML model to GGUF, but I get the error below. Thank you.
```
python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf"
INFO:ggml-to-gguf:* Using config: Namespace(input=WindowsPath('D:/nectec/model/llama-2-13b-chat.ggmlv3.q2_K.bin'), output=WindowsPath('D:/nectec/model/llama-2-13b-chat.gguf'), name=None, desc=None, gqa=8, eps='0', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm,hfft', verbose=False)
WARNING:ggml-to-gguf:=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
INFO:ggml-to-gguf:* Scanning GGML input file
INFO:ggml-to-gguf:* File format: GGJTv3 with ftype MOSTLY_Q2_K
INFO:ggml-to-gguf:* GGML model hyperparameters: <Hyperparameters: n_vocab=32000, n_embd=5120, n_mult=256, n_head=40, n_layer=40, n_rot=128, n_ff=13824, ftype=MOSTLY_Q2_K>
WARNING:ggml-to-gguf:=== WARNING === Special tokens may not be converted correctly. Use --model-metadata-dir if possible === WARNING ===
INFO:ggml-to-gguf:- Guessed n_kv_head = 5 based on GQA 8
INFO:ggml-to-gguf:* Preparing to save GGUF file
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:ggml-to-gguf:* Adding model parameters and KV items
INFO:ggml-to-gguf:* Adding 32000 vocab item(s)
INFO:ggml-to-gguf:* Adding 363 tensor(s)
Traceback (most recent call last):
File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 450, in