
"Failed to quantize model" with every model but the 7B one

Open serovar opened this issue 1 year ago • 7 comments

With the previous version I successfully downloaded the 65B model and quantized it. With the latest version I get this error with every model but the 7B one:

bash-3.2$ ./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2
llama_model_quantize: loading model from './models/13B/ggml-model-f16.bin.1'
llama_model_quantize: failed to open './models/13B/ggml-model-f16.bin.1' for reading
main: failed to quantize model from './models/13B/ggml-model-f16.bin.1'

In the models folder the only file with this kind of name is ggml-model-f16.bin, without the numbers at the end, and that one quantizes without errors. So the issue seems to be that the script does not create the successive ggml-model-f16.bin.1, .2, .3 files and so on.
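If that diagnosis is right, listing the f16 files before quantizing confirms it. A minimal sketch, assuming llama.cpp-style part naming; the demo directory and dummy file below are illustrative, not real model output:

```shell
# Sketch: check which f16 part files the convert step actually produced.
# A demo directory stands in for the real ./models/13B.
MODEL_DIR="/tmp/dalai-demo/13B"
mkdir -p "$MODEL_DIR"
touch "$MODEL_DIR/ggml-model-f16.bin"  # only the un-numbered file, as reported
ls "$MODEL_DIR"/ggml-model-f16.bin* 2>/dev/null \
  || echo "no f16 files found in $MODEL_DIR"
```

If only the un-numbered ggml-model-f16.bin shows up for a multi-part model, quantize will fail on the .1, .2, … names exactly as in the error above.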

I am using a MacBook Pro with an M1 Pro and 16 GB of RAM, running macOS 13.2.1.

serovar avatar Mar 14 '23 10:03 serovar

I am having the same issue on Windows 10: Core i5-8600K, 32 GB of RAM, GTX 3080.

PladsElsker avatar Mar 17 '23 01:03 PladsElsker

similar issue

MarcusSi2023 avatar Mar 21 '23 20:03 MarcusSi2023

I'm having this issue too, but in my case it's the 7B model that won't quantize.

nathanielastudillo avatar Mar 22 '23 03:03 nathanielastudillo

Find the quantize executable and, for simplicity, copy it to the folder where ggml-model-f16.bin is, then run:

./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2

or on Windows:

quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2

It takes a few minutes.
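For multi-part models that workaround would need to run once per part file. A minimal sketch of that loop, assuming llama.cpp-style part naming; the dummy files stand in for real model parts, and the quantize commands are only printed here, not executed:

```shell
# Sketch: build one quantize command per f16 part file.
# Dummy files below are placeholders so the loop has input to iterate over.
DIR="/tmp/quantize-demo"
mkdir -p "$DIR"
touch "$DIR/ggml-model-f16.bin" "$DIR/ggml-model-f16.bin.1"
for src in "$DIR"/ggml-model-f16.bin*; do
  dst=$(printf '%s\n' "$src" | sed 's/f16/q4_0/')  # f16 -> q4_0 in the name
  echo "./quantize $src $dst 2"
done
```

Dropping the echo would run the real commands, provided the part files actually exist, which is exactly what this issue says they don't.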

fritol avatar Mar 31 '23 01:03 fritol

In my case: Ryzen 9, Ubuntu 22.10. Alpaca 7B runs fine; llama 7B is not loading yet.

[...]
/root/dalai/venv/bin/python convert-pth-to-ggml.py models/7B/ 1
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': -1}
Namespace(dir_model='models/7B/', ftype=1, vocab_only=0)
n_parts = 1
Processing part 0
[...]
  • It seems that with just the 7B models from Alpaca/llama, no ggml-model-f16.bin file was produced. The only file with a similar name is ggml-vocab.bin (at the same level as the 7B folder in /llama/models, 432.6 KB). I copied it to the quantize folder and ran sudo ./quantize ggml-vocab.bin ggml-model-q4_0.bin 2, but the process failed (same result after renaming it to ***-f16.bin):
llama_model_quantize: loading model from 'ggml-vocab.bin'
llama_model_quantize: invalid model file 'ggml-vocab.bin' (bad magic)
main: failed to quantize model from 'ggml-vocab.bin'
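The "bad magic" line means quantize read the file header and rejected it: ggml-vocab.bin is a vocabulary-only file, not an f16 model, so renaming it cannot make it quantizable. A hedged sketch for peeking at a file's first four bytes, assuming the old ggml magic 0x67676d6c (which a little-endian machine stores on disk as bytes 6c 6d 67 67); the demo file stands in for a real model:

```shell
# Sketch: hex-dump the first 4 bytes of a model file to inspect its magic.
# The demo file is written with the assumed ggml magic bytes.
printf '\x6c\x6d\x67\x67' > /tmp/magic-demo.bin
od -An -tx1 -N4 /tmp/magic-demo.bin  # bytes of the assumed magic
```

If the dump of a real model file doesn't start with those bytes, quantize will report "invalid model file … (bad magic)" no matter what the file is named.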

Any other workaround is welcome, thanks ;)

TERIIHUB avatar Apr 14 '23 01:04 TERIIHUB

same issue on Windows 10 (Docker)

M0Rph3U56031769 avatar May 02 '23 11:05 M0Rph3U56031769

Find the quantize executable and, for simplicity, copy it to the folder where ggml-model-f16.bin is, then run:

./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2

or on Windows:

quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2

It takes a few minutes.

Listen to this guy!

githubib avatar Jul 24 '23 00:07 githubib