"Failed to quantize model" with every model but the 7B one
With the previous version I successfully downloaded the 65B model and quantized it. With the latest version I get this error with every model but the 7B one:
bash-3.2$ ./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2
llama_model_quantize: loading model from './models/13B/ggml-model-f16.bin.1'
llama_model_quantize: failed to open './models/13B/ggml-model-f16.bin.1' for reading
main: failed to quantize model from './models/13B/ggml-model-f16.bin.1'
In the models folder, the only file with this kind of name is ggml-model-f16.bin, without the numeric suffix, and that one quantizes without errors. So the issue seems to be that the script does not create the successive ggml-model-f16.bin.1, .2, .3 files and so on.
I am using a MacBook Pro with an M1 Pro and 16 GB of RAM, on macOS 13.2.1.
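For reference, the 13B weights ship as two consolidated .pth parts, and the converter writes one f16 file per part, so a complete conversion should leave both files behind. A quick check, as a sketch (~/dalai/llama is an assumed install path; adjust to yours):

ls -lh ~/dalai/llama/models/13B/
# expected alongside the .pth parts after a full conversion:
#   ggml-model-f16.bin      (part 0)
#   ggml-model-f16.bin.1    (part 1)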
I am having the same issue on Windows 10: Core i5-8600K, 32 GB of RAM, RTX 3080.
similar issue
I'm having this issue too, but in my case even the 7B model won't quantize.
Find the quantize executable and, for simplicity, copy it to the folder where ggml-model-f16.bin is, then run:
./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2
or, on Windows:
quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2
It takes a few minutes.
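On macOS/Linux, the same workaround extends to the multi-part models (13B and up) with a small loop; a minimal sketch, assuming the f16 parts sit under models/13B/ next to the quantize binary:

for f in models/13B/ggml-model-f16.bin*; do
  # map ggml-model-f16.bin[.N] onto the matching ggml-model-q4_0.bin[.N]
  ./quantize "$f" "${f/f16/q4_0}" 2
done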
In my case (Ryzen 9, Ubuntu 22.10), alpaca 7B is running well and llama 7B is not loading... yet.
[...]
/root/dalai/venv/bin/python convert-pth-to-ggml.py models/7B/ 1
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': -1}
Namespace(dir_model='models/7B/', ftype=1, vocab_only=0)
n_parts = 1
Processing part 0
[...]
It seems that, using just the 7B models from Alpaca/llama, no ggml-model-f16.bin file was produced.
The only file found with a similar name is ggml-vocab.bin, at the same level as the 7B folder in /llama/models (432.6 KB).
I've copied it to the quantize folder and tried to run:
sudo ./quantize ggml-vocab.bin ggml-model-q4_0.bin 2
but the process failed. (Same result after renaming it to "***-f16.bin".)
llama_model_quantize: loading model from 'ggml-vocab.bin'
llama_model_quantize: invalid model file 'ggml-vocab.bin' (bad magic)
main: failed to quantize model from 'ggml-vocab.bin'
Any other workaround is welcome, thanks ;)
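The "bad magic" part is expected: quantize checks the first four bytes of the input file, and ggml-vocab.bin is a vocabulary-only file rather than a model, so renaming it cannot help. It's worth confirming whether the converter ever writes the model file at all; a sketch, reusing the paths from your log (xxd assumed available):

~/dalai/venv/bin/python convert-pth-to-ggml.py models/7B/ 1
ls -lh models/7B/                        # a successful run leaves ggml-model-f16.bin here
xxd -l 4 models/7B/ggml-model-f16.bin    # inspect the 4-byte magic that quantize checks

If the converter exits without producing ggml-model-f16.bin in models/7B/, the failure is in the conversion step, not in quantize.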
Same issue on Windows 10 (Docker).
Find the quantize executable and, for simplicity, copy it to the folder where ggml-model-f16.bin is, then run ./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2 (or quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2 on Windows). It takes a few minutes.
Listen to this guy!