alpaca.cpp
How do I generate "ggml-alpaca-7b-q4.bin" from the original LLaMA "consolidated.00.pth"?
Is there any way to generate the 7B, 13B, or 30B model files myself instead of downloading them? I already have the original models.
@aicoat https://github.com/antimatter15/alpaca.cpp/blob/master/convert-pth-to-ggml.py
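For reference, the script is typically pointed at the directory holding the original consolidated.*.pth files, with an ftype argument (1 = f16). A sketch assuming the early llama.cpp-style invocation; check the script header for the exact arguments in your version:

D:\LLama_git\alpaca-win>python convert-pth-to-ggml.py D:\LLama_git\Models\30B\ 1

On 30B this writes ggml-model-f16.bin plus three more parts (.1, .2, .3), which then still need to be quantized.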
@diegomontoya I already tried that. For 30B it gives me 4 parts, named ggml-model-f16.bin through ggml-model-f16.bin.3. I used quantize.exe to convert them to q4_0 .bin files, but in the end alpaca gave me a "bad magic" error:
D:\LLama_git\alpaca-win>chat -m D:\LLama_git\Models\30B\ggml-model-q4_0.bin
main: seed = 1679772113
llama_model_load: loading model from 'D:\LLama_git\Models\30B\ggml-model-q4_0.bin' - please wait ...
llama_model_load: invalid model file 'D:\LLama_git\Models\30B\ggml-model-q4_0.bin' (bad magic)
main: failed to load model from 'D:\LLama_git\Models\30B\ggml-model-q4_0.bin'
It should be one single .bin file, not multiple parts.
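For context on the message itself: the loader reads a 4-byte magic identifier from the start of each file before parsing anything else, so "bad magic" means the first bytes are not what this build expects. A minimal sketch of that check, assuming the classic 0x67676d6c ("ggml") value used by early llama.cpp-derived loaders:

#include <cstdint>
#include <cstdio>

// Sketch of the check behind "bad magic"; the exact constant depends
// on the alpaca.cpp/ggml version that wrote the file.
bool has_expected_magic(const char *fname) {
    std::FILE *f = std::fopen(fname, "rb");
    if (!f) return false;
    uint32_t magic = 0;
    const size_t n = std::fread(&magic, sizeof(magic), 1, f);
    std::fclose(f);
    return n == 1 && magic == 0x67676d6c; // "ggml"
}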
You will have to re-compile 'chat' with the value on line 34 changed to 4 (instead of 1), which is the number of .bin
files for your generated 30B model.
Reference: https://github.com/antimatter15/alpaca.cpp/issues/94#issuecomment-1478362711
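Concretely, the line in question is one of the entries in the LLAMA_N_PARTS table in chat.cpp: alpaca.cpp apparently ships with every part count set to 1, since the released Alpaca weights come as a single file. For a self-converted 30B (embedding dimension 6656, four parts) the one-line change would look like this:

{ 6656, 4 },  // was { 6656, 1 }: 30B converted from .pth comes in 4 parts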
@mastr-ch13f Thanks man, you saved me lots of GB! I changed the values to the ones below, and it now works for all models:
#include <map>

// determine number of model parts based on the dimension (n_embd)
static const std::map<int, int> LLAMA_N_PARTS = {
    { 4096, 1 },  // 7B
    { 5120, 2 },  // 13B
    { 6656, 4 },  // 30B
    { 8192, 8 },  // 65B
};
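For reference, loaders derived from this code typically consult the table with the embedding dimension read from the model file's header, along the lines of the sketch below (the surrounding variable names are illustrative, not the exact source):

// Look up the part count from the embedding dimension in the header;
// at() throws for an unknown dimension, so unsupported sizes fail fast.
const int n_parts = LLAMA_N_PARTS.at(n_embd);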