llama.cpp
Server example not working - failing on Runtime Error unexpected EOF
Expected Behavior
Working server example.
Current Behavior
Fails when loading the model:
llama.cpp: loading model from models/7B/ggml-model.bin
libc++abi: terminating with uncaught exception of type std::runtime_error: unexpectedly reached end of file
[1] 66278 abort ./server -m models/7B/ggml-model.bin
I had to pass -m models/7B/ggml-model.bin explicitly, as the default value doesn't take into account the directory where the models are stored.
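A quick way to check whether the file is in an older GGML container format than the build expects is to inspect its header. Below is a minimal sketch, assuming the pre-GGUF layout in which the file starts with a little-endian uint32 magic; the constants are taken from llama.cpp sources of this period and may differ by revision:

import struct
import sys

# Pre-GGUF magic values from llama.cpp's loader (may vary by revision).
MAGICS = {
    0x67676d6c: "ggml (unversioned, very old format)",
    0x67676d66: "ggmf (versioned format)",
    0x67676a74: "ggjt (mmap-compatible format expected by recent builds)",
}

with open(sys.argv[1], "rb") as f:
    (magic,) = struct.unpack("<I", f.read(4))  # first 4 bytes of the file

print(MAGICS.get(magic, f"unknown magic 0x{magic:08x} - file may be corrupt"))

Save it as, say, check_header.py (an illustrative name) and run python check_header.py models/7B/ggml-model.bin; anything other than ggjt suggests the model predates the loader.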
Environment and Context
macOS, Apple M2
./main works well with the same model file
Requantize your model to the latest format version, and use the latest server example release.
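For reference, that workflow looks roughly like this (a sketch with illustrative paths; convert.py's f16 output name and the quantize invocation may differ slightly by revision):
python convert.py models/7B/
./quantize models/7B/ggml-model-f16.bin models/7B/ggml-model-q4_0.bin q4_0
./server -m models/7B/ggml-model-q4_0.bin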
Hi, is this the right place to ask support questions? If not, please point me to the right one. I'm experiencing the same problem. After compiling without any problems, I followed these steps:
- Download the model:
wget https://huggingface.co/Pi3141/alpaca-7B-ggml/resolve/main/ggml-model-q4_0.bin
and place it in models/7B/ggml-model-q4_0.bin
- Run python convert.py models/7B/
After typing:
./main -m ./models/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512
llama.cpp: loading model from ./models/7B/ggml-model-q4_0.bin
error loading model: unexpectedly reached the end of file
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/7B/ggml-model-q4_0.bin'
main: error: unable to load model
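Before anything else, it's worth ruling out a truncated download, which produces the same "unexpectedly reached end of file" error. On macOS you can compare the file's SHA-256 against the checksum shown on the Hugging Face file page (for LFS-hosted files):
shasum -a 256 models/7B/ggml-model-q4_0.bin
If the checksum matches, the file is intact and the error points to a format mismatch instead.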
I'm using rev 2d43387dafe9c60f15f57aa23.
Could you please suggest a way to address the problem?
Both have been done, but I'll pull again and report back.
If you are trying to get the latest version of llama.cpp to work, you can download this model created yesterday: selfee-13b.ggmlv3.q2_K.bin.
Put the model in the models directory and run:
make -j && ./main -m ./models/selfee-13b.ggmlv3.q2_K.bin -p "Building a website can be done in 10 simple steps:" -n 512
I'm running on a Mac M1.
Reading over this thread from yesterday led me to a working model using the new quantization method mentioned there: https://github.com/ggerganov/llama.cpp/pull/1684#issuecomment-1578585886
Thanks @cmann50, that model works for me too. As far as I'm concerned, the issue can be closed.
Hi guys, I hit this issue with most of the models. Is there any smaller model (<200 MB) we can use for testing?
AFAIK, 7B is the smallest LLaMA model trained on a trillion tokens...
This issue was closed because it has been inactive for 14 days since being marked as stale.