
Server example not working - failing on Runtime Error unexpected EOF

BartlomiejLewandowski opened this issue 1 year ago • 8 comments

Expected Behavior

Working server example.

Current Behavior

Fails when loading

llama.cpp: loading model from models/7B/ggml-model.bin
libc++abi: terminating with uncaught exception of type std::runtime_error: unexpectedly reached end of file
[1]    66278 abort      ./server -m models/7B/ggml-model.bin

I had to pass -m models/7B/ggml-model.bin explicitly, since the default value doesn't take into account the directory where the models are stored.
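This "unexpectedly reached end of file" error typically means the model file is in an older GGML container revision than the loader expects, so the parser runs off the end of the file. As a quick diagnostic (my own sketch, not part of llama.cpp; the magic values are the historical GGML constants and the helper name is made up), you can inspect the 4-byte magic at the start of the file:

```shell
# Sketch: report which GGML container revision a model file uses, based on
# its leading 4-byte magic (written little-endian by the convert scripts).
# An old magic combined with a new llama.cpp build is a common cause of the
# "unexpectedly reached end of file" error.
check_ggml_magic() {
    # read the first 4 bytes as hex, strip spacing
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        6c6d6767) echo "ggml (unversioned; too old for current builds)" ;;
        666d6767) echo "ggmf (versioned; reconvert with convert.py)" ;;
        746a6767) echo "ggjt (mmap-able; also check the version field that follows)" ;;
        *)        echo "unknown magic: $magic" ;;
    esac
}

# usage: check_ggml_magic models/7B/ggml-model.bin
```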

Environment and Context

macOS, Apple M2

BartlomiejLewandowski avatar Jun 05 '23 18:06 BartlomiejLewandowski

./main works well with the same model file

BartlomiejLewandowski avatar Jun 05 '23 18:06 BartlomiejLewandowski

Requantize your model to the latest version, and update to the latest server example release.
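For reference, the requantization flow of that era looked roughly like this — a sketch assuming the original (unquantized) weights are under models/7B/; file names and paths are examples and may differ in your checkout:

```shell
# Rebuild llama.cpp at the latest revision first.
git pull && make clean && make

# Re-run the converter on the original weights to produce an up-to-date
# f16 GGML file (typically models/7B/ggml-model-f16.bin)...
python3 convert.py models/7B/

# ...then quantize it with the freshly built tool.
./quantize models/7B/ggml-model-f16.bin models/7B/ggml-model-q4_0.bin q4_0
```

Note that already-quantized downloads in an old format cannot simply be re-read by a new build; they need to be regenerated from the source weights.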

FSSRepo avatar Jun 05 '23 22:06 FSSRepo

Hi, is this the right place to ask support questions? If not, please point me to the right place. I'm experiencing the same problem. After compiling without any problems, I follow these steps:

  • Download the model with wget https://huggingface.co/Pi3141/alpaca-7B-ggml/resolve/main/ggml-model-q4_0.bin and place it at models/7B/ggml-model-q4_0.bin
  • Run python convert.py models/7B/, then ./main -m ./models/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512, which fails with:
llama.cpp: loading model from ./models/7B/ggml-model-q4_0.bin
error loading model: unexpectedly reached the end of file
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/7B/ggml-model-q4_0.bin'
main: error: unable to load model

I'm using rev 2d43387dafe9c60f15f57aa23

Could you please suggest a way to address the problem?

fbalicchia avatar Jun 06 '23 16:06 fbalicchia

Both have been done, but I'll pull again and report back.

BartlomiejLewandowski avatar Jun 06 '23 19:06 BartlomiejLewandowski

If you are trying to get the latest version of llama.cpp to work, you can download this model created yesterday: selfee-13b.ggmlv3.q2_K.bin.

Put the model in the models directory and run:

make -j && ./main -m ./models/selfee-13b.ggmlv3.q2_K.bin -p "Building a website can be done in 10 simple steps:" -n 512

I'm running on a mac M1.

Reading over this thread from yesterday led me to a working model using the new quantization method mentioned there: https://github.com/ggerganov/llama.cpp/pull/1684#issuecomment-1578585886

cmann50 avatar Jun 07 '23 17:06 cmann50

Thanks @cmann50, that model works for me too. As far as I'm concerned, this issue can be closed.

fbalicchia avatar Jun 07 '23 20:06 fbalicchia

Hi guys, I hit this issue with most of the models. Is there any smaller model (<200 MB) we can use for testing?

Aisuko avatar Jun 10 '23 13:06 Aisuko

AFAIK, 7B is the smallest LLaMA model trained on a trillion tokens...

tlkahn avatar Jun 12 '23 12:06 tlkahn

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Apr 10 '24 01:04 github-actions[bot]