
llama_init_from_file: failed to load model

Open alisonzhu opened this issue 1 year ago • 1 comment

When I execute this command: make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512

An error was reported:

llama_init_from_file: failed to load model
main: error: failed to load model './models/7B/ggml-model-q4_0.bin'
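
This error usually just means the file at that path does not exist yet (for example, the conversion or quantization step was never run or failed). A quick check before running main, as a minimal sketch assuming the default paths from the README:

# confirm the quantized model file actually exists and is non-empty
ls -lh ./models/7B/ggml-model-q4_0.bin
# if this prints "No such file or directory", the conversion/quantization steps
# have not produced the file yet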

alisonzhu avatar Mar 22 '23 10:03 alisonzhu

Please use the issue template when opening issues so we can better understand your problem.

gjmulder avatar Mar 22 '23 10:03 gjmulder

(I'm French, so sorry for my bad English.)

Hello, I'm on Ubuntu MATE (a Linux distribution) with Python 3.10.

[uname] Linux ordival-mate 5.15.0-67-generic #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[python version] Python 3.10.6

I have the same error. I just pasted this into my terminal:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# obtain the original LLaMA model weights and place them in ./models
ls ./models
65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model

# install Python dependencies
python3 -m pip install torch numpy sentencepiece

# convert the 7B model to ggml FP16 format
python3 convert-pth-to-ggml.py models/7B/ 1

# quantize the model to 4-bits
python3 quantize.py 7B

# run the inference
./main -m ./models/7B/ggml-model-q4_0.bin -n 128

And when I paste the line "65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model", I get the error: 65B: command not found

And when I run this command, I get the "failed to load model" error:

~/llama.cpp$ make -j && ./main -m ./models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -msse3
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:
I CC: cc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
I CXX: g++ (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0

make: Nothing to be done for 'default'.
main: seed = 1679618071
llama_model_load: loading model from './models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: failed to open './models/7B/ggml-model-q4_0.bin'
llama_init_from_file: failed to load model
main: error: failed to load model './models/7B/ggml-model-q4_0.bin'
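
The key line in that log is "llama_model_load: failed to open './models/7B/ggml-model-q4_0.bin'": the quantized file was never created at that path. Listing the directory shows which of the expected files are actually there; a sketch, with file names taken from the README and the original LLaMA 7B distribution:

# see what the download, conversion and quantization steps actually produced
ls -lh ./models/7B/
# roughly expected after all steps:
#   checklist.chk  consolidated.00.pth  params.json     (original weights)
#   ggml-model-f16.bin                                  (after convert-pth-to-ggml.py)
#   ggml-model-q4_0.bin                                 (after quantization)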

daisseur avatar Mar 24 '23 00:03 daisseur

# obtain the original LLaMA model weights and place them in ./models
ls ./models
65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model

This part is saying that you'll need to find the model files yourself and put them in the models folder. We can't help with that part, but once you have them downloaded, the commands after that should work.
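
For reference, this is roughly what the models directory is expected to look like once the original 7B weights are in place (a sketch; the exact file names come from the original LLaMA distribution, and the two ggml-model-*.bin files only appear after the conversion and quantization steps):

models/
├── 7B/
│   ├── checklist.chk
│   ├── consolidated.00.pth
│   ├── params.json
│   ├── ggml-model-f16.bin      # created by convert-pth-to-ggml.py
│   └── ggml-model-q4_0.bin     # created by the quantize step
├── tokenizer.model
└── tokenizer_checklist.chk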

j-f1 avatar Mar 24 '23 02:03 j-f1

(quoting daisseur's comment above)

Hello, I'll explain. You must not run the line "65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model" as a command. That line shows what the computer should display after you run "ls ./models". In the "models" folder you need at least one of the four folders named "65B", "30B", etc., corresponding to the version of the language model you downloaded. Also, you will have to look for the "quantize" file in one of the folders; I suggest copying it into the main "llama.cpp" folder and then running the last command.
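
Put together, the sequence at that time looked roughly like this (a sketch, assuming make placed the quantize binary in the repository root, that the FP16 conversion already produced ggml-model-f16.bin, and that the trailing argument 2 selects the q4_0 type):

# convert the original weights to ggml FP16 (run from the llama.cpp root)
python3 convert-pth-to-ggml.py models/7B/ 1

# quantize the FP16 model to 4 bits (q4_0)
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2

# run inference against the quantized file
./main -m ./models/7B/ggml-model-q4_0.bin -n 128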

adam-the-hacker avatar Apr 11 '23 17:04 adam-the-hacker