Can't load model: llama_load_model_from_file: failed to load model
⁉️ Discussion/Question
Hi.
I'm just getting started with prompttools and have two problems; I will post them separately.
First, I couldn't load a downloaded model. I got the error below when I tested LlamaCppExperiment.ipynb. I tried different models and got the same errors.
Environments
- CPU : M2
- RAM : 16GB
- OS : Ventura 13.2.1
- Python : 3.11.5
- prompttools : 0.0.43
Logs
gguf_init_from_file: invalid magic characters tjgg(�k.
error loading model: llama_model_loader: failed to load model from /Users/sewonist/Downloads/llama-2-7b-chat.ggmlv3.q2_K.bin
llama_load_model_from_file: failed to load model
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
AssertionError Traceback (most recent call last)
Cell In[3], line 1
----> 1 experiment.run()
File ~/Projects/13.AIChat/05.Projects/prompttools/prompttools/experiment/experiments/llama_cpp_experiment.py:177, in LlamaCppExperiment.run(self, runs)
175 latencies = []
176 for model_combo in self.model_argument_combos:
--> 177 client = Llama(**model_combo)
178 for call_combo in self.call_argument_combos:
179 for _ in range(runs):
File ~/anaconda3/envs/prompttools/lib/python3.11/site-packages/llama_cpp/llama.py:923, in Llama.__init__(self, model_path, n_gpu_layers, main_gpu, tensor_split, vocab_only, use_mmap, use_mlock, seed, n_ctx, n_batch, n_threads, n_threads_batch, rope_scaling_type, rope_freq_base, rope_freq_scale, yarn_ext_factor, yarn_attn_factor, yarn_beta_fast, yarn_beta_slow, yarn_orig_ctx, mul_mat_q, f16_kv, logits_all, embedding, last_n_tokens_size, lora_base, lora_scale, lora_path, numa, chat_format, chat_handler, verbose, **kwargs)
920 self.chat_format = chat_format
921 self.chat_handler = chat_handler
--> 923 self._n_vocab = self.n_vocab()
924 self._n_ctx = self.n_ctx()
926 self._token_nl = self.token_nl()
File ~/anaconda3/envs/prompttools/lib/python3.11/site-packages/llama_cpp/llama.py:2184, in Llama.n_vocab(self)
2182 def n_vocab(self) -> int:
2183 """Return the vocabulary size."""
-> 2184 return self._model.n_vocab()
File ~/anaconda3/envs/prompttools/lib/python3.11/site-packages/llama_cpp/llama.py:250, in _LlamaModel.n_vocab(self)
249 def n_vocab(self) -> int:
--> 250 assert self.model is not None
251 return llama_cpp.llama_n_vocab(self.model)
AssertionError:
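Since the traceback ends inside llama_cpp itself, the load can be checked directly with llama-cpp-python, bypassing prompttools entirely. A minimal sketch, using the same model path as in the log above:

```python
# Minimal check: load the file directly with llama-cpp-python to see
# whether the failure comes from the library itself rather than from
# prompttools' experiment harness.
from llama_cpp import Llama

llm = Llama(model_path="/Users/sewonist/Downloads/llama-2-7b-chat.ggmlv3.q2_K.bin")
```

If this fails with the same AssertionError, the problem is in the model load itself, not in prompttools.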
I wonder if LlamaCpp doesn't support the M series? Please let me know if you have any ideas.
Thanks.
Have you downloaded the model llama-2-7b-chat.ggmlv3.q2_K.bin
and followed the setup instructions at https://github.com/ggerganov/llama.cpp and https://github.com/abetlen/llama-cpp-python?
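One hint is the very first log line: `gguf_init_from_file: invalid magic characters tjgg(...)`. The bytes `tjgg` correspond to the `ggjt` magic of the legacy GGML v3 format stored little-endian, which suggests the file is an old GGML model, while recent llama-cpp-python releases only load GGUF files; if so, this is not an M-series issue. In that case a GGUF build of the same model should load. A sketch, assuming a Q2_K GGUF file (the filename and path are assumptions, e.g. a build from https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF):

```python
from llama_cpp import Llama

# Assumed path: a GGUF build of the same model, downloaded in place of
# the legacy .ggmlv3 .bin file.
llm = Llama(model_path="/Users/sewonist/Downloads/llama-2-7b-chat.Q2_K.gguf")

# Quick smoke test that the model loads and generates.
out = llm("Q: What is the capital of France? A:", max_tokens=16)
print(out["choices"][0]["text"])
```

The llama.cpp repository also includes a script for converting old GGML files to GGUF, if re-downloading the model is not an option.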