raise NameError(f"Could not load Llama model from path: {model_path}") NameError: Could not load Llama model from path: models/ggml-model-q4_0.bin

Open b007zk opened this issue 1 year ago • 16 comments

PS D:\privateGPT> python .\privateGPT.py
llama.cpp: loading model from models/ggml-model-q4_0.bin
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1000
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
Traceback (most recent call last):
  File "C:\Users\<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\llamacpp.py", line 78, in validate_environment
    values["client"] = Llama(
                       ^^^^^^
  File "C:\Users\<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_cpp\llama.py", line 161, in __init__
    assert self.ctx is not None
           ^^^^^^^^^^^^^^^^^^^^
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\privateGPT\privateGPT.py", line 57, in <module>
    main()
  File "D:\privateGPT\privateGPT.py", line 21, in main
    llama = LlamaCppEmbeddings(model_path=llama_embeddings_model, n_ctx=model_n_ctx)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic\main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic\main.py", line 1102, in pydantic.main.validate_model
  File "C:\Users\<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\llamacpp.py", line 98, in validate_environment
    raise NameError(f"Could not load Llama model from path: {model_path}")
NameError: Could not load Llama model from path: models/ggml-model-q4_0.bin
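
The NameError at the bottom is just LangChain re-raising the llama.cpp failure higher up ("error loading model: this format is no longer supported"). A minimal sketch, assuming the same model path and n_ctx value as in the .env further down, that loads the file directly with llama-cpp-python so the underlying error is easier to see:

# Sketch, not part of privateGPT: confirm the file exists and load it directly
# with llama-cpp-python so the real llama.cpp error is visible instead of the
# NameError that LangChain wraps around it.
from pathlib import Path

from llama_cpp import Llama

model_path = Path("models/ggml-model-q4_0.bin")  # value taken from the .env below

if not model_path.is_file():
    raise SystemExit(f"Model file not found: {model_path.resolve()}")

# The first four bytes are the GGML container magic; old-format files are
# rejected by newer llama.cpp builds (see llama.cpp PR 1305 linked above),
# which is what produces "this format is no longer supported".
with model_path.open("rb") as f:
    print("magic bytes:", f.read(4))

llm = Llama(model_path=str(model_path), n_ctx=1000)
print("model loaded OK")

If this small script fails the same way, the .bin file itself is in the old GGML format and needs to be re-downloaded or converted, regardless of what the path in .env looks like.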

b007zk avatar May 14 '23 21:05 b007zk

Did you download and place the model in the models folder?

BassAzayda avatar May 14 '23 21:05 BassAzayda

@BassAzayda Yup the models are in a folder called models.

And this is what's in my .env :

PERSIST_DIRECTORY=db
LLAMA_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin
MODEL_TYPE=GPT4All
MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_N_CTX=1000
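
For reference, a rough sketch of how those values are picked up (privateGPT reads the .env with python-dotenv; the exact variable handling in privateGPT.py may differ slightly). Note that a relative model path resolves against the directory the script is launched from:

import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

llama_embeddings_model = os.environ.get("LLAMA_EMBEDDINGS_MODEL")

if llama_embeddings_model is None:
    print("LLAMA_EMBEDDINGS_MODEL is not set - check where the .env file lives")
else:
    # A relative path like models/ggml-model-q4_0.bin resolves against the
    # current working directory, not necessarily the repository root.
    print(llama_embeddings_model, "->", os.path.abspath(llama_embeddings_model))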

b007zk avatar May 14 '23 22:05 b007zk

If you're using Windows, could it be that the slashes need to be the other way? Sorry, Mac user here.

BassAzayda avatar May 14 '23 22:05 BassAzayda

@BassAzayda I thought about that as a possibility, which is why I'm trying it also on WSL (Ubuntu).

b007zk avatar May 14 '23 22:05 b007zk

@BassAzayda Unfortunately changing the slashes doesn't work :( Nor does providing the full path. It seems like the path isn't the problem here, but rather an actual problem with loading the model due to the version of Python and the library being used.

b007zk avatar May 14 '23 22:05 b007zk

@imartinez Any solution to this?

b007zk avatar May 14 '23 22:05 b007zk

You have to be on the right Python version, if it is a Python-version-related problem. I think someone with a similar problem solved it by updating theirs to the latest.

d2rgaming-9000 avatar May 14 '23 22:05 d2rgaming-9000

@d2rgaming-9000 I do have the latest Python, 3.11.3.

b007zk avatar May 14 '23 23:05 b007zk

@b007zk have you tried passing an absolute path instead of a relative path in the .env? That is what the readme asks for and what worked for me.

LLAMA_EMBEDDINGS_MODEL=/abspath/models/ggml-model-q4_0.bin
MODEL_PATH=/abspath/models/ggml-gpt4all-j-v1.3-groovy.bin

Where /abspath/ is the full, absolute path to that file.

And make sure you use =, not : (I ran into your same issue when I had a typo there).
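
A small sketch of why the = vs : detail matters, assuming the .env is parsed with python-dotenv as in privateGPT:

import io

from dotenv import dotenv_values

# A ":" line is not a valid assignment, so the key never reaches the script
# and the model path the code sees ends up as None.
good = dotenv_values(stream=io.StringIO("MODEL_N_CTX=1000\n"))
bad = dotenv_values(stream=io.StringIO("MODEL_N_CTX: 1000\n"))

print(good)  # {'MODEL_N_CTX': '1000'}
print(bad)   # no usable MODEL_N_CTX value from the malformed line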

smileBeda avatar May 15 '23 06:05 smileBeda

@smileBeda yes I have tried that, no luck unfortunately. I don't think the path is the problem here.

b007zk avatar May 16 '23 03:05 b007zk

@b007zk I had the exact same issue in WSL. Please take a look at #198. I believe it addresses this issue and solves the problem. It introduces a file validator in the ingest.py module using pathlib's Path. You can review the changes and verify whether they effectively solve your issue.
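
For anyone reading along, a sketch of the kind of check that describes (this is not the actual diff from #198): resolve the configured value with pathlib and fail fast with a clear message instead of the opaque NameError.

from pathlib import Path


def validate_model_path(value: str) -> Path:
    """Return an absolute path to the model file, or raise a descriptive error."""
    path = Path(value).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Model file does not exist: {path}")
    if path.suffix != ".bin":
        raise ValueError(f"Expected a .bin model file, got: {path.name}")
    return path


# Example: validate the value read from .env before handing it to LangChain.
print(validate_model_path("models/ggml-model-q4_0.bin"))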

aHardReset avatar May 16 '23 07:05 aHardReset

@aHardReset Hey, thanks for your comment. I actually tried with your changes, but unfortunately I'm still running into the following issue when I run privateGPT.py:

Traceback (most recent call last):
  File "D:\privateGPT\privateGPT.py", line 57, in <module>
    main()
  File "D:\privateGPT\privateGPT.py", line 21, in main
    embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\<user>\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\huggingface.py", line 44, in __init__
    super().__init__(**kwargs)
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for HuggingFaceEmbeddings
model_name
  none is not an allowed value (type=type_error.none.not_allowed)
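
That ValidationError just means model_name was None, i.e. the embeddings model variable the script expects was missing from the environment. A sketch of a guard with a fallback model; the variable name EMBEDDINGS_MODEL_NAME is an assumption, so check the example.env of the branch you are running:

import os

from langchain.embeddings import HuggingFaceEmbeddings

# Fall back to a known sentence-transformers model if the .env entry is missing.
model_name = os.environ.get("EMBEDDINGS_MODEL_NAME") or "all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_name)
print("loaded embeddings model:", model_name)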

b007zk avatar May 18 '23 00:05 b007zk

pip install llama-cpp-python==0.1.48 resolved my issue, along with pip install 'pygpt4all==v1.0.1' --force-reinstall

when using:

https://huggingface.co/mrgaang/aira/blob/main/gpt4all-converted.bin
https://huggingface.co/Pi3141/alpaca-native-7B-ggml/blob/main/ggml-model-q4_0.bin
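
If version pinning is the fix, it helps to confirm what is actually installed in the environment that runs privateGPT; a small sketch:

from importlib.metadata import PackageNotFoundError, version

# A llama-cpp-python build that no longer accepts the old GGML format is one
# way to end up with the original "failed to load model" error.
for pkg in ("llama-cpp-python", "pygpt4all", "langchain"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")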

kuppu avatar May 20 '23 20:05 kuppu

@kuppu thanks but that didn't work for me.

b007zk avatar May 23 '23 14:05 b007zk

Try running Python 3.10.x (use pyenv, if available on Windows, so you can switch environments), and maybe also try another embeddings model from Hugging Face?
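
A trivial sketch for confirming which interpreter the script actually runs under, which is easy to get wrong when pyenv or several Python installs coexist:

import sys

print("interpreter:", sys.executable)
print("version    :", sys.version.split()[0])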

BassAzayda avatar May 23 '23 23:05 BassAzayda

cpp is pretty f*'d up, so is it possible to just use the koboldcpp.exe file as a server for this (see the sketch at the end of this comment)? As far as I can tell, there is no reason to have to navigate any dependencies besides langchain and maybe web browsing for this project. IMO (and I am very stupid and unemployed), best practice would be to have the model hosted by an extension, with options such as:

GPT API
llama.cpp
localhost
remotehost
and koboldcpp.exe 

and then have

langchain
urllib3
tabulate
tqdm

or whatever as core dependencies. PyTorch is also often an important dependency for llama models to run above 10 t/s, but different GPUs have different CUDA requirements, e.g. Tesla K80/P40/H100 or GTX 660/RTX 4090, not to mention AMD.
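
A rough sketch of the "model behind a local HTTP server" idea mentioned above. The port, path, and payload follow the KoboldAI-style generate endpoint that koboldcpp can expose; treat them as assumptions and check the koboldcpp docs for your build:

import requests

# Ask the locally hosted model (e.g. koboldcpp.exe started as a server) for a
# completion over plain HTTP, so the client side only needs an HTTP library.
resp = requests.post(
    "http://localhost:5001/api/v1/generate",
    json={"prompt": "Summarize the ingested documents.", "max_length": 200},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())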

Anonym0us33 avatar May 26 '23 01:05 Anonym0us33