
Unable to load llama model from path

Open · shibbycribby opened this issue on Jan 31, 2024 · 2 comments

I've been getting a repeated error when trying to run localGPT. I had everything working on CPU, but performance was pretty slow, so I rebuilt with the latest CUDA version of PyTorch and rebuilt the llama-cpp-python wheel. I can still ingest the test document fine, but every time I run the model I now get the following error:

```
(localgpt) PS J:\localgpt\localGPT> python run_localGPT.py
2024-01-31 13:41:26,923 - INFO - run_localGPT.py:241 - Running on: cuda
2024-01-31 13:41:26,923 - INFO - run_localGPT.py:242 - Display Source Documents set to: False
2024-01-31 13:41:26,923 - INFO - run_localGPT.py:243 - Use history set to: False
2024-01-31 13:41:27,344 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
C:\ProgramData\anaconda3\envs\localgpt\lib\site-packages\torch\_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
max_seq_length 512
2024-01-31 13:41:28,335 - INFO - run_localGPT.py:59 - Loading Model: TheBloke/Llama-2-7b-Chat-GGUF, on: cuda
2024-01-31 13:41:28,335 - INFO - run_localGPT.py:60 - This action can take a few minutes!
2024-01-31 13:41:28,335 - INFO - load_models.py:38 - Using Llamacpp for GGUF/GGML quantized models
Traceback (most recent call last):
  File "J:\localgpt\localGPT\run_localGPT.py", line 282, in <module>
    main()
  File "C:\ProgramData\anaconda3\envs\localgpt\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\localgpt\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\ProgramData\anaconda3\envs\localgpt\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\ProgramData\anaconda3\envs\localgpt\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "J:\localgpt\localGPT\run_localGPT.py", line 249, in main
    qa = retrieval_qa_pipline(device_type, use_history, promptTemplate_type=model_type)
  File "J:\localgpt\localGPT\run_localGPT.py", line 138, in retrieval_qa_pipline
    llm = load_model(device_type, model_id=MODEL_ID, model_basename=MODEL_BASENAME, LOGGING=logging)
  File "J:\localgpt\localGPT\run_localGPT.py", line 64, in load_model
    llm = load_quantized_model_gguf_ggml(model_id, model_basename, device_type, LOGGING)
  File "J:\localgpt\localGPT\load_models.py", line 56, in load_quantized_model_gguf_ggml
    return LlamaCpp(**kwargs)
  File "C:\ProgramData\anaconda3\envs\localgpt\lib\site-packages\langchain\load\serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for LlamaCpp
__root__
  Could not load Llama model from path: ./models\models--TheBloke--Llama-2-7b-Chat-GGUF\snapshots\191239b3e26b2882fb562ffccdd1cf0f65402adb\llama-2-7b-chat.Q4_K_M.gguf. Received error [WinError 2] The system cannot find the file specified: 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\bin' (type=value_error)
```

I've checked my environment: the models folder contains a symlink to the model at the exact location listed in the error, so I assume that part is fine. Nothing in my PATH mentions a CUDA folder with the repeated bin\bin, so I'm not sure where it's reading that from. Does anyone have any ideas?
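
One plausible cause, which is an assumption rather than anything confirmed in this thread: on Windows, llama-cpp-python registers CUDA DLL directories by joining bin onto the CUDA_PATH environment variable, so a CUDA_PATH set to ...\v12.3\bin instead of the version root ...\v12.3 would produce exactly the doubled bin\bin in the error. A minimal sketch to check for that:

```python
# Sketch: detect a CUDA_PATH that already ends in \bin. llama-cpp-python
# on Windows joins "bin" onto CUDA_PATH before os.add_dll_directory, and
# add_dll_directory raises WinError 2 when the resulting path is missing.
import os

cuda_path = os.environ.get("CUDA_PATH", "")
print("CUDA_PATH =", cuda_path or "<not set>")
if cuda_path.rstrip("\\").lower().endswith("bin"):
    print("CUDA_PATH already ends in \\bin; llama-cpp-python would look in:",
          os.path.join(cuda_path, "bin"))
    print("Try pointing CUDA_PATH at the version root, e.g. ...\\CUDA\\v12.3")
```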

shibbycribby · Jan 31 '24

How did you install llama-cpp-python? Make sure you install the version specified in the readme:

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
```
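
Note that this one-liner uses bash syntax; in the PowerShell session shown in the log, the CMAKE_ARGS="..." prefix won't be parsed as an environment assignment (in PowerShell you would set $env:CMAKE_ARGS and $env:FORCE_CMAKE first). As a cross-shell sketch, the same install can also be driven from Python:

```python
# Sketch: cross-shell equivalent of the bash one-liner above. It sets the
# build flags in the environment and forces a source rebuild of the pinned
# llama-cpp-python version with pip's cache disabled.
import os
import subprocess
import sys

env = dict(os.environ, CMAKE_ARGS="-DLLAMA_CUBLAS=on", FORCE_CMAKE="1")
subprocess.run(
    [sys.executable, "-m", "pip", "install",
     "llama-cpp-python==0.1.83", "--no-cache-dir"],
    env=env,
    check=True,
)
```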

PromtEngineer · Feb 5 '24

@PromtEngineer CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir

What happens if we don't install the specified version?

NitkarshChourasia · Apr 6 '24
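
This question wasn't answered in the thread, but in general an unpinned pip install of llama-cpp-python pulls the latest release, whose constructor arguments and model-format support may not match what localGPT's load_models.py expects, so the 0.1.83 pin keeps the API and build flags aligned with the readme. A quick sketch to confirm which version actually got installed:

```python
# Sketch: report the installed llama-cpp-python version so it can be
# compared against the 0.1.83 pin quoted in the command above.
from importlib.metadata import version

print("llama-cpp-python:", version("llama-cpp-python"))
```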