
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

16 issues in Llama-2-Open-Source-LLM-CPU-Inference, sorted by most recently updated

```
File "/home/void/.local/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 261, in hf_raise_for_status
    response.raise_for_status()
File "/home/void/.local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/models/llama-2-7b-chat.ggmlv3.q8_0.bin/revision/main
```
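A 401 like the one above typically means the GGML file was not found on disk, so the loader fell back to treating the path string as a Hugging Face Hub repo id (note the doubled `models/models/...` in the URL). A minimal pre-flight check, assuming the repo's default `models/` layout; the helper name is hypothetical:

```python
import os

# Path from the repo's default config; adjust to where you saved the GGML file.
MODEL_PATH = "models/llama-2-7b-chat.ggmlv3.q8_0.bin"

def check_local_model(path: str) -> str:
    """Hypothetical helper: fail fast if the GGML file is missing locally,
    instead of letting the loader fall back to a Hugging Face Hub lookup."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Model file not found at '{path}'. Download the .bin file and "
            "place it there before running the app."
        )
    return path
```

Running this before constructing the LLM turns the opaque 401 into a clear "download the model file first" message.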

Hi, thanks for the great article. I have a question: when executing `dbqa()`, it returned `free(): invalid next size (normal)` followed by `Aborted (core dumped)`. Any idea? == Here are the steps...

**You can use the 70b parameter model now as well; here is how I accomplished it:**

1. Downloaded the 70b parameter model I wanted from [https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/tree/main](https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/tree/main). In my case, I...
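Following the step above, the remaining change is pointing the app's model settings at the downloaded 70B file. A hedged sketch of the corresponding `config.yml` entries, assuming key names like those used in the original repo (the exact keys and quantization variant may differ from your setup):

```yaml
# Hypothetical config.yml fragment: swap the 7B model for the 70B GGML file
MODEL_TYPE: llama
MODEL_BIN_PATH: "models/llama-2-70b-chat.ggmlv3.q4_0.bin"  # file from TheBloke's HF repo above
MAX_NEW_TOKENS: 256
TEMPERATURE: 0.01
```

Expect noticeably slower CPU inference with the 70B model than with 7B.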

Is there a way to pass a custom system prompt with the query?
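One framework-independent answer: Llama-2 chat models take the system prompt inline via `<<SYS>>` markers inside the `[INST]` block, so a custom system prompt can be spliced into the prompt string before it reaches the model. A minimal sketch (the function name is illustrative; in this repo the equivalent change would go in the prompt template passed to the QA chain):

```python
def build_llama2_prompt(system_prompt: str, user_query: str) -> str:
    """Wrap a custom system prompt and a user query in the Llama-2 chat
    format: [INST] ... [/INST] with the system text inside <<SYS>> markers."""
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_query} [/INST]"
    )

prompt = build_llama2_prompt(
    "Answer only from the provided context, in one sentence.",
    "What is the refund policy?",
)
```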

I was intrigued by your Medium article, so I downloaded and ran this repo, and the following error appears: `error loading model: unrecognized tensor type 11`. Please help. And there...

Response time of more than 80 seconds for a single 20-page document