madmads11
I see that llama.cpp has added a [C-style API](https://github.com/ggerganov/llama.cpp/pull/370), exciting stuff!
> > I see that llama.cpp has added a [C-style API](https://github.com/ggerganov/llama.cpp/pull/370), exciting stuff!
>
> Yea. My bindings were based on my own C++ API ([#77](https://github.com/ggerganov/llama.cpp/pull/77) which is now closed)....
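For anyone wanting a feel for the new interface, here is a minimal sketch of a generation loop against llama.h as it looked around that PR. The function names, signatures, and the model path are assumptions based on that version of the header; the API has evolved considerably since, so check the current llama.h before copying any of this.

```c
// Minimal sketch of the llama.cpp C API as it looked around PR #370.
// Names/signatures are assumptions from that era and have changed since;
// the model path below is a placeholder.
#include <stdio.h>
#include "llama.h"

int main(void) {
    struct llama_context_params params = llama_context_default_params();
    params.n_ctx = 512;  // context window size
    params.seed  = 42;   // RNG seed for sampling

    struct llama_context * ctx =
        llama_init_from_file("models/7B/ggml-model-q4_0.bin", params);
    if (ctx == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // Tokenize the prompt; add_bos = true prepends the BOS token.
    const char * prompt = " Building a website can be done in 10 simple steps:";
    llama_token tokens[512];
    const int n_tokens = llama_tokenize(ctx, prompt, tokens, 512, true);

    // Evaluate the whole prompt in one call, then generate token by token.
    llama_eval(ctx, tokens, n_tokens, 0, /*n_threads=*/4);

    int n_past = n_tokens;
    for (int i = 0; i < 64; i++) {
        // Sample with top-k/top-p; the prompt tokens stand in for the
        // repeat-penalty window just to keep the sketch short.
        const llama_token id = llama_sample_top_p_top_k(
            ctx, tokens, n_tokens, /*top_k=*/40, /*top_p=*/0.95f,
            /*temp=*/0.8f, /*repeat_penalty=*/1.1f);
        if (id == llama_token_eos()) break;

        printf("%s", llama_token_to_str(ctx, id));
        fflush(stdout);

        // Feed the sampled token back in for the next step.
        llama_eval(ctx, &id, 1, n_past++, /*n_threads=*/4);
    }
    printf("\n");

    llama_free(ctx);
    return 0;
}
```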
Would this support using and interacting with Alpaca and LLaMA models of all sizes?
I have a 3070 with 8GB of VRAM and 32GB of system RAM, and I am able to load the LLaMA 13B 4-bit model. In chat mode I can run a few messages back...
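That lines up with a quick back-of-the-envelope check (weights only, ignoring activations and the KV cache, so treat it as a lower bound):

$$
13 \times 10^9 \ \text{params} \times \tfrac{4\ \text{bits}}{8\ \text{bits/byte}} \approx 6.5\ \text{GB}
$$

The 4-bit weights alone take roughly 6.5GB, which fits in 8GB of VRAM but leaves little headroom as the chat context grows.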
I was following the same guide as KamiAso, and I got the same error at the same point on Windows 11. Hopefully a solution to this will come as the...
> See if this helps: https://www.reddit.com/r/LocalLLaMA/comments/11o6o3f/how_to_install_llama_8bit_and_4bit/

Yes! This works, specifically resuming the tutorial from step 6.

> 6. Download [libbitsandbytes_cuda116.dll](https://github.com/DeXtmL/bitsandbytes-win-prebuilt) and put it in C:\Users\xxx\miniconda3\envs\textgen\lib\site-packages\bitsandbytes\
> 7. In \bitsandbytes\cuda_setup\main.py search...
Here is what I did to run Alpaca 30B on my system with llama.cpp; I would assume it would work with Alpaca 13B as well (a rough size calculation follows the steps).

1. Downloaded and built llama.cpp...
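For a sense of why 30B is feasible in 32GB of system RAM: llama.cpp's q4_0 format stores each block of 32 weights as 4-bit values plus one fp32 scale, i.e. about 5 bits per weight, and the "30B" model actually has around 32.5B parameters. Treating those figures as approximate:

$$
32.5 \times 10^9 \ \text{params} \times \tfrac{5\ \text{bits}}{8\ \text{bits/byte}} \approx 20\ \text{GB}
$$

So the quantized 30B file loads comfortably within 32GB of memory, with room left for the context.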
I just saw the updated README stating that you cannot link to model downloads anywhere in this repository. Would instructions like mine, where in step 2 I link to...
> the ones you linked are sadly mixed, and not "pure" lora models. so i would assume no. you could just say "pi3141 alpaca 30B" model, and it would be...