
Any way to get the GPU to work?

Open rexzhang2023 opened this issue 1 year ago • 9 comments

Can anyone suggest how to make the GPU work with this project?

rexzhang2023 avatar May 12 '23 00:05 rexzhang2023

Chances are, it's already partially using the GPU. As it stands, it's a script linking together LLaMa.cpp embeddings, a Chroma vector DB, and GPT4All. GPT4All might be using PyTorch with GPU support, Chroma is probably already heavily CPU-parallelized, and LLaMa.cpp runs only on the CPU.

It's also worth noting that two LLMs are used with different inference implementations, meaning you may have to load the model twice.

walking-octopus avatar May 12 '23 08:05 walking-octopus

I watched my GPU usage and it was not touched.

mmike87 avatar May 12 '23 13:05 mmike87

Does this mean it works only with the CPU?

I currently want to try this.

Also, could you add some info to the README about the hardware requirements?

pabl-o-ce avatar May 13 '23 02:05 pabl-o-ce

No, LlamaCpp was designed to use only CPU resources. For GPU inference you'd have to use the native LLaMA model from Facebook.

su77ungr avatar May 14 '23 06:05 su77ungr

I can get it to work on Ubuntu 22.04 by installing llama-cpp-python with cuBLAS:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.48

If the installation fails because it doesn't find CUDA, it's probably because you need to add the CUDA install path to the PATH environment variable:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}

Anyway, it only uses less than 1 GB of VRAM on an RTX 2060 with 6 GB, so I don't know if something is still missing.
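If VRAM usage stays that low, the model layers may not actually be offloaded. Here is a minimal sketch of what to try, assuming you are calling llama-cpp-python directly; the model path and layer count are placeholders, and n_gpu_layers needs a build recent enough to support layer offloading:

```python
from llama_cpp import Llama

# With n_gpu_layers=0 (the default), a cuBLAS build still uses the GPU for
# prompt-processing matmuls but keeps the weights in system RAM, which would
# explain VRAM usage staying under 1 GB.
llm = Llama(
    model_path="./models/ggml-model-q4_0.bin",  # placeholder path
    n_gpu_layers=32,  # tune to your VRAM; 0 keeps all weights on the CPU
)

out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(out["choices"][0]["text"])
```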

iker-lluvia avatar May 14 '23 15:05 iker-lluvia

Aren't you just emulating the CPU? I don't know if there's even a working port with GPU support.

su77ungr avatar May 14 '23 20:05 su77ungr

Aren't you just emulating the CPU? I don't know if there's even a working port with GPU support.

It shouldn't be. The llama.cpp library can perform BLAS acceleration on the CUDA cores of an Nvidia GPU through cuBLAS, and I expect llama-cpp-python to do the same when installed with cuBLAS. Is there any fast way to verify that the GPU is being used, other than running nvidia-smi or nvtop?
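One quick check that doesn't need nvidia-smi, sketched here on the assumption that llama.cpp's startup log is visible: the library prints a system_info line when a model loads, and a cuBLAS build reports BLAS = 1 there.

```python
from llama_cpp import Llama

# Loading a model makes llama.cpp print a system_info line to stderr.
# Look for "BLAS = 1" in it; a CPU-only build prints "BLAS = 0".
llm = Llama(model_path="./models/ggml-model-q4_0.bin")  # placeholder path
```

Note this only confirms the build is BLAS-enabled; actual GPU utilization during inference still shows up most reliably in nvidia-smi or nvtop.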

iker-lluvia avatar May 15 '23 07:05 iker-lluvia

Never mind, my collaborator found a way; see the linked comment.

su77ungr avatar May 15 '23 09:05 su77ungr

If anyone still can't figure this out, I explained in detail how I got it to work in issue #217.

shondle avatar May 22 '23 14:05 shondle