
GPU question

Open mmsquantum opened this issue 1 year ago • 4 comments

I'm curious to set up this model myself. I have two 3090s and 128 GB of RAM on an i9, all liquid cooled. Would the GPUs be relevant here, or are they only used for training models?

mmsquantum avatar May 16 '23 04:05 mmsquantum

Unfortunately not. The current implementation works on CPU only. I am trying to make it work on GPU as well.

So far, the first few steps I can provide are:

1. Install https://github.com/abetlen/llama-cpp-python with CUDA enabled: `$Env:CMAKE_ARGS="-DLLAMA_CUBLAS=on"; $Env:FORCE_CMAKE=1; pip3 install llama-cpp-python` (PowerShell syntax).
2. Build https://github.com/hwchase17/langchain from the latest commit: it adds the GPU implementation that was introduced to llama.cpp two days ago.
3. Modify ingest.py and privateGPT.py to pass an `n_gpu_layers=n` argument into the `LlamaCppEmbeddings` call. I can't test this yet because of step 4.
4. Deal with this error:

error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
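
Step 3 above can be sketched roughly as follows. This is only an illustration of where the `n_gpu_layers` argument goes, assuming the langchain/llama-cpp-python APIs from that period; the model path and layer count are hypothetical placeholders, not values from this thread.

```python
# Hedged sketch: pass n_gpu_layers through to llama.cpp so that many
# transformer layers are offloaded to the GPU. Tune the count to your
# VRAM; the path below is a hypothetical example.
N_GPU_LAYERS = 32

try:
    from langchain.embeddings import LlamaCppEmbeddings

    embeddings = LlamaCppEmbeddings(
        model_path="models/ggml-model-q4_0.bin",  # hypothetical path
        n_gpu_layers=N_GPU_LAYERS,                # the added argument
    )
except Exception:
    # langchain / llama-cpp-python (or the model file) is not available
    # here; the call above only shows the shape of the change.
    embeddings = None
```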

Update: I have successfully run the model on my GPU. Planning to drop commits.

maozdemir avatar May 16 '23 06:05 maozdemir

@maozdemir please ping me at [email protected] when you drop a new commit with GPU support. Thanks :)

zboinek avatar May 16 '23 10:05 zboinek

I had been working on CUDA support yesterday with no luck. Glad to hear you had some success. Waiting for the commit, as well.

eaugustine30 avatar May 16 '23 13:05 eaugustine30

https://github.com/maozdemir/privateGPT-colab/blob/main/privateGPT-colab.ipynb

Set `n_gpu_layers=500` in the `LlamaCpp` and `LlamaCppEmbeddings` calls for Colab. Also, don't use GPT4All; it won't run on the GPU.
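
For reference, the Colab setting above would look roughly like this. It is a sketch under the same assumptions as before (langchain-era API, hypothetical model path); 500 simply means "more layers than the model has", i.e. offload everything.

```python
# Hedged sketch of the Colab configuration: offload all layers to the
# GPU by requesting more layers than any model actually has. The model
# path is a hypothetical placeholder.
N_GPU_LAYERS = 500

try:
    from langchain.llms import LlamaCpp
    from langchain.embeddings import LlamaCppEmbeddings

    llm = LlamaCpp(
        model_path="models/ggml-model-q4_0.bin",
        n_gpu_layers=N_GPU_LAYERS,
    )
    embeddings = LlamaCppEmbeddings(
        model_path="models/ggml-model-q4_0.bin",
        n_gpu_layers=N_GPU_LAYERS,
    )
except Exception:
    # Dependencies or model file unavailable; the calls above only
    # illustrate where n_gpu_layers is passed.
    llm = embeddings = None
```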

maozdemir avatar May 16 '23 14:05 maozdemir