
GPU question

Open mmsquantum opened this issue 1 year ago • 4 comments

I'm curious to set up this model myself. I have two 3090s and 128 GB of RAM on an i9, all liquid cooled. Would the GPUs be relevant here, or are they only used for training models?

mmsquantum avatar May 16 '23 04:05 mmsquantum

Unfortunately not. The current implementation works on CPU only. I am trying to make it work on GPU as well.

So far, the first few steps I can provide are:

1. Install https://github.com/abetlen/llama-cpp-python with CUDA enabled: `$Env:CMAKE_ARGS="-DLLAMA_CUBLAS=on"; $Env:FORCE_CMAKE=1; pip3 install llama-cpp-python` (PowerShell syntax).
2. Build https://github.com/hwchase17/langchain from the latest commit: it adds the GPU implementation that was introduced to llama.cpp two days ago.
3. Modify ingest.py and privateGPT.py to pass an `n_gpu_layers=n` argument into the `LlamaCppEmbeddings` call. I can't test this yet because of step 4.
4. Deal with this error:

error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
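
Step 3 above can be sketched roughly as follows. This is only an illustration of where the `n_gpu_layers` argument goes, assuming the langchain/llama-cpp-python APIs from that period; the model path and layer count are hypothetical placeholders, not values from this thread.

```python
# Hedged sketch: pass n_gpu_layers through to llama.cpp so that many
# transformer layers are offloaded to the GPU. Tune the count to your
# VRAM; the path below is a hypothetical example.
N_GPU_LAYERS = 32

try:
    from langchain.embeddings import LlamaCppEmbeddings

    embeddings = LlamaCppEmbeddings(
        model_path="models/ggml-model-q4_0.bin",  # hypothetical path
        n_gpu_layers=N_GPU_LAYERS,                # the added argument
    )
except Exception:
    # langchain / llama-cpp-python (or the model file) is not available
    # here; the call above only shows the shape of the change.
    embeddings = None
```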

Update: I have successfully run the model on my GPU. Planning to drop commits.

maozdemir avatar May 16 '23 06:05 maozdemir

@maozdemir please ping me at [email protected] when you drop a new commit with GPU support. Thanks :)

zboinek avatar May 16 '23 10:05 zboinek

I had been working on CUDA support yesterday with no luck. Glad to hear you had some success. Waiting for the commit, as well.

eaugustine30 avatar May 16 '23 13:05 eaugustine30

https://github.com/maozdemir/privateGPT-colab/blob/main/privateGPT-colab.ipynb

Set `n_gpu_layers=500` in the `LlamaCpp` and `LlamaCppEmbeddings` calls for Colab. Also, don't use GPT4All; it won't run on the GPU.
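
For reference, the Colab setting above would look roughly like this. It is a sketch under the same assumptions as before (langchain-era API, hypothetical model path); 500 simply means "more layers than the model has", i.e. offload everything.

```python
# Hedged sketch of the Colab configuration: offload all layers to the
# GPU by requesting more layers than any model actually has. The model
# path is a hypothetical placeholder.
N_GPU_LAYERS = 500

try:
    from langchain.llms import LlamaCpp
    from langchain.embeddings import LlamaCppEmbeddings

    llm = LlamaCpp(
        model_path="models/ggml-model-q4_0.bin",
        n_gpu_layers=N_GPU_LAYERS,
    )
    embeddings = LlamaCppEmbeddings(
        model_path="models/ggml-model-q4_0.bin",
        n_gpu_layers=N_GPU_LAYERS,
    )
except Exception:
    # Dependencies or model file unavailable; the calls above only
    # illustrate where n_gpu_layers is passed.
    llm = embeddings = None
```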

maozdemir avatar May 16 '23 14:05 maozdemir