GPU question
I'm curious to set up this model myself. I have two 3090s and 128 GB of RAM on an i9, all liquid cooled. Would the GPUs be of any use here, or are they only used for training models?
Unfortunately not. The current implementation works on CPU only. I am trying to make it work on GPU as well.
So far, the first few steps I can provide are:
1 - https://github.com/abetlen/llama-cpp-python - install it with cuBLAS enabled (PowerShell syntax shown):
$Env:CMAKE_ARGS="-DLLAMA_CUBLAS=on"; $Env:FORCE_CMAKE=1; pip3 install llama-cpp-python
This builds the package with CUDA support.
2 - https://github.com/hwchase17/langchain - install it from the latest commit rather than the PyPI release: it adds the GPU offloading support that was introduced to llama.cpp two days ago.
3 - Modify ingest.py and privateGPT.py by adding an n_gpu_layers=n argument to the LlamaCppEmbeddings call (see the sketch after this list). I can't test this myself because of the error in step 4.
4 - Deal with this error:
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
(That PR changed llama.cpp's quantized model format, so older ggml files have to be re-converted or re-quantized to the new format before they will load.)
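For reference, here is a minimal sketch of the step-3 change. It is untested: the model path and layer count are placeholders, and it assumes the langchain build from step 2 already exposes the n_gpu_layers parameter on both classes.

from langchain.embeddings import LlamaCppEmbeddings
from langchain.llms import LlamaCpp

# Placeholder path and layer count; adjust for your model file and VRAM.
MODEL_PATH = "models/ggml-model-q4_0.bin"
embeddings = LlamaCppEmbeddings(model_path=MODEL_PATH, n_gpu_layers=32)  # used by ingest.py
llm = LlamaCpp(model_path=MODEL_PATH, n_gpu_layers=32)  # used by privateGPT.py

If the cuBLAS build from step 1 is active, the llama.cpp startup log should report layers being offloaded to the GPU.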
Update: I have successfully run the model on my GPU. Planning to push commits.
@maozdemir please ping me at [email protected] when you push a new commit with GPU support. Thanks :)
I had been working on CUDA support yesterday with no luck. Glad to hear you had some success. Waiting for the commit, as well.
https://github.com/maozdemir/privateGPT-colab/blob/main/privateGPT-colab.ipynb
Set n_gpu_layers=500 for Colab in the LlamaCpp and LlamaCppEmbeddings functions; also, don't use GPT4All, as it won't run on the GPU.
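In code, that amounts to something like the following sketch (the model path is a placeholder; 500 is simply larger than any model's layer count, so llama.cpp offloads every layer it can):

from langchain.embeddings import LlamaCppEmbeddings
from langchain.llms import LlamaCpp

MODEL_PATH = "models/ggml-model-q4_0.bin"  # placeholder path
embeddings = LlamaCppEmbeddings(model_path=MODEL_PATH, n_gpu_layers=500)
# Swap the GPT4All LLM for LlamaCpp, since GPT4All won't run on the GPU:
llm = LlamaCpp(model_path=MODEL_PATH, n_gpu_layers=500)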