localllm
Option to enable the GPU
Hi All,
First of all, thank you for this excellent tool which makes it very easy to run LLM models without any hassle.
I am aware that the main purpose of localllm is to eliminate the dependency on GPUs and run the models using the CPU. However, I wanted to know if there is an option to offload the layers to the GPU.
Machine : Compute engine created in GCP
OS : Ubuntu 22.04 LTS
GPU : Tesla T4
The steps I have followed thus far are given below:
- Installed the NVIDIA driver in the compute engine; the `nvidia-smi` output is as given below.
- Assuming localllm does not directly provide an option to enable the GPU (I may be wrong here), I cloned the llama-cpp-python repository and updated `n_gpu_layers` to 4 in `llama_cpp/server/settings.py` (see the sketch after this list for what I am aiming at).
- Built the package by running `pip install -e .`; the complete step is given here.
- Killed localllm and started it again.
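For context, what I am ultimately trying to achieve is roughly equivalent to the sketch below, written against the regular llama-cpp-python API rather than the exact code path localllm uses. The model path is just a placeholder, and the CUDA build flag depends on the llama-cpp-python version (older releases use `CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install -e .`, newer ones `-DGGML_CUDA=on`), so please treat this only as an illustration of the intent:

```python
from llama_cpp import Llama

# Load a GGUF model and ask llama.cpp to offload 4 transformer layers to the GPU.
# This only takes effect if the installed llama-cpp-python wheel was compiled with
# a GPU backend; a CPU-only build accepts n_gpu_layers but silently ignores it.
llm = Llama(
    model_path="/path/to/model.gguf",  # placeholder path
    n_gpu_layers=4,                    # -1 would offload all layers
    verbose=True,                      # startup log shows which backend is active
)

print(llm("Q: What is 2 + 2? A:", max_tokens=8)["choices"][0]["text"])
```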
However, I still see that the GPU is not being utilized.
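For what it is worth, one thing I can also check is whether the wheel I built has a GPU backend compiled in at all. A minimal check, assuming a recent llama-cpp-python that re-exports llama.cpp's `llama_supports_gpu_offload` binding (the exact name may differ between versions):

```python
import llama_cpp

# Prints False if the installed wheel was built CPU-only, in which case
# changing n_gpu_layers in the server settings would have no effect.
print("llama-cpp-python version:", llama_cpp.__version__)
print("GPU offload supported:", llama_cpp.llama_supports_gpu_offload())
```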
Are the above steps correct or did I miss anything here?
Thank you,
KK
Thanks for the question @kk2491, I'll take a look and see what I can find!