llama-cpp-python
llama-server not using GPU
After installing llama-cpp-python-server with CUDA support, I run:
python3 -m llama_cpp.server --model starcoderbase-3b/starcoderbase-3b.Q4_K_M.gguf --n_gpu_layers 10
The GPU is not being used; the model runs entirely on the CPU.
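In case it helps narrow this down, here is a rough diagnostic sketch, not a definitive check. It assumes an NVIDIA setup where `nvidia-smi` would be on the PATH, and it assumes the installed llama_cpp build exposes `llama_supports_gpu_offload` (older versions may not, so the lookup is guarded). The function name `gpu_diagnostics` is made up for illustration.

```python
import importlib.util
import shutil


def gpu_diagnostics() -> dict:
    """Collect a few hints about why GPU offload might not be happening."""
    report = {
        # True if the NVIDIA driver tools are visible on PATH.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # True if the llama_cpp package can be located at all.
        "llama_cpp_installed": importlib.util.find_spec("llama_cpp") is not None,
        # None means "could not determine".
        "gpu_offload_supported": None,
    }
    if report["llama_cpp_installed"]:
        try:
            import llama_cpp

            # Assumption: recent builds expose this helper; guard in case they do not.
            check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
            if callable(check):
                report["gpu_offload_supported"] = bool(check())
        except Exception:
            # Importing can fail if the shared library is broken; leave None.
            pass
    return report


if __name__ == "__main__":
    for key, value in gpu_diagnostics().items():
        print(f"{key}: {value}")
```

If `gpu_offload_supported` comes back `False`, that would suggest the package was built as a CPU-only wheel, in which case `--n_gpu_layers` would have no effect.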