StephenDWright
> @StephenDWright I also tried to enable GPU acceleration, but it was not successful. Can you give me more details? Thank you. I followed the instructions in the thread. in...
> @StephenDWright I didn't encounter any errors, but the GPU just doesn't work, and I feel like my CUDA toolkit hasn't been fixed properly :( > What kind of GPU...
> @StephenDWright thanks for the follow-up. I was wondering what type of graphics card I would need to make this somehow usable. I don't think my GPU would handle...
You can get faster inference using GPU offloading. llama.cpp now supports splitting the work between CPU and GPU, and even LangChain mentions the option. That will significantly speed up inference.
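As a rough sketch of what that offloading looks like with the llama.cpp CLI (the model path, prompt, and layer count below are placeholders, and the binary must be built with GPU support, e.g. the CUDA/cuBLAS backend):

```shell
# -m   path to a local GGUF model (placeholder path)
# -p   prompt, -n number of tokens to generate
# -ngl / --n-gpu-layers: how many transformer layers to offload
#       to the GPU; the rest stay on the CPU. Raise this until
#       VRAM is full for the biggest speedup.
./main -m ./models/model.Q4_K_M.gguf -p "Hello" -n 128 -ngl 35
```

If you see no speedup, check the startup log: a GPU-enabled build prints the detected device and how many layers were actually offloaded, which is the quickest way to confirm the CUDA toolkit is being picked up.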