
About NVIDIA GPU use example

CrazyJson opened this issue 11 months ago · 7 comments

I have an RTX 4060 graphics card. How do I deploy a GPU version of a model with this project?

CrazyJson avatar Mar 18 '24 11:03 CrazyJson

[two screenshots attached]

CrazyJson avatar Mar 18 '24 11:03 CrazyJson

You need a gguf model file to use llama.cpp, not safetensors.

martindevans avatar Mar 18 '24 14:03 martindevans
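
To illustrate the comment above, here is a minimal sketch of loading a gguf file with LLamaSharp; the model path is a placeholder, and the API shown (`ModelParams`, `LLamaWeights.LoadFromFile`) matches recent LLamaSharp releases but may differ in older versions:

```csharp
using LLama;
using LLama.Common;

// Placeholder path: llama.cpp (and therefore LLamaSharp) loads
// quantized gguf files, not safetensors checkpoints.
var modelPath = @"C:\models\llama-2-7b.Q4_K_M.gguf";

var parameters = new ModelParams(modelPath);
using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
```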

[screenshot] Thanks, I understand that llama.cpp is used to load the quantized gguf model. One more question: which parameter in the sample code enables use of the local GPU, and how do I choose which local GPU to use?

CrazyJson avatar Mar 19 '24 01:03 CrazyJson

@CrazyJson, you need to install CUDA on your PC. If you have CUDA 11 installed, you should choose the CUDA 11 backend package; for CUDA 12, use the CUDA 12 backend package.

I'm not sure whether OpenCL supports Intel graphics cards.

ChengYen-Tang avatar Mar 19 '24 14:03 ChengYen-Tang
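
To answer the parameter question: a minimal sketch, assuming LLamaSharp's `ModelParams` exposes `GpuLayerCount` (how many layers to offload to the GPU) and `MainGpu` (which device to use) as in recent versions; the path and layer count are placeholders:

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams(@"C:\models\llama-2-7b.Q4_K_M.gguf")
{
    // Number of model layers to offload to the GPU.
    // A value larger than the model's layer count offloads everything;
    // 0 keeps inference entirely on the CPU/RAM.
    GpuLayerCount = 35,

    // Index of the GPU to use when several are present (0 = first device,
    // as reported by nvidia-smi). With a single RTX 4060 this stays 0.
    MainGpu = 0,
};

using var model = LLamaWeights.LoadFromFile(parameters);
```

Note that offloading only works when a matching CUDA backend package (and the CUDA runtime it expects) is installed; otherwise the model silently falls back to CPU and system RAM.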

I have the same problem. I downloaded and installed CUDA 12, but it doesn't use my GPU; it still uses RAM!

ZCOREP avatar Aug 10 '24 13:08 ZCOREP

Do you have the CUDA toolkit installed? You need that to supply the CUDA runtime for the backend packages.

martindevans avatar Aug 10 '24 15:08 martindevans

Yes, I did.

ZCOREP avatar Aug 10 '24 17:08 ZCOREP