
About NVIDIA GPU use example

CrazyJson opened this issue 11 months ago · 7 comments

I have an RTX 4060 graphics card. How do I deploy a GPU version of a model with this project?

CrazyJson avatar Mar 18 '24 11:03 CrazyJson

[two screenshots attached]

CrazyJson avatar Mar 18 '24 11:03 CrazyJson

You need a gguf model file to use llama.cpp, not safetensors.

martindevans avatar Mar 18 '24 14:03 martindevans
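
To illustrate the comment above, here is a minimal sketch of loading a gguf file with LLamaSharp; the model path is a placeholder, and the API shown (`ModelParams`, `LLamaWeights.LoadFromFile`) matches recent LLamaSharp releases but may differ in older versions:

```csharp
using LLama;
using LLama.Common;

// Placeholder path: llama.cpp (and therefore LLamaSharp) loads
// quantized gguf files, not safetensors checkpoints.
var modelPath = @"C:\models\llama-2-7b.Q4_K_M.gguf";

var parameters = new ModelParams(modelPath);
using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
```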

[screenshot] Thanks, I understand that llama.cpp is used to load the quantized gguf model. One more question: which parameter in the sample code enables use of the local GPU, and how do I choose which local GPU to use?

CrazyJson avatar Mar 19 '24 01:03 CrazyJson

@CrazyJson, you need to install CUDA on your PC. If you have CUDA 11 installed, you should choose the CUDA 11 backend package; for CUDA 12, use the CUDA 12 backend package.

I'm not sure whether OpenCL supports Intel graphics cards.

ChengYen-Tang avatar Mar 19 '24 14:03 ChengYen-Tang
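
To answer the parameter question: a minimal sketch, assuming LLamaSharp's `ModelParams` exposes `GpuLayerCount` (how many layers to offload to the GPU) and `MainGpu` (which device to use) as in recent versions; the path and layer count are placeholders:

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams(@"C:\models\llama-2-7b.Q4_K_M.gguf")
{
    // Number of model layers to offload to the GPU.
    // A value larger than the model's layer count offloads everything;
    // 0 keeps inference entirely on the CPU/RAM.
    GpuLayerCount = 35,

    // Index of the GPU to use when several are present (0 = first device,
    // as reported by nvidia-smi). With a single RTX 4060 this stays 0.
    MainGpu = 0,
};

using var model = LLamaWeights.LoadFromFile(parameters);
```

Note that offloading only works when a matching CUDA backend package (and the CUDA runtime it expects) is installed; otherwise the model silently falls back to CPU and system RAM.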

I have the same problem. I downloaded and installed CUDA 12, but it doesn't use my GPU; it still uses RAM!

ZCOREP avatar Aug 10 '24 13:08 ZCOREP

Do you have the CUDA toolkit installed? You need that to supply the CUDA runtime for the backend packages.

martindevans avatar Aug 10 '24 15:08 martindevans

Yes, I did.

ZCOREP avatar Aug 10 '24 17:08 ZCOREP