chatglm.cpp

n_gpu_layers parameter?

Open endink opened this issue 1 year ago • 4 comments

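Is there no support for an n_gpu_layers parameter to control how many layers are loaded onto the GPU? In multi-instance deployments where inference speed is not critical, loading even 4 to 5 fewer layers per instance would save a lot of GPU memory.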

endink avatar Aug 17 '23 14:08 endink

In my case, using n-gpu-layers instead of n_gpu_layers is what got https://github.com/oobabooga/text-generation-webui to start, so maybe this helps. I'm running a 70B model with 4-bit quantization.

Tokix avatar Aug 22 '23 21:08 Tokix

@Tokix Thanks, but C++ is important to me 😄

endink avatar Aug 23 '23 04:08 endink

There is definitely a need for this. The 3060 in my laptop is just slightly short of VRAM, so it cannot run the q4_0 chatglm2-6B.

CHNtentes avatar Aug 25 '23 06:08 CHNtentes

This is strongly needed.

wdjwxh avatar Nov 01 '23 08:11 wdjwxh