chatglm.cpp
n_gpu_layers parameter?
Is there no support for an n_gpu_layers parameter to control how many layers are loaded onto the GPU? In a multi-instance setup where inference speed is not critical, loading even 4-5 fewer layers per instance would free up a lot of GPU memory.
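For context, here is a minimal sketch of the kind of control being asked for, similar in spirit to the n_gpu_layers option in llama.cpp: only the first N transformer layers are placed on the GPU and the remaining layers stay in host memory. All names below (ModelConfig, load_layer_to_gpu, load_layer_to_cpu) are hypothetical and are not part of the current chatglm.cpp API.

```cpp
// Illustrative sketch only; not the actual chatglm.cpp implementation.
#include <cstdio>

struct ModelConfig {
    int num_hidden_layers = 28; // ChatGLM2-6B has 28 transformer layers
    int n_gpu_layers      = 28; // hypothetical knob: how many layers to keep on the GPU
};

// Placeholder loaders: a real implementation would allocate the layer weights
// in VRAM or in host RAM respectively.
static void load_layer_to_gpu(int i) { std::printf("layer %2d -> GPU\n", i); }
static void load_layer_to_cpu(int i) { std::printf("layer %2d -> CPU\n", i); }

// Offload only the first n_gpu_layers transformer layers; the rest stay in
// host memory and run on the CPU, trading some speed for VRAM.
static void load_model(const ModelConfig &cfg) {
    for (int i = 0; i < cfg.num_hidden_layers; i++) {
        if (i < cfg.n_gpu_layers) {
            load_layer_to_gpu(i);
        } else {
            load_layer_to_cpu(i);
        }
    }
}

int main() {
    ModelConfig cfg;
    cfg.n_gpu_layers = cfg.num_hidden_layers - 4; // e.g. keep 4 layers on the CPU per instance
    load_model(cfg);
    return 0;
}
```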
In my case the option was spelled n-gpu-layers rather than n_gpu_layers, which is what let me start https://github.com/oobabooga/text-generation-webui; maybe this helps. I'm running the 70B 4-bit quantization.
@Tokix Thanks, but I need this in C++ 😄
I have this need as well: my laptop's 3060 is just slightly short on VRAM and cannot run chatglm2-6B at q4_0.
Strong demand for this feature.