Langchain-Chatchat: How can this be changed to multi-GPU inference?

How can this be changed to run inference across multiple GPUs?

Open nameless0704 opened this issue 1 year ago • 5 comments

Apr 13 '23 08:04 nameless0704

Apr 14 '23 03:04 qianchen94

You could look at the issues in the chatglm-6b project that cover multi-GPU inference with accelerate.

On Fri, Apr 14, 2023 at 11:30, qianchen94 @.***> wrote:

+1


Apr 14 '23 03:04 imClumsyPanda

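I tried switching to single-machine multi-GPU inference. After adding device_map='auto' to AutoModel, it fails with an error saying the tensors are not on the same device. But the embeddings loaded through sentence-transformers and langchain's UnstructuredFileLoader both seem unable to use multiple GPUs (for now)... so does that mean everything simply can't end up on one device?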
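For reference, here is a minimal sketch of the explicit device-map approach, as an alternative to device_map='auto'. It assumes ChatGLM-6B has 28 transformer blocks named `transformer.layers.N`, plus `transformer.word_embeddings`, `transformer.final_layernorm`, and `lm_head`. Those module names and the helper `build_device_map` are assumptions for illustration, so check them against `model.named_modules()` before relying on this. The idea is to pin the embedding layer and the output head to GPU 0, so the input ids and the final logits sit on the same device, and to spread the blocks over the remaining capacity.

```python
from typing import Dict


def build_device_map(num_gpus: int, num_layers: int = 28) -> Dict[str, int]:
    """Assign module names to GPU indices (hypothetical helper).

    Assumption: the module names below match the loaded model.
    The embedding and the output head are counted as two extra "slots"
    on GPU 0, and the transformer blocks are filled in GPU by GPU.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    # Slots per GPU: the blocks plus the two slots reserved on GPU 0.
    per_gpu = (num_layers + 2) / num_gpus

    device_map = {
        "transformer.word_embeddings": 0,
        "transformer.final_layernorm": 0,
        "lm_head": 0,
    }

    used = 2  # slots already taken on GPU 0
    gpu = 0
    for i in range(num_layers):
        # Move to the next GPU once the current one is full.
        if used >= per_gpu and gpu < num_gpus - 1:
            gpu += 1
            used = 0
        device_map[f"transformer.layers.{i}"] = gpu
        used += 1
    return device_map
```

If the names match, the resulting dict could be passed to accelerate's `dispatch_model(model, device_map=...)` after loading the model on CPU, rather than passing device_map='auto' to `from_pretrained`. I have not verified that this removes the device mismatch error in this project.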

Apr 14 '23 06:04 nameless0704

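Hi, did you get multi-GPU inference working in the end? I have one machine with four 3080s. A single card isn't quite enough, so I'd like to run inference across all four.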