Langchain-Chatchat: How can I start the AI service API with multiple processes or threads?


Open Longleaves opened this issue 7 months ago • 47 comments

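After starting the service with python startup.py -a, I found that when several people query the AI model at the same time, the output stutters and is slow. The model is a 13B model. Hardware usage:
1. GPU memory usage is less than half (two 48 GB cards, each using under 20 GB).
2. CPU usage is only about two cores (220%). The machine has 16 cores; how can I make use of the others?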

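I would like to ask how to start the AI service API, which I believe is on port 7861, with multiple processes or threads. Should I rewrite the uvicorn.run(app, host=host, port=port, log_level=log_level.lower()) call in run_model_worker in startup.py? I changed the following arguments in startup.py:

    args.gpus = "0,1"  # GPU IDs; with more GPUs this can be set to "0,1,2,3"
    args.max_gpu_memory = "40GiB"
    args.num_gpus = 2  # the model worker is split with model parallelism; set this to the number of GPUs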
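
To make the question concrete, here is a minimal sketch of what a multi-process uvicorn.run call could look like. It is not taken from startup.py: the helper names worker_kwargs and serve, the import string "my_worker:app", and the host "0.0.0.0" are hypothetical placeholders. The one thing it relies on is that uvicorn only spawns extra worker processes when the app is passed as a "module:attribute" import string rather than as an app object.

```python
from typing import Any, Dict


def worker_kwargs(app_import: str, host: str, port: int,
                  workers: int, log_level: str = "INFO") -> Dict[str, Any]:
    """Build the arguments for a multi-process uvicorn.run call."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    if ":" not in app_import:
        # uvicorn needs "module:attribute" so each worker process can
        # import the app itself; an app object cannot be shared this way.
        raise ValueError("app_import must look like 'module:app'")
    return {
        "app": app_import,
        "host": host,
        "port": port,
        "workers": workers,
        "log_level": log_level.lower(),
    }


def serve() -> None:
    """Start the server; call this explicitly, it does not run on import."""
    import uvicorn  # third-party dependency, imported lazily

    # "my_worker:app" and "0.0.0.0" are hypothetical; 7861 is the port
    # mentioned in the question.
    kwargs = worker_kwargs("my_worker:app", "0.0.0.0", 7861, workers=4)
    uvicorn.run(kwargs.pop("app"), **kwargs)
```

Each uvicorn worker is a separate process, so anything loaded at import time, including a model, would be loaded once per worker. Whether that fits in GPU memory, and whether it would actually relieve the slowdown described above, is something I have not verified.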

I am still new to this and would sincerely appreciate any guidance. Thank you!

Jul 18 '24 07:07 Longleaves
