api-for-open-llm: With TASKS=llm,rag, a multiprocessing error is raised: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Open syusama opened this issue 6 months ago • 0 comments

[X] Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
[X] I have read the project documentation and FAQ and searched the existing issues / discussions, and found no similar problem or solution.

Model inference and deployment

Linux

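Ubuntu, deploying the api-llm:vllm image with docker-compose.

When the LLM and the embedding model are deployed together with

TASKS=llm,rag

the following error is raised: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Deploying the LLM alone works without problems.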
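For context on what the error message is asking for: on Linux, Python's `multiprocessing` defaults to the `fork` start method, and a forked child cannot re-initialize CUDA once the parent has already touched it. The snippet below is a minimal sketch of requesting `spawn` explicitly using only the standard library. It is not the project's code or a confirmed fix; `spawn_context`, `square`, and `run_with_spawn` are hypothetical names, and `square` merely stands in for GPU work.

```python
import multiprocessing as mp


def spawn_context():
    """Return a multiprocessing context that starts workers with 'spawn'."""
    # get_context() avoids changing the global start method, so other
    # libraries in the same process are not affected.
    return mp.get_context("spawn")


def square(x: int) -> int:
    # Hypothetical stand-in for work that would initialize CUDA in the child.
    return x * x


def run_with_spawn(values):
    # Each worker is a fresh interpreter, so it does not inherit the
    # parent's CUDA state the way a forked child would.
    ctx = spawn_context()
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard is required with 'spawn': children re-import this module
    # and must not start new workers while doing so.
    print(run_with_spawn([1, 2, 3]))
```

In a deployment like the one described here, the process that forks lives inside the container, so a change of this kind would have to be made where the server starts its workers. vLLM also exposes a `VLLM_WORKER_MULTIPROC_METHOD` environment variable; whether setting it to `spawn` resolves this particular case with the api-llm:vllm image is an assumption that has not been verified here.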

[Screenshot: 微信截图_20240821105553]

Aug 21 '24 02:08 syusama