ragflow
ragflow copied to clipboard
[Bug]: If both gpu and task_executor are enabled at the same time, NCCL Error is generated
Is there an existing issue for the same bug?
- [x] I have checked the existing issues.
RAGFlow workspace code commit ID
4460
RAGFlow image version
no
Other environment information
Actual behavior
When I start the project with ‘docker-compose-gpu. yml’, I also use ’python rag/svr/task_executor.py 1,2,3.... ’. After multiple task actuators are enabled, some task parsing error occurs. The error message is
19:00:41 Page(1~100000001): [ERROR]Fail to bind embedding model: NCCL Error 2: unhandled system error (run with NCCL_DEBUG=INFO for details)
19:00:41 [ERROR][Exception]: NCCL Error 2: unhandled system error (run with NCCL_DEBUG=INFO for details)
Expected behavior
No response
Steps to reproduce
sudo docker compose -f docker-compose-gpu.yml up -d
python rag/svr/task_executor.py 1,2,3....
Additional information
No response