ragflow icon indicating copy to clipboard operation
ragflow copied to clipboard

[Bug]: If both gpu and task_executor are enabled at the same time, NCCL Error is generated

Open iot-future opened this issue 10 months ago • 0 comments

Is there an existing issue for the same bug?

  • [x] I have checked the existing issues.

RAGFlow workspace code commit ID

4460

RAGFlow image version

no

Other environment information


Actual behavior

When I start the project with ‘docker-compose-gpu. yml’, I also use ’python rag/svr/task_executor.py 1,2,3.... ’. After multiple task actuators are enabled, some task parsing error occurs. The error message is

19:00:41 Page(1~100000001): [ERROR]Fail to bind embedding model: NCCL Error 2: unhandled system error (run with NCCL_DEBUG=INFO for details)
19:00:41 [ERROR][Exception]: NCCL Error 2: unhandled system error (run with NCCL_DEBUG=INFO for details)

Expected behavior

No response

Steps to reproduce

sudo docker compose -f docker-compose-gpu.yml up -d
python rag/svr/task_executor.py 1,2,3....

Additional information

No response

iot-future avatar Mar 09 '25 12:03 iot-future