
dataset parsing task has been in queued for 600s+

Open jiangsanyin opened this issue 11 months ago • 14 comments

Describe your problem

I deployed ragflow 0.15.1 using ragflow/docker/docker-compose.yml on a physical machine running Ubuntu 20.04 LTS with 2 NVIDIA A40 GPUs.

  1. I created a file parsing task with the file "食品安全法.pdf", and it worked well.
  2. Then I redeployed ragflow 0.15.1 directly by running "docker-compose -f docker-compose-gpu.yml up -d", and all Docker containers are fine: [Image]
  3. Now I try to create a file parsing task with the file "乳腺癌相关说明(作为知识库).pdf", but this task stays queued indefinitely. [Image]

How do I look into the logs and locate this problem?

jiangsanyin avatar Feb 06 '25 08:02 jiangsanyin

You can check logs of docker container to see if there's anything weird. Example:

docker logs -f ragflow-server

EBazarov avatar Feb 06 '25 12:02 EBazarov
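
Following EBazarov's suggestion above, a minimal sketch for narrowing the container logs down to lines that look like failures. The function name `filter_errors` and the keyword list are assumptions for illustration, not part of ragflow or Docker; only `docker logs` and its `--tail` flag are real CLI features.

```shell
# Hypothetical helper: keep only log lines that look like failures.
# Reads from stdin so it can be fed from any log source.
filter_errors() {
  grep -iE 'error|exception|traceback|fail'
}

# Usage (assumes a running container named ragflow-server):
#   docker logs --tail 500 ragflow-server 2>&1 | filter_errors
```

The compose file quoted later in this thread mounts ./ragflow-logs into the container, so the same filter could be pointed at the files in that directory if the container output shows nothing unusual.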

Same problem here. Has it been solved?

shinegob avatar Feb 08 '25 05:02 shinegob

Thanks for your reply. I checked it but found nothing abnormal

jiangsanyin avatar Feb 08 '25 06:02 jiangsanyin

I changed the default setting from using all GPUs to using a single GPU, and the problem went away.

jiangsanyin avatar Feb 08 '25 06:02 jiangsanyin

Where do I change this? I ran into the same problem as you: tasks are always stuck in the queue.

yyx002 avatar Feb 10 '25 07:02 yyx002

Modify the second-to-last line of the docker-compose-gpu.yml file:

root@controller01:/opt/code_repos/ragflow/docker# cat docker-compose-gpu.yml
...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

jiangsanyin avatar Feb 10 '25 07:02 jiangsanyin
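
The change jiangsanyin describes above can be sketched as a small shell helper. This is a hypothetical illustration, not something posted by the thread participants: the function name `set_gpu_count` is invented here, and it assumes the file contains a line of the form `count: all` with nothing after it.

```shell
# Hypothetical helper: rewrite "count: all" to "count: <n>" in a Compose file.
# A copy of the original is kept next to it as <file>.bak.
set_gpu_count() {
  file="$1"
  count="$2"
  cp "$file" "$file.bak" || return 1
  # Preserve the original indentation; only the value after "count:" changes.
  sed "s/^\([[:space:]]*count:[[:space:]]*\)all[[:space:]]*$/\1${count}/" \
    "$file.bak" > "$file"
}
```

For example, `set_gpu_count docker-compose-gpu.yml 1` followed by `docker-compose -f docker-compose-gpu.yml up -d` would reproduce the workaround reported in this thread. The thread does not establish why reserving all GPUs leaves tasks queued, so this is a workaround rather than a root-cause fix.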

Thank you so much, mine works now too. Do you know why this happens? I would appreciate an explanation.

yyx002 avatar Feb 10 '25 07:02 yyx002

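Thanks. I tried your method, and for now it does seem that tasks are no longer all stuck in the queue. I have a new problem, though: after enabling RAPTOR, apart from a few docx documents that parse successfully, the vast majority fail with the error "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()". Have you run into this as well?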

shinegob avatar Feb 10 '25 08:02 shinegob

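Thanks. I tried your method, and for now it does seem that tasks are no longer all stuck in the queue. I have a new problem, though: after enabling RAPTOR, apart from a few docx documents that parse successfully, the vast majority fail with the error "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()". Have you run into this as well?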

Quoted from jiangsanyin's reply (2025-02-10 15:13):

@.***:/opt/code_repos/ragflow/docker# cat docker-compose-gpu.yml
# However, you are welcome to file a pull request to improve it.
include:
  - ./docker-compose-base.yml

services:
  ragflow:
    depends_on:
      mysql:
        condition: service_healthy
    image: ${RAGFLOW_IMAGE}
    container_name: ragflow-server
    ports:
      - ${SVR_HTTP_PORT}:9380
      - 80:80
      - 444:443
    volumes:
      - ./ragflow-logs:/ragflow/logs
      - ./nginx/ragflow.conf:/etc/nginx/conf.d/ragflow.conf
      - ./nginx/proxy.conf:/etc/nginx/proxy.conf
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
    env_file: .env
    environment:
      - TZ=${TIMEZONE}
      - HF_ENDPOINT=${HF_ENDPOINT}
      - MACOS=${MACOS}
    networks:
      - ragflow
    restart: on-failure
    # https://docs.docker.com/engine/daemon/prometheus/#create-a-prometheus-configuration
    # If you're using Docker Desktop, the --add-host flag is optional. This flag makes sure that the host's internal IP gets exposed to the Prometheus container.
    extra_hosts:
      - "host.docker.internal:host-gateway"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1 ## changed from all to 1
              capabilities: [gpu]

shinegob avatar Feb 10 '25 08:02 shinegob

The web demo version is also always queued and shows the following: Task is queued...

wyqmath avatar Feb 10 '25 18:02 wyqmath

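I started the containers with docker-compose-gpu.yml. After startup, monitoring with nvidia-smi shows that the ragflow application does not seem to be using the GPU. [Image]

My docker-compose-gpu.yml configuration is as follows: [Image]

Is there a problem with how I enabled the GPU, is the image not a GPU build, or is it something else?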

lizhao-8202 avatar Feb 12 '25 13:02 lizhao-8202

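Parsing is currently very slow: even a single very small document takes 50 to 60 seconds. MEM_LIMIT in the .env configuration file is already set to 64G, and the machine has a 32-core CPU, so it should not be this slow.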

lizhao-8202 avatar Feb 12 '25 13:02 lizhao-8202

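I just tried this too. While a file was being parsed, I saw no process using the NVIDIA GPU on the host, and running "nvidia-smi" inside the ragflow-server container fails with "Failed to initialize NVML: Unknown Error".

It looks like ragflow 0.15.1 currently has a problem when using NVIDIA GPUs.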

jiangsanyin avatar Feb 13 '25 01:02 jiangsanyin
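
The check jiangsanyin describes above, running nvidia-smi inside the container, can be sketched as a reusable function. The name `container_sees_gpu` is hypothetical; the sketch assumes the container image ships `nvidia-smi` and relies on `nvidia-smi -L`, which prints one "GPU N: ..." line per visible device.

```shell
# Hypothetical helper: report whether a container can enumerate any NVIDIA GPU.
# Defaults to the container name used in this thread.
container_sees_gpu() {
  name="${1:-ragflow-server}"
  # An NVML failure prints an error message instead of "GPU N: ..." lines.
  if docker exec "$name" nvidia-smi -L 2>&1 | grep -q '^GPU [0-9]'; then
    echo "GPU visible in $name"
  else
    echo "no GPU visible in $name"
    return 1
  fi
}
```

Calling `container_sees_gpu ragflow-server` on a setup that shows "Failed to initialize NVML: Unknown Error" would take the second branch. It only confirms the symptom and does not identify the cause.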

Thanks for sharing the solution!!❤

Cedar0607 avatar Mar 10 '25 13:03 Cedar0607