ragflow
[Question]: Parsing time is so long
Describe your problem
Thanks for your work. I have deployed the ragflow system on my own server.
However, when I upload a PDF file (2 pages), it takes a long time to parse (more than 300 seconds).
Log for file 1
Process started at:
Tue, 16 Apr 2024 13:52:45 GMT
Process duration:
385.359
Progress messages:
Page(1~2): OCR is running...
Page(1~2): OCR finished
Page(1~2): Layout analysis finished.
Page(1~2): Table analysis finished.
Page(1~2): Text merging finished
Page(1~2): Finished slicing files(3). Start to embedding the content.
Page(1~2): Finished embedding! Start to build index!
Page(1~2): Done!
Log for file 2
Process started at:
Tue, 16 Apr 2024 14:08:13 GMT
Process duration:
771.436
Progress messages:
Page(1~2): OCR is running...
Page(1~2): OCR finished
Page(1~2): Layout analysis finished.
Page(1~2): Table analysis finished.
Page(1~2): Text merging finished
Page(1~2): Finished slicing files(3). Start to embedding the content.
Page(1~2): Finished embedding! Start to build index!
Page(1~2): Done!
You can try using GPU resources for parsing. With the default Docker deployment, the GPU is not used. If you check "docker/docker-compose.yml" and "docker/docker-compose-cn.yml", you can see that there is no GPU-related configuration for container creation.
You just need to stop and remove the containers that have already been started, add the following configuration to these two files, and run Docker Compose again. Parsing should then be much faster with the GPU in use.
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: ['0']
          capabilities: [gpu]
For example, the complete docker-compose.yml looks like this:
version: '2.2'
include:
  - path: ./docker-compose-base.yml
    env_file: ./.env
services:
  ragflow:
    depends_on:
      mysql:
        condition: service_healthy
      es01:
        condition: service_healthy
    image: infiniflow/ragflow:v1.0
    container_name: ragflow-server
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]
    ports:
      - ${SVR_HTTP_PORT}:9380
      - 80:80
      - 443:443
    volumes:
      - ./service_conf.yaml:/ragflow/conf/service_conf.yaml
      - ./entrypoint.sh:/ragflow/entrypoint.sh
      - ./ragflow-logs:/ragflow/logs
      - ./nginx/ragflow.conf:/etc/nginx/conf.d/ragflow.conf
      - ./nginx/proxy.conf:/etc/nginx/proxy.conf
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
    environment:
      - TZ=${TIMEZONE}
    networks:
      - ragflow
    restart: always
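The stop, remove, and re-run steps described above could look roughly like the sketch below. It assumes the Docker Compose v2 plugin (`docker compose`), a compose file at `docker/docker-compose.yml`, and the container name `ragflow-server` from the file above. The helper name `recreate_ragflow` is made up for illustration, and the final `nvidia-smi` check only works if the NVIDIA Container Toolkit is installed on the host.

```shell
#!/bin/sh
# Recreate the ragflow containers after editing the compose file, then
# check whether the GPU is visible inside the ragflow-server container.
# recreate_ragflow is a hypothetical helper name, not part of ragflow.
recreate_ragflow() {
    compose_file="${1:-docker/docker-compose.yml}"

    # Stop and remove the containers created from this compose file.
    docker compose -f "$compose_file" down || return 1

    # Recreate them in the background with the new GPU reservation.
    docker compose -f "$compose_file" up -d || return 1

    # If the reservation took effect, nvidia-smi should list the GPU here.
    docker exec ragflow-server nvidia-smi
}
```

Call it as `recreate_ragflow docker/docker-compose.yml`, and repeat with `docker/docker-compose-cn.yml` if that is the file you deploy from. If the last command fails, the container most likely did not receive the GPU.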
Yes, it took a long time. Worth it.
I have tried adding the GPU configuration, but it doesn't work. Is a specific version of CUDA or the NVIDIA driver required?
The CUDA version I am using:
NVIDIA-SMI 495.29.05 Driver Version: 495.29.05 CUDA Version: 11.5
These are the versions I am using, and I have verified that GPU parsing works with them.
Docker Compose: v2.21.0, Nvidia Driver: 524.147.05, CUDA Version: 12.0
You can check your Docker Compose version, as the GPU configuration syntax may differ between Docker Compose versions.
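To compare your setup with the versions listed above, something like the following sketch prints the same three values. It assumes the Docker Compose v2 plugin and an `nvidia-smi` binary on the host. The helper name `show_gpu_stack_versions` is made up for illustration.

```shell
#!/bin/sh
# Print the Docker Compose, driver, and CUDA versions compared in this thread.
# show_gpu_stack_versions is a hypothetical helper name.
show_gpu_stack_versions() {
    # Version of the Docker Compose v2 plugin.
    echo "Docker Compose: $(docker compose version --short)"

    # Driver version as reported for the first GPU on the host.
    echo "Nvidia Driver: $(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"

    # "CUDA Version" from the nvidia-smi header, i.e. the highest CUDA
    # version the installed driver supports.
    echo "CUDA Version: $(nvidia-smi | sed -n 's/.*CUDA Version: *\([0-9.]*\).*/\1/p' | head -n 1)"
}
```

Run `show_gpu_stack_versions` on the host and paste the three lines into your reply so others can compare.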
Working perfectly on driver 525.125.06, CUDA 12.0, ragflow v2.0 6.0. Build tested on 2024-05-22 16:30 GMT-3 (America/Sao_Paulo, BR).
You need CUDA 12, and I believe the minimum hardware for CUDA 12 is the GTX 980 generation.
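A quick way to test a driver against that suggestion is to read the CUDA major version out of the `nvidia-smi` header. The sketch below takes the CUDA 12 minimum from the reply above as an assumption; nothing in this thread confirms it as an official requirement. The helper names `cuda_major` and `cuda_at_least` are made up for illustration.

```shell
#!/bin/sh
# Extract the CUDA major version from one line of nvidia-smi header text.
# cuda_major and cuda_at_least are hypothetical helper names.
cuda_major() {
    printf '%s\n' "$1" | sed -n 's/.*CUDA Version: *\([0-9][0-9]*\).*/\1/p'
}

# Succeed (exit status 0) when the reported CUDA major version is >= $2.
cuda_at_least() {
    major=$(cuda_major "$1")
    [ -n "$major" ] && [ "$major" -ge "$2" ]
}

# The header line posted earlier in this thread.
header='NVIDIA-SMI 495.29.05 Driver Version: 495.29.05 CUDA Version: 11.5'
if cuda_at_least "$header" 12; then
    echo "CUDA $(cuda_major "$header"): meets the suggested minimum"
else
    echo "CUDA $(cuda_major "$header"): below the suggested minimum of 12"
fi
```

For the 495.29.05 driver reported above this takes the second branch. The "CUDA Version" field shows the highest CUDA version the driver supports, so a newer driver is what raises it.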