[Question]: What type of GPU, and how many GPUs, are recommended to speed up file parsing?

Open zengqingfu1442 opened this issue 9 months ago • 27 comments

Describe your problem

I used the CPU to parse a 55 MB PDF file, and it took about an hour. That's too slow for creating a knowledge base in ragflow. What type of GPU is recommended to speed up file parsing, and how many GPUs? Would an RTX 4090 or RTX A6000 be recommended? Thanks.

zengqingfu1442 avatar Mar 04 '25 07:03 zengqingfu1442

Please deploy an embedding service with Ollama/Xinference on your GPUs. That's going to accelerate things a lot.

KevinHuSh avatar Mar 05 '25 02:03 KevinHuSh

Please deploy an embedding service with Ollama/Xinference on your GPUs. That's going to accelerate things a lot.

I prefer vLLM. So you recommend deploying an embedding service and then adding it as an embedding model in ragflow? Which embedding model is recommended?

zengqingfu1442 avatar Mar 05 '25 03:03 zengqingfu1442

Creating a knowledge base is not time-consuming; parsing files is. Here are some tips: https://ragflow.io/docs/dev/accelerate_doc_indexing

Or, you can use docker-compose-gpu.yml to start your service. This accelerates DeepDoc tasks with a GPU and requires RAGFlow v0.16.0+.
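
A minimal sketch of starting with the GPU compose file, assuming a standard checkout where the compose files live under docker/ and the NVIDIA container runtime is already installed:

    # from the repository root
    docker compose -f docker/docker-compose-gpu.yml up -d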

writinwaters avatar Mar 05 '25 03:03 writinwaters

@zengqingfu1442 Yes. The recommended embedding model is bge-m3.
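
For example, a hedged sketch of serving bge-m3 through Ollama on the GPU machine and registering it in RAGFlow (the model name is as listed in the Ollama library; the host and port are Ollama's defaults and may differ in your setup):

    # on the GPU machine
    ollama pull bge-m3
    ollama serve          # listens on port 11434 by default
    # then in the RAGFlow UI: Model providers -> Ollama -> add bge-m3
    # as an embedding model with base URL http://<gpu-host>:11434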

yuzhichang avatar Mar 05 '25 03:03 yuzhichang

Please deploy an embedding service with Ollama/Xinference on your GPUs. That's going to accelerate things a lot.

As shown below, this file is only 1 MB, yet it took 13 minutes to parse. I find that the OCR part is 3.31 s, layout analysis is 1.80 s, and embedding is 6.34 s, but the task waited about 12 minutes to be received (I used the built-in BAAI/bge-large-zh-v1.5 model). I understand this, because at the same time a big PDF file of about 55 MB was being parsed on the same machine with the CPU. What I want to express is that the main time cost does not come from embedding alone, right?

Image

zengqingfu1442 avatar Mar 05 '25 03:03 zengqingfu1442

You are right that embedding is not the only stage that takes time. You toggled on RAPTOR, and that is another time consumer.

BTW, embedding models are used for offline processing; they do not affect the performance of question answering, which happens in real time.

writinwaters avatar Mar 05 '25 03:03 writinwaters

So OCR runs on the CPU by default?

zengqingfu1442 avatar Mar 05 '25 03:03 zengqingfu1442

Yes, you are right.

writinwaters avatar Mar 05 '25 03:03 writinwaters

Creating a knowledge base is not time-consuming; parsing files is. Here are some tips: https://ragflow.io/docs/dev/accelerate_doc_indexing

Or, you can use docker-compose-gpu.yml to start your service. This accelerates DeepDoc tasks with a GPU and requires RAGFlow v0.16.0+.

I built a Docker image from the source code to start. If OCR is to use the GPU, LIGHT=0 is required. docker-compose-gpu.yml does indeed use the GPU to accelerate OCR, but there are OOM issues, and with multiple graphics cards only the first card is used. Can you solve this problem? Thank you.
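
For reference, a hedged compose sketch of exposing specific cards to the container (the service name and device IDs are illustrative; this only controls which GPUs the container sees and does not by itself make DeepDoc spread work across cards):

    services:
      ragflow:
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  device_ids: ['0', '1']   # the cards to expose
                  capabilities: [gpu]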

said-what-sakula avatar Mar 05 '25 06:03 said-what-sakula

@said-what-sakula onnxruntime-gpu leaking memory is a known issue. We've added code following that onnxruntime issue, but it doesn't help. v0.17.0 adds support for using an LLM to parse documents (experimental).

yuzhichang avatar Mar 05 '25 06:03 yuzhichang

Can RAPTOR and knowledge graph extraction (GraphRAG) also run on the GPU for acceleration?

You are right that embedding is not the only stage that takes time. You toggled on RAPTOR, and that is another time consumer.

BTW, embedding models are used for offline processing; they do not affect the performance of question answering, which happens in real time.

zengqingfu1442 avatar Mar 05 '25 06:03 zengqingfu1442

Thanks @yuzhichang for sharing the story behind this. @said-what-sakula You can take a look at Feature 5 of RAGFlow's latest release notes: https://ragflow.io/docs/dev/release_notes#v0170

writinwaters avatar Mar 05 '25 06:03 writinwaters

Can RAPTOR and knowledge graph extraction (GraphRAG) also run on the GPU for acceleration?

You are right that embedding is not the only stage that takes time. You toggled on RAPTOR, and that is another time consumer. BTW, embedding models are used for offline processing; they do not affect the performance of question answering, which happens in real time.

The GPU can't be used to accelerate RAPTOR or knowledge graph extraction.

writinwaters avatar Mar 05 '25 06:03 writinwaters

Thanks @yuzhichang for sharing the story behind this. @said-what-sakula You can take a look at Feature 5 of RAGFlow's latest release notes: https://ragflow.io/docs/dev/release_notes#v0170

I didn't expect the update to be so fast. I will go check the release notes and try it out.

said-what-sakula avatar Mar 05 '25 06:03 said-what-sakula

Image

It seems the diagram in the PDF file cannot be parsed.

zengqingfu1442 avatar Mar 05 '25 11:03 zengqingfu1442

Could you use the keywords in the diagram to search for the related chunk?

writinwaters avatar Mar 06 '25 08:03 writinwaters

Perhaps I should choose the "Book" chunk method to parse the PDF file, because it is an e-book.

zengqingfu1442 avatar Mar 06 '25 09:03 zengqingfu1442

Perhaps I should choose the "Book" chunk method to parse the PDF file, because it is an e-book.

You don't have to change the chunk method. Check whether the diagram is in the created chunks first.

writinwaters avatar Mar 06 '25 09:03 writinwaters

How fast is your PDF parsing? Did you use GPU acceleration?

Alisehen avatar Mar 06 '25 13:03 Alisehen

When I parsed PDFs with the official code, I also found it unusable, and there is a size limit on uploaded files. How can I remove the file-size limit? Scanned PDFs over 100 MB cannot be uploaded.

tongchangD avatar Mar 10 '25 06:03 tongchangD

You need to alter the nginx configuration.

Image
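
The screenshot shows the nginx change; as a hedged sketch, the relevant directive is client_max_body_size, set in the nginx config fronting ragflow-server (the 1024M value is illustrative):

    # in the http or server block
    client_max_body_size 1024M;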

KevinHuSh avatar Mar 10 '25 07:03 KevinHuSh

@KevinHuSh thanks, I tried DOC_MAXIMUM_SIZE, but it didn't work. I'm going to try your approach.

tongchangD avatar Mar 10 '25 08:03 tongchangD

@KevinHuSh It doesn't work either.

Image

and the error:

2025-03-10 17:17:53,496 ERROR    16 413 Request Entity Too Large: The data value transmitted exceeds the capacity limit.
Traceback (most recent call last):
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/ragflow/.venv/lib/python3.10/site-packages/flask_login/utils.py", line 290, in decorated_view
    return current_app.ensure_sync(func)(*args, **kwargs)
  File "/ragflow/api/utils/api_utils.py", line 145, in decorated_function
    input_arguments = flask_request.json or flask_request.form.to_dict()
  File "/ragflow/api/apps/__init__.py", line 40, in <lambda>
    Request.json = property(lambda self: self.get_json(force=True, silent=True))
  File "/ragflow/.venv/lib/python3.10/site-packages/werkzeug/wrappers/request.py", line 605, in get_json
    data = self.get_data(cache=cache)
  File "/ragflow/.venv/lib/python3.10/site-packages/werkzeug/wrappers/request.py", line 419, in get_data
    rv = self.stream.read()
  File "/ragflow/.venv/lib/python3.10/site-packages/werkzeug/utils.py", line 107, in __get__
    value = self.fget(obj)  # type: ignore
  File "/ragflow/.venv/lib/python3.10/site-packages/werkzeug/wrappers/request.py", line 348, in stream
    return get_input_stream(
  File "/ragflow/.venv/lib/python3.10/site-packages/werkzeug/wsgi.py", line 173, in get_input_stream
    raise RequestEntityTooLarge()
werkzeug.exceptions.RequestEntityTooLarge: 413 Request Entity Too Large: The data value transmitted exceeds the capacity limit.

tongchangD avatar Mar 10 '25 09:03 tongchangD

export MAX_CONTENT_LENGTH=100000000000

KevinHuSh avatar Mar 11 '25 05:03 KevinHuSh

export MAX_CONTENT_LENGTH=100000000000

In the .env file?

zengqingfu1442 avatar Mar 11 '25 08:03 zengqingfu1442

Yes.
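
A hedged sketch of the change: MAX_CONTENT_LENGTH is read in bytes, and the server container must be recreated for an .env change to take effect (the docker/.env path assumes a standard checkout):

    # docker/.env
    MAX_CONTENT_LENGTH=1073741824   # ~1 GB

    # then recreate the server so the new value is picked up
    docker compose -f docker/docker-compose.yml up -d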

KevinHuSh avatar Mar 12 '25 04:03 KevinHuSh

Yes.

After making the change, do I need to re-run docker compose? Mainly for ragflow-server, or for ragflow-mysql and ragflow-minio? Will recreating the Docker containers lose the database that has already been built?

tongchangD avatar Mar 14 '25 03:03 tongchangD