
[Question]: In v0.19.0, chunking methods other than General (such as Presentation) still cannot be parsed using the VLM model, and this problem does not seem to be solved.

Open Dreamcatcher-wind opened this issue 7 months ago • 2 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

Describe your problem

Actually, #8109 still hasn't been resolved.

Here is the log:

2025-06-11 23:13:04,980 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:04] "GET /v1/user/info HTTP/1.1" 200 - 2025-06-11 23:13:04,986 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:04] "GET /v1/tenant/list HTTP/1.1" 200 - 2025-06-11 23:13:04,988 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:04] "GET /v1/user/tenant_info HTTP/1.1" 200 - 2025-06-11 23:13:04,996 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:04] "GET /v1/kb/detail?kb_id=5c23770246ce11f08b8b0242ac180006 HTTP/1.1" 200 - 2025-06-11 23:13:04,997 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:04] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:04,997 INFO 18 HEAD http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006 [status:200 duration:0.003s] 2025-06-11 23:13:05,026 INFO 18 POST http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006/_search [status:200 duration:0.005s] 2025-06-11 23:13:05,027 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:05] "GET /v1/kb/5c23770246ce11f08b8b0242ac180006/knowledge_graph HTTP/1.1" 200 - 2025-06-11 23:13:06,139 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:06] "GET /v1/kb/detail?kb_id=5c23770246ce11f08b8b0242ac180006 HTTP/1.1" 200 - 2025-06-11 23:13:06,154 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:06] "GET /v1/user/tenant_info HTTP/1.1" 200 - 2025-06-11 23:13:06,225 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:06] "GET /v1/llm/list HTTP/1.1" 200 - 2025-06-11 23:13:06,229 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:06] "POST /v1/kb/list HTTP/1.1" 200 - 2025-06-11 23:13:08,158 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:08] "GET /v1/user/tenant_info HTTP/1.1" 200 - 2025-06-11 23:13:08,163 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:08] "GET /v1/kb/detail?kb_id=5c23770246ce11f08b8b0242ac180006 HTTP/1.1" 200 - 2025-06-11 23:13:08,163 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:08] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:15,208 INFO 18 172.24.0.6 - - 
[11/Jun/2025 23:13:15] "POST /v1/document/upload HTTP/1.1" 200 - 2025-06-11 23:13:15,232 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:15] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:15,233 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:15] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:15,253 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:15] "POST /v1/document/run HTTP/1.1" 200 - 2025-06-11 23:13:15,287 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:15] "GET /v1/document/image/5c23770246ce11f08b8b0242ac180006-thumbnail_98f3019a46d611f090cc0242ac180006.png HTTP/1.1" 200 - 2025-06-11 23:13:15,300 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:15] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:19,496 INFO 31 handle_task begin for task {"id": "99020a8246d611f0aba00242ac180006", "doc_id": "98f3019a46d611f090cc0242ac180006", "from_page": 0, "to_page": 10, "retry_count": 0, "kb_id": "5c23770246ce11f08b8b0242ac180006", "parser_id": "presentation", "parser_config": {"layout_recognize": "Qwen2.5-VL-7B-Instruct___VLLM@VLLM", "auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "name": "test_doc.pdf", "type": "pdf", "location": "test_doc.pdf", "size": 1442605, "tenant_id": "ff5921f8f59b11ef88e50242ac120006", "language": "English", "embd_id": "bge-m3@Xinference", "pagerank": 0, "kb_parser_config": {"layout_recognize": "Qwen2.5-VL-7B-Instruct___VLLM@VLLM", "auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "img2txt_id": "Qwen2.5-VL-7B-Instruct___VLLM@VLLM", "asr_id": "", "llm_id": "DeepSeek-R1-Distill-Qwen-32B___OpenAI-API@OpenAI-API-Compatible", "update_time": 1749654795251, "task_type": ""} 2025-06-11 23:13:19,566 INFO 31 HTTP Request: POST 
http://10.133.34.19:9998/v1/embeddings "HTTP/1.1 200 OK" 2025-06-11 23:13:19,576 INFO 31 HEAD http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006 [status:200 duration:0.003s] 2025-06-11 23:13:19,583 INFO 31 From minio(0.006399601930752397) test_doc.pdf/test_doc.pdf 2025-06-11 23:13:19,583 INFO 31 load_model /ragflow/rag/res/deepdoc/det.onnx reuses cached model 2025-06-11 23:13:19,586 INFO 31 load_model /ragflow/rag/res/deepdoc/rec.onnx reuses cached model 2025-06-11 23:13:19,587 INFO 31 load_model /ragflow/rag/res/deepdoc/layout.onnx reuses cached model 2025-06-11 23:13:19,587 INFO 31 load_model /ragflow/rag/res/deepdoc/tsr.onnx reuses cached model 2025-06-11 23:13:19,598 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: None, progress_msg: 23:13:19 Page(1~11): OCR started 2025-06-11 23:13:21,431 INFO 31 images dedupe_chars cost 1.8335928990272805s 2025-06-11 23:13:21,435 WARNING 31 Miss outlines 2025-06-11 23:13:21,530 INFO 31 __ocr detecting boxes of a image cost (0.09450889506842941s) 2025-06-11 23:13:21,531 INFO 31 __ocr sorting 53 chars cost 0.0006220480427145958s 2025-06-11 23:13:21,570 INFO 31 __ocr recognize 8 boxes cost 0.03930688195396215s 2025-06-11 23:13:21,669 INFO 31 __ocr detecting boxes of a image cost (0.09745385695714504s) 2025-06-11 23:13:21,670 INFO 31 __ocr sorting 43 chars cost 0.0005265669897198677s 2025-06-11 23:13:21,701 INFO 31 __ocr recognize 7 boxes cost 0.031114830053411424s 2025-06-11 23:13:21,809 INFO 31 __ocr detecting boxes of a image cost (0.10635609098244458s) 2025-06-11 23:13:21,813 INFO 31 __ocr sorting 409 chars cost 0.0029008679557591677s 2025-06-11 23:13:22,002 INFO 31 __ocr recognize 37 boxes cost 0.18889174505602568s 2025-06-11 23:13:22,108 INFO 31 __ocr detecting boxes of a image cost (0.10438321705441922s) 2025-06-11 23:13:22,110 INFO 31 __ocr sorting 181 chars cost 0.0016334600513800979s 2025-06-11 23:13:22,246 INFO 31 __ocr recognize 29 boxes cost 0.13544121000450104s 2025-06-11 23:13:22,354 INFO 
31 __ocr detecting boxes of a image cost (0.10657399310730398s) 2025-06-11 23:13:22,356 INFO 31 __ocr sorting 204 chars cost 0.001860317075625062s 2025-06-11 23:13:22,568 INFO 31 __ocr recognize 34 boxes cost 0.2115059489151463s 2025-06-11 23:13:22,678 INFO 31 __ocr detecting boxes of a image cost (0.10764243407174945s) 2025-06-11 23:13:22,680 INFO 31 __ocr sorting 268 chars cost 0.001782367005944252s 2025-06-11 23:13:23,024 INFO 31 __ocr recognize 37 boxes cost 0.3447964320657775s 2025-06-11 23:13:23,030 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: 0.36, progress_msg: 2025-06-11 23:13:23,136 INFO 31 __ocr detecting boxes of a image cost (0.10478150693234056s) 2025-06-11 23:13:23,137 INFO 31 __ocr sorting 103 chars cost 0.0009583050850778818s 2025-06-11 23:13:23,796 INFO 31 __ocr recognize 38 boxes cost 0.6587542770430446s 2025-06-11 23:13:23,891 INFO 31 __ocr detecting boxes of a image cost (0.09396304795518517s) 2025-06-11 23:13:23,892 INFO 31 __ocr sorting 68 chars cost 0.0006430470384657383s 2025-06-11 23:13:23,925 INFO 31 __ocr recognize 5 boxes cost 0.03250362700782716s 2025-06-11 23:13:24,019 INFO 31 __ocr detecting boxes of a image cost (0.09265734406653792s) 2025-06-11 23:13:24,021 INFO 31 __ocr sorting 208 chars cost 0.001447585062123835s 2025-06-11 23:13:24,049 INFO 31 __ocr recognize 8 boxes cost 0.027606691932305694s 2025-06-11 23:13:24,133 INFO 31 __ocr detecting boxes of a image cost (0.08356666704639792s) 2025-06-11 23:13:24,134 INFO 31 __ocr sorting 11 chars cost 0.00021717208437621593s 2025-06-11 23:13:24,169 INFO 31 __ocr recognize 5 boxes cost 0.03483505605254322s 2025-06-11 23:13:24,170 INFO 31 images 10 pages cost 2.735132099944167s 2025-06-11 23:13:24,175 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: None, progress_msg: 23:13:24 Page(1~11): Page 0~10: OCR finished (4.58s) 2025-06-11 23:13:24,180 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: 0.9, progress_msg: 23:13:24 Page(1~11): 
Page 0~10: Parsing finished 2025-06-11 23:13:24,258 INFO 31 Chunking(4.682046517031267) test_doc.pdf/test_doc.pdf done 2025-06-11 23:13:24,407 INFO 31 MINIO PUT(test_doc.pdf) cost 0.148 s 2025-06-11 23:13:24,407 INFO 31 Build document test_doc.pdf: 4.83s 2025-06-11 23:13:24,411 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: None, progress_msg: 23:13:24 Page(1~11): Generate 10 chunks 2025-06-11 23:13:24,463 INFO 31 HTTP Request: POST http://10.133.34.19:9998/v1/embeddings "HTTP/1.1 200 OK" 2025-06-11 23:13:25,739 INFO 31 HTTP Request: POST http://10.133.34.19:9998/v1/embeddings "HTTP/1.1 200 OK" 2025-06-11 23:13:25,792 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: 0.72, progress_msg: 2025-06-11 23:13:25,792 INFO 31 Embedding chunks (1.38s) 2025-06-11 23:13:25,796 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: None, progress_msg: 23:13:25 Page(1~11): Embedding chunks (1.38s) 2025-06-11 23:13:25,816 INFO 31 PUT http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006/_bulk?refresh=false&timeout=60s [status:200 duration:0.016s] 2025-06-11 23:13:25,819 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: 0.81, progress_msg: 2025-06-11 23:13:25,839 INFO 31 PUT http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006/_bulk?refresh=false&timeout=60s [status:200 duration:0.015s] 2025-06-11 23:13:25,851 INFO 31 PUT http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006/_bulk?refresh=false&timeout=60s [status:200 duration:0.010s] 2025-06-11 23:13:25,853 INFO 31 Indexing doc(test_doc.pdf), page(0-10), chunks(10), elapsed: 0.06 2025-06-11 23:13:25,859 INFO 31 set_progress(99020a8246d611f0aba00242ac180006), progress: 1.0, progress_msg: 23:13:25 Page(1~11): Indexing done (0.06s). 
Task done (6.36s) 2025-06-11 23:13:25,859 INFO 31 Chunk doc(test_doc.pdf), page(0-10), chunks(10), token(1540), elapsed:6.36 2025-06-11 23:13:25,859 INFO 31 handle_task done for task {"id": "99020a8246d611f0aba00242ac180006", "doc_id": "98f3019a46d611f090cc0242ac180006", "from_page": 0, "to_page": 10, "retry_count": 0, "kb_id": "5c23770246ce11f08b8b0242ac180006", "parser_id": "presentation", "parser_config": {"layout_recognize": "Qwen2.5-VL-7B-Instruct___VLLM@VLLM", "auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "name": "test_doc.pdf", "type": "pdf", "location": "test_doc.pdf", "size": 1442605, "tenant_id": "ff5921f8f59b11ef88e50242ac120006", "language": "English", "embd_id": "bge-m3@Xinference", "pagerank": 0, "kb_parser_config": {"layout_recognize": "Qwen2.5-VL-7B-Instruct___VLLM@VLLM", "auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": false}}, "img2txt_id": "Qwen2.5-VL-7B-Instruct___VLLM@VLLM", "asr_id": "", "llm_id": "DeepSeek-R1-Distill-Qwen-32B___OpenAI-API@OpenAI-API-Compatible", "update_time": 1749654795251, "task_type": ""} 2025-06-11 23:13:27,619 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:27] "GET /v1/tenant/list HTTP/1.1" 200 - 2025-06-11 23:13:27,620 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:27] "GET /v1/user/info HTTP/1.1" 200 - 2025-06-11 23:13:27,623 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:27] "GET /v1/user/tenant_info HTTP/1.1" 200 - 2025-06-11 23:13:27,627 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:27] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:27,630 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:27] "GET /v1/kb/detail?kb_id=5c23770246ce11f08b8b0242ac180006 HTTP/1.1" 200 - 2025-06-11 23:13:27,632 INFO 18 HEAD http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006 [status:200 duration:0.003s] 2025-06-11 23:13:27,666 INFO 18 POST 
http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006/_search [status:200 duration:0.005s] 2025-06-11 23:13:27,666 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:27] "GET /v1/kb/5c23770246ce11f08b8b0242ac180006/knowledge_graph HTTP/1.1" 200 - 2025-06-11 23:13:30,883 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:30] "GET /v1/tenant/list HTTP/1.1" 200 - 2025-06-11 23:13:30,883 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:30] "GET /v1/user/info HTTP/1.1" 200 - 2025-06-11 23:13:30,888 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:30] "GET /v1/user/tenant_info HTTP/1.1" 200 - 2025-06-11 23:13:30,889 INFO 18 HEAD http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006 [status:200 duration:0.003s] 2025-06-11 23:13:30,907 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:30] "POST /v1/document/list?kb_id=5c23770246ce11f08b8b0242ac180006&keywords=&page_size=10&page=1 HTTP/1.1" 200 - 2025-06-11 23:13:30,909 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:30] "GET /v1/kb/detail?kb_id=5c23770246ce11f08b8b0242ac180006 HTTP/1.1" 200 - 2025-06-11 23:13:30,924 INFO 18 POST http://es01:9200/ragflow_ff5921f8f59b11ef88e50242ac120006/_search [status:200 duration:0.005s] 2025-06-11 23:13:30,924 INFO 18 172.24.0.6 - - [11/Jun/2025 23:13:30] "GET /v1/kb/5c23770246ce11f08b8b0242ac180006/knowledge_graph HTTP/1.1" 200 -

Dreamcatcher-wind avatar Jun 11 '25 07:06 Dreamcatcher-wind

You are correct—this is a known limitation. As of v0.19.0, the Presentation chunking method (for both PDF and PPTX) does not support parsing via VLM (Vision Language Model) pipelines. Only the General and some other PDF chunking methods allow you to select a vision model for layout recognition or OCR. The Presentation method still relies on classic extraction (python-pptx, Aspose.Slides, or standard OCR for PDFs) and ignores any VLM configuration, even if you set layout_recognize to a VLM model in the parser config. There have been no changes in v0.19.0 to address this, and the architecture currently does not route Presentation files through VLM-based parsing logic. This is consistent with both the code and the documentation [source] [source].

If you need VLM-based parsing, you will need to use a chunking method that explicitly supports it (like General for PDFs). There is no workaround for Presentation chunking at this time.
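
One way to confirm this from a worker log such as the one posted above is to compare the configured `layout_recognize` value with what the task actually ran. The following is a minimal sketch, not RAGFlow code: the helper names `extract_task` and `vlm_was_bypassed` are hypothetical, the marker strings are copied from the log in this issue, and treating `layout_recognize == img2txt_id` as "a VLM was selected" is an assumption based on that same log.

```python
import json

# Strings that appear in the log above while the built-in OCR path runs.
OCR_MARKERS = ("OCR started", "__ocr detecting boxes")


def extract_task(log_text: str) -> dict:
    """Return the JSON payload of the first 'handle_task begin' log entry."""
    marker = "handle_task begin for task "
    start = log_text.index(marker) + len(marker)
    # raw_decode parses one JSON object and ignores the log text that follows.
    task, _ = json.JSONDecoder().raw_decode(log_text[start:])
    return task


def vlm_was_bypassed(log_text: str) -> bool:
    """True if a VLM was configured for layout recognition but OCR ran anyway."""
    task = extract_task(log_text)
    layout = task.get("parser_config", {}).get("layout_recognize", "")
    # Assumption: a layout_recognize value equal to img2txt_id names a VLM.
    vlm_selected = bool(layout) and layout == task.get("img2txt_id")
    ran_ocr = any(m in log_text for m in OCR_MARKERS)
    return vlm_selected and ran_ocr
```

Run against the log in this issue, the check would report a bypass: `parser_id` is `presentation`, `layout_recognize` and `img2txt_id` both name `Qwen2.5-VL-7B-Instruct___VLLM@VLLM`, yet the worker logs `OCR started` and the only outbound model requests shown are to `/v1/embeddings`.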


dosubot[bot] avatar Jun 11 '25 07:06 dosubot[bot]

I did not quite follow. Why do you think it has not been resolved?

KevinHuSh avatar Jun 12 '25 05:06 KevinHuSh