PaddleOCR: I specified a VL model service, but PP-DocLayoutV2 still gets downloaded

🔎 Search before asking

[x] I have searched the PaddleOCR Docs and found no similar bug report.
[x] I have searched the PaddleOCR Issues and found no similar bug report.
[x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem description)

I have specified the VL model service URL, so why does converting a PDF to Markdown still download the PP-DocLayoutV2 layout detection model to my local machine? Here is my code:

    from pathlib import Path

    from paddleocr import PaddleOCRVL

    input_file = "./MV SAGA FALCON - DD25 - Specs (REV.1)(1).pdf"
    output_path = Path("./output")

    # Send VLM recognition requests to a remote vLLM server.
    pipeline = PaddleOCRVL(
        vl_rec_backend="vllm-server",
        vl_rec_server_url="https://cloud.infini-ai.com/AIStudio/inference/api/if-dbxt4f5uxecbvsyl/v1",
    )
    output = pipeline.predict(input=input_file)

    markdown_list = []
    markdown_images = []

    # Collect the Markdown result and any extracted images for each page.
    for res in output:
        md_info = res.markdown
        markdown_list.append(md_info)
        markdown_images.append(md_info.get("markdown_images", {}))

    markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)

    mkd_file_path = output_path / f"{Path(input_file).stem}.md"
    mkd_file_path.parent.mkdir(parents=True, exist_ok=True)

    with open(mkd_file_path, "w", encoding="utf-8") as f:
        f.write(markdown_texts)

    # Save the images referenced by the Markdown output.
    for item in markdown_images:
        if item:
            for path, image in item.items():
                file_path = output_path / path
                file_path.parent.mkdir(parents=True, exist_ok=True)
                image.save(file_path)

🏃‍♂️ Environment (Runtime environment)

paddleocr[doc-parser]
paddlepaddle==3.2.0

🌰 Minimal Reproducible Example

    from pathlib import Path

    from paddleocr import PaddleOCRVL

    input_file = "./MV SAGA FALCON - DD25 - Specs (REV.1)(1).pdf"
    output_path = Path("./output")

    pipeline = PaddleOCRVL(
        vl_rec_backend="vllm-server",
        vl_rec_server_url="https://cloud.infini-ai.com/AIStudio/inference/api/if-dbxt4f5uxecbvsyl/v1",
    )
    output = pipeline.predict(input=input_file)

    markdown_list = []
    markdown_images = []

    for res in output:
        md_info = res.markdown
        markdown_list.append(md_info)
        markdown_images.append(md_info.get("markdown_images", {}))

    markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)

    mkd_file_path = output_path / f"{Path(input_file).stem}.md"
    mkd_file_path.parent.mkdir(parents=True, exist_ok=True)

    with open(mkd_file_path, "w", encoding="utf-8") as f:
        f.write(markdown_texts)

    for item in markdown_images:
        if item:
            for path, image in item.items():
                file_path = output_path / path
                file_path.parent.mkdir(parents=True, exist_ok=True)
                image.save(file_path)

Oct 17 '25 11:10 Stefan3Zz

My understanding was that this model handles everything in a single pass, so it shouldn't need to go through a pipeline, right?

Oct 17 '25 11:10 Stefan3Zz

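Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.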

Oct 17 '25 14:10 cuicheng01

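> Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.

I have downloaded PP-DocLayoutV2 as well. What I would like to know is whether PP-DocLayoutV2 and PaddleOCR-VL-0.9B can be packaged into a single service, like MinerU 2.5, so that one API call returns all the layout information and the content. For example, could a command like the following also bring up PP-DocLayoutV2, so that my business services can call it?

    docker run \
        -it \
        --rm \
        --gpus all \
        --network host \
        ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddlex-genai-vllm-server \
        paddlex_genai_server --model_name PaddleOCR-VL-0.9B --host 0.0.0.0 --port 8118 --backend vllm

One more question. With pipeline = PaddleOCRVL(vl_rec_backend="vllm-server", vl_rec_server_url="https://cloud.infini-ai.com/AIStudio/inference/api/if-dbxt4f5uxecbvsyl/v1"), my service at https://cloud.infini-ai.com/AIStudio/inference/api/if-dbxt4f5uxecbvsyl/v1 requires an auth credential in the request header before it can be called. Which parameter should I use to add it?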

Oct 20 '25 01:10 Stefan3Zz

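> Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.

Hello, which parameter lets me point the pipeline at my own copy of PP-DocLayoutV2 stored at a different local path?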

Oct 20 '25 13:10 lizipao

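How can I set up an API for the whole pipeline on the server, instead of running layout detection on the client side? Intuitively, the client should only upload the file, and all document processing should run on the server rather than locally.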

Oct 21 '25 05:10 SirlyDreamer

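> How can I set up an API for the whole pipeline on the server, instead of running layout detection on the client side? Intuitively, the client should only upload the file, and all document processing should run on the server rather than locally.

Yes, same question. How can this be solved?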

Oct 21 '25 09:10 qianchen94

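> Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.

Hello, how do I use only the VLM part? Is there any documentation for it?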

Oct 22 '25 02:10 skyhawk1990

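> > Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.
>
> Hello, which parameter lets me point the pipeline at my own copy of PP-DocLayoutV2 stored at a different local path?

You can pass the path of your local PP-DocLayoutV2 model through the layout_detection_model_dir parameter.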

Oct 30 '25 03:10 changdazhou
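
As a rough sketch of the suggestion above, the layout model directory can be passed together with the VLM server settings from the original post. The directory path and server URL here are placeholders, and the two helper functions are hypothetical names introduced only for illustration. The only parameters taken from this thread are layout_detection_model_dir, vl_rec_backend, and vl_rec_server_url.

```python
from pathlib import Path


def build_pipeline_kwargs(layout_model_dir, server_url):
    """Collect the constructor arguments discussed in this thread."""
    return {
        # Local directory holding the PP-DocLayoutV2 files (placeholder path).
        "layout_detection_model_dir": str(Path(layout_model_dir).expanduser()),
        # Send VLM recognition requests to a remote vLLM server.
        "vl_rec_backend": "vllm-server",
        "vl_rec_server_url": server_url,
    }


def make_pipeline(layout_model_dir, server_url):
    """Create the pipeline; paddleocr is imported lazily so the helper
    above can be used and tested without it installed."""
    from paddleocr import PaddleOCRVL

    return PaddleOCRVL(**build_pipeline_kwargs(layout_model_dir, server_url))


if __name__ == "__main__":
    # Placeholder values; replace them with your own path and server URL.
    print(build_pipeline_kwargs("./models/PP-DocLayoutV2", "http://127.0.0.1:8118/v1"))
```

With a local directory supplied, the pipeline should no longer need to fetch the layout model, although that behavior is worth confirming on your installed version.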

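> How can I set up an API for the whole pipeline on the server, instead of running layout detection on the client side? Intuitively, the client should only upload the file, and all document processing should run on the server rather than locally.

You can follow Section 4 of the official documentation and use the service deployment approach: https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html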

Oct 30 '25 03:10 changdazhou
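
For readers wondering what an upload-only client could look like once the whole pipeline runs as a service, here is a minimal sketch using only the Python standard library. Everything specific in it is an assumption, not something confirmed in this thread: the host and port, the /layout-parsing endpoint path, the file and fileType request fields, and the 0 = PDF convention. Please check each of them against Section 4 of the linked documentation before relying on it.

```python
import base64
import json
import urllib.request

# Assumed values: adjust them to match the deployed service.
API_URL = "http://localhost:8080/layout-parsing"
FILE_TYPE_PDF = 0  # assumed convention: 0 = PDF, 1 = image


def build_payload(file_bytes, file_type=FILE_TYPE_PDF):
    """Encode the document as Base64 so it can travel in a JSON body."""
    return {
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "fileType": file_type,
    }


def parse_document(path, api_url=API_URL):
    """Upload one file and return the decoded JSON response."""
    with open(path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

In this arrangement the client holds no models at all; layout detection and VLM recognition would both run behind the service endpoint.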

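> > Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.
>
> Hello, how do I use only the VLM part? Is there any documentation for it?

You can follow the official documentation: set use_layout_detection=False and specify the prompt_label type. The default is OCR recognition.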

Oct 30 '25 03:10 changdazhou

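> > > Hello. PP-DocLayoutV2 is the part of PaddleOCR-VL responsible for layout detection and reading-order prediction. If you only need plain text recognition, you can use the VLM part directly, that is, the PaddleOCR-VL-0.9B model. However, more users are likely to be doing document parsing, which is why PP-DocLayoutV2 was added. PP-DocLayoutV2 also has very few parameters, so it does not take up much space.
> >
> > Hello, how do I use only the VLM part? Is there any documentation for it?
>
> You can follow the official documentation: set use_layout_detection=False and specify the prompt_label type. The default is OCR recognition.

What are the allowed values for prompt_label?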

Nov 06 '25 14:11 huangzhuohua2

ocr, table, formula, and chart

Nov 10 '25 02:11 changdazhou
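
Putting the last few replies together, a VLM-only call might look roughly like the sketch below. It assumes that predict() accepts use_layout_detection and prompt_label as keyword arguments, which the thread implies but does not spell out, so please confirm against the official documentation. The helper names and the PROMPT_LABELS constant are hypothetical; the four label values are the ones listed above.

```python
# The four values given in this thread.
PROMPT_LABELS = ("ocr", "table", "formula", "chart")


def build_predict_kwargs(input_path, prompt_label="ocr"):
    """Build keyword arguments for a VLM-only run with layout detection off."""
    if prompt_label not in PROMPT_LABELS:
        raise ValueError(
            f"prompt_label must be one of {PROMPT_LABELS}, got {prompt_label!r}"
        )
    return {
        "input": input_path,
        # Skip PP-DocLayoutV2 and send the whole image to the VLM.
        "use_layout_detection": False,
        "prompt_label": prompt_label,
    }


def recognize(pipeline, input_path, prompt_label="ocr"):
    """Run an existing PaddleOCRVL pipeline object in VLM-only mode.

    Assumes pipeline.predict() accepts these keyword arguments.
    """
    return pipeline.predict(**build_predict_kwargs(input_path, prompt_label))
```

For example, recognize(pipeline, "./table.png", prompt_label="table") would ask the VLM to treat the whole input as a table, where "./table.png" is a placeholder file name.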