Invalid init param "vl_rec_max_concurrency" for PaddleOCRVL

Open Deepwind64 opened this issue 2 months ago • 0 comments

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem Description)

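When constructing a PaddleOCRVL instance, the vl_rec_max_concurrency parameter has no effect. With a long PDF as input, concurrency can reach 200, which knocked my forwarding service offline.

I traced the problem to the STRUCTURE variable in the _get_paddlex_config_overrides method of /home/xxx/miniconda3/envs/paddlevllm/lib/python3.13/site-packages/paddleocr/_pipelines/paddleocr_vl.py. The variable is missing the configuration entry that corresponds to this parameter.

Adding the following entry fixes the problem: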

"SubModules.VLRecognition.genai_config.max_concurrency": self._params[
    "vl_rec_max_concurrency"
],
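
To illustrate why the missing entry makes the parameter a silent no-op, here is a minimal standalone sketch of the override mapping. It is not PaddleOCR's actual code: `build_overrides` and `include_fix` are hypothetical names, and the `backend` and `server_url` keys are assumed by analogy with the `max_concurrency` key above. The constructor accepts the argument and stores it, but the value never reaches the pipeline config unless the mapping forwards it.

```python
from typing import Any

# Config key taken from the proposed fix above.
MAX_CONCURRENCY_KEY = "SubModules.VLRecognition.genai_config.max_concurrency"


def build_overrides(params: dict[str, Any], include_fix: bool) -> dict[str, Any]:
    """Hypothetical stand-in for the STRUCTURE mapping (not PaddleOCR's code)."""
    structure = {
        # Assumed key names, modeled on the max_concurrency key.
        "SubModules.VLRecognition.genai_config.backend": params["vl_rec_backend"],
        "SubModules.VLRecognition.genai_config.server_url": params["vl_rec_server_url"],
    }
    if include_fix:
        # The entry the issue reports as missing.
        structure[MAX_CONCURRENCY_KEY] = params["vl_rec_max_concurrency"]
    return structure


params = {
    "vl_rec_backend": "vllm-server",
    "vl_rec_server_url": "http://192.168.1.7:8082/v1",
    "vl_rec_max_concurrency": 1,
}

before = build_overrides(params, include_fix=False)
after = build_overrides(params, include_fix=True)

print(MAX_CONCURRENCY_KEY in before)  # False: the value is silently dropped
print(after[MAX_CONCURRENCY_KEY])     # 1: the value is forwarded
```

Without the entry, the user-supplied value is stored but never forwarded, so whatever default concurrency the downstream component uses stays in effect.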

🏃‍♂️ Environment (Runtime Environment)

OS Ubuntu 24.04 LTS
python 3.13
paddleocr 3.3.1
paddlepaddle-gpu 3.2.0
paddlex 3.3.8

🌰 Minimal Reproducible Example (minimal demo that reproduces the problem)

from pathlib import Path
from paddleocr import PaddleOCRVL

input_file = "./mamba.pdf"
output_path = Path("./output")

pipeline = PaddleOCRVL(use_layout_detection=True,
                       vl_rec_max_concurrency=1,  # key parameter: should cap concurrency, but is ignored
                       vl_rec_backend="vllm-server",
                       vl_rec_server_url="http://192.168.1.7:8082/v1")
output = pipeline.predict(input=input_file)

print("page count", len(output))
markdown_list = []
markdown_images = []

for res in output:
    md_info = res.markdown
    markdown_list.append(md_info)
    markdown_images.append(md_info.get("markdown_images", {}))

markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)

mkd_file_path = output_path / f"{Path(input_file).stem}.md"
mkd_file_path.parent.mkdir(parents=True, exist_ok=True)

with open(mkd_file_path, "w", encoding="utf-8") as f:
    f.write(markdown_texts)

for item in markdown_images:
    if item:
        for path, image in item.items():
            file_path = output_path / path
            file_path.parent.mkdir(parents=True, exist_ok=True)
            image.save(file_path)

Deepwind64 • Nov 09 '25 07:11