
PPStructureV3 cannot load local models in an intranet (offline) environment

Open Jpzhaoo opened this issue 2 months ago • 8 comments

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem description)

In an offline environment, I downloaded the models to /root/.paddlex/official_models, but initialization always fails with the error "No available model hosting platforms detected. Please check your network".

🏃‍♂️ Environment (Runtime environment)

Linux node6 5.4.0.26-generic #30-Ubuntu x86_64

python 3.10.12

🌰 Minimal Reproducible Example

from paddleocr import PPStructureV3

pipeline = PPStructureV3(
    layout_detection_model_dir="/root/.paddlex/official_models/PP-DocBlockLayout/",
    table_classification_model_dir="/root/.paddlex/official_models/PP-LCNet_x1_0_table_cls/",
    text_detection_model_dir="/root/.paddlex/official_models/PP-OCRv5_server_det/",
    text_recognition_model_dir="/root/.paddlex/official_models/PP-OCRv5_server_rec",
)

Jpzhaoo avatar Oct 15 '25 07:10 Jpzhaoo

I had a similar problem and solved it as follows.

Run the following to export the config to a .yaml file:

from paddleocr import PPStructureV3
pipeline = PPStructureV3()
pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")

Open PP-StructureV3.yaml and replace each model_dir: null entry with your local path. One example:

RegionDetection:
    layout_merge_bboxes_mode: small
    layout_nms: true
    model_dir: /pathtofolder/.paddlex/official_models/PP-DocBlockLayout
    model_name: PP-DocBlockLayout
    module_name: layout_detection

Then, instead of calling

pipeline = PPStructureV3(
    layout_detection_model_dir="/root/.paddlex/official_models/PP-DocBlockLayout/",
    table_classification_model_dir="/root/.paddlex/official_models/PP-LCNet_x1_0_table_cls/",
    text_detection_model_dir="/root/.paddlex/official_models/PP-OCRv5_server_det/",
    text_recognition_model_dir="/root/.paddlex/official_models/PP-OCRv5_server_rec",
)

where each model is provided individually, use

pipeline = PPStructureV3(
        paddlex_config="PP-StructureV3.yaml",
    )

I hope this helps. If not, have a look at this part of the documentation: https://www.paddleocr.ai/main/en/version3.x/pipeline_usage/PP-StructureV3.html#42-model-deployment
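The manual YAML edit described above can be tedious when there are many `model_dir: null` entries. Below is a minimal sketch that fills them in automatically. It assumes that each local model folder is named after the entry's `model_name` (as in the RegionDetection example) and that PyYAML is installed; `fill_model_dirs` and `patch_config` are hypothetical helper names, not part of PaddleOCR.

```python
from pathlib import Path


def fill_model_dirs(node, models_root):
    """Recursively set every `model_dir: null` to <models_root>/<model_name>.

    Assumption: the local folder name equals the entry's model_name.
    Entries that already have a model_dir are left untouched.
    """
    if isinstance(node, dict):
        if node.get("model_dir", "") is None and node.get("model_name"):
            node["model_dir"] = str(Path(models_root) / node["model_name"])
        for value in node.values():
            fill_model_dirs(value, models_root)
    elif isinstance(node, list):
        for item in node:
            fill_model_dirs(item, models_root)
    return node


def patch_config(yaml_path, models_root):
    """Load the exported config, fill in model_dir entries, and write it back."""
    import yaml  # PyYAML; imported here so fill_model_dirs has no dependencies

    with open(yaml_path, encoding="utf-8") as f:
        config = yaml.safe_load(f)
    fill_model_dirs(config, models_root)
    with open(yaml_path, "w", encoding="utf-8") as f:
        yaml.safe_dump(config, f, sort_keys=False, allow_unicode=True)
```

For example, `patch_config("PP-StructureV3.yaml", "/pathtofolder/.paddlex/official_models")` would rewrite the exported file in place; it is worth checking the result by hand before passing it to `paddlex_config`.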

NicoMigenda avatar Oct 17 '25 05:10 NicoMigenda

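The parameters you pass when initializing the pipeline are wrong. As your error message shows, the code is trying to load the PP-DocBlockLayout directory, and the parameter for that model is region_detection_model_dir. layout_detection_model_dir should instead point to the PP-DocLayout_plus-L directory. For the rest, refer to the documentation linked above; once all 14 parameters are set correctly, the models load on an intranet.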
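A minimal sketch of the corrected mapping follows. The first two pairs come from the comment above and the other three from the original snippet; the remaining parameters (14 in total, according to the comment) are not listed in this thread and would have to be added from the documentation. `build_model_dir_kwargs` is a hypothetical helper that also checks each directory exists before anything is handed to PPStructureV3.

```python
from pathlib import Path

# Argument name -> model folder name. Only the pairs mentioned in this
# thread are listed; the mapping is incomplete by design.
MODEL_DIR_ARGS = {
    "region_detection_model_dir": "PP-DocBlockLayout",
    "layout_detection_model_dir": "PP-DocLayout_plus-L",
    "table_classification_model_dir": "PP-LCNet_x1_0_table_cls",
    "text_detection_model_dir": "PP-OCRv5_server_det",
    "text_recognition_model_dir": "PP-OCRv5_server_rec",
}


def build_model_dir_kwargs(models_root, mapping=MODEL_DIR_ARGS):
    """Return keyword arguments for the pipeline, failing early on missing folders."""
    root = Path(models_root)
    kwargs, missing = {}, []
    for arg_name, folder in mapping.items():
        path = root / folder
        if path.is_dir():
            kwargs[arg_name] = str(path)
        else:
            missing.append(str(path))
    if missing:
        raise FileNotFoundError("Missing model directories: " + ", ".join(missing))
    return kwargs


# Usage (requires paddleocr to be installed):
# from paddleocr import PPStructureV3
# pipeline = PPStructureV3(**build_model_dir_kwargs("/root/.paddlex/official_models"))
```

Failing early with the list of missing folders makes it easier to tell a wrong path apart from the network error reported in this issue.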

oreo1024 avatar Oct 20 '25 01:10 oreo1024

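@Jpzhaoo With the latest PaddleX (version 3.3.5 or later), if the model files needed for inference already exist under /root/.paddlex/official_models/, inference works normally even without network access. Please try again.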

TingquanGao avatar Oct 27 '25 11:10 TingquanGao

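> @Jpzhaoo With the latest PaddleX (version 3.3.5 or later), if the model files needed for inference already exist under /root/.paddlex/official_models/, inference works normally even without network access. Please try again.

I am on 3.3.5, but it still does not pick up the models under /root/.paddlex/official_models/. It only works when I list every model directory explicitly.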

gsm1258 avatar Oct 27 '25 12:10 gsm1258

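Sorry, I double-checked, and a small code change is needed to support this. I have opened a PR with the fix; it will be merged as soon as possible, and version 3.3.6 will be released soon. https://github.com/PaddlePaddle/PaddleX/pull/4676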
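To tell whether an installation already includes this change, the installed PaddleX version can be compared against 3.3.6. This is a minimal sketch, assuming the fix ships in 3.3.6 as announced here and that the distribution is installed under the name `paddlex`; the helper names are hypothetical.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '3.3.6' into (3, 3, 6), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_offline_fix(installed, minimum="3.3.6"):
    """True if the installed version is at least the assumed fix version."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_paddlex_version():
    """Return the installed PaddleX version string, or None if it is not installed."""
    try:
        return metadata.version("paddlex")
    except metadata.PackageNotFoundError:
        return None
```

On older versions, passing every model directory explicitly or using the exported YAML config remains the workaround reported earlier in this thread.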

TingquanGao avatar Oct 27 '25 14:10 TingquanGao

I cannot download models in my intranet environment, and the /root/.paddlex/official_models directory is empty. What should I do? Where can I get these files?

yuexingliang avatar Oct 28 '25 07:10 yuexingliang

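> Sorry, I double-checked, and a small code change is needed to support this. I have opened a PR with the fix; it will be merged as soon as possible, and version 3.3.6 will be released soon. PaddlePaddle/PaddleX#4676

I ran into this problem too. If I need to request external network access, which domains should I request? We can only apply for access by domain name.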

yuexingliang avatar Oct 28 '25 10:10 yuexingliang

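@yuexingliang There are two options:

  1. Download the models manually in advance. They can be obtained from the following model hosting platforms: Hugging Face, AI Studio, ModelScope.
  2. If you need to request domain access, add the following domain prefixes (or any one of them): huggingface.co, aistudio.baidu.com, modelscope.cn, paddle-model-ecology.bj.bcebos.com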
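Once access has been requested, it can help to check which of these hosts are actually reachable from the machine. The sketch below only attempts a TCP connection on port 443; the port, the timeout, and the helper name `reachable_hosts` are assumptions, and a successful connection does not guarantee that model downloads will work through a proxy.

```python
import socket

# Hosts taken from the list above.
MODEL_HOSTS = [
    "huggingface.co",
    "aistudio.baidu.com",
    "modelscope.cn",
    "paddle-model-ecology.bj.bcebos.com",
]


def reachable_hosts(hosts=MODEL_HOSTS, port=443, timeout=3.0,
                    connect=socket.create_connection):
    """Return {host: bool} showing whether a TCP connection could be opened."""
    result = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:  # covers DNS failures, refusals, and timeouts
            result[host] = False
        else:
            conn.close()
            result[host] = True
    return result


if __name__ == "__main__":
    for host, ok in reachable_hosts().items():
        print(host, "reachable" if ok else "blocked")
```

The `connect` parameter exists so the check can be exercised without network access, for instance with a stub in a unit test.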

TingquanGao avatar Oct 30 '25 06:10 TingquanGao