Error when running PP-StructureV3: problem with text orientation detection
🔎 Search before asking
- [x] I have searched the PaddleOCR Docs and found no similar bug report.
- [x] I have searched the PaddleOCR Issues and found no similar bug report.
- [x] I have searched the PaddleOCR Discussions and found no similar bug report.
🐛 Bug
The error message is as follows:
Set use_doc_orientation_classify, but the model for doc orientation classify is not initialized.
Traceback (most recent call last):
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/yyx/embedding_reranker_qwen3_test copy.py", line 10, in <module>
🏃♂️ Environment
Ubuntu 22.04, Python 3.10, Paddle 3.0.0
🌰 Minimal Reproducible Example
from paddleocr import PPStructureV3

input_file = "..."

pipeline = PPStructureV3()
output = pipeline.predict(
    input=input_file,
    use_doc_orientation_classify=True,
    use_doc_unwarping=True,
    doc_orientation_classify_model_name="PP-LCNet_x1_0_doc_ori",
)

# Collect the Markdown result of each page, then join the pages.
markdown_list = []
for res in output:
    md_info = res.markdown
    markdown_list.append(md_info)

markdown_texts = pipeline.concatenate_markdown_pages(markdown_list)
The full error output is as follows:
Set use_doc_orientation_classify, but the model for doc orientation classify is not initialized.
Traceback (most recent call last):
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/yyx/embedding_reranker_qwen3_test copy.py", line 10, in <module>
    output = pipeline.predict(
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/miniconda3/envs/gj_title/lib/python3.10/site-packages/paddleocr/_pipelines/pp_structurev3.py", line 250, in predict
    return list(
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/miniconda3/envs/gj_title/lib/python3.10/site-packages/paddlex/inference/pipelines/_parallel.py", line 129, in predict
    yield from self._pipeline.predict(
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/miniconda3/envs/gj_title/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 993, in predict
    doc_preprocessor_results = list(
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/miniconda3/envs/gj_title/lib/python3.10/site-packages/paddlex/inference/pipelines/_parallel.py", line 129, in predict
    yield from self._pipeline.predict(
  File "/HOME/sysucc_huiyanluo/sysucc_huiyanluo_1/HDD_POOL/miniconda3/envs/gj_title/lib/python3.10/site-packages/paddlex/inference/pipelines/doc_preprocessor/pipeline.py", line 162, in predict
    preds = list(self.doc_ori_classify_model(image_arrays))
AttributeError: '_DocPreprocessorPipeline' object has no attribute 'doc_ori_classify_model'
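The `AttributeError` in the traceback above, together with the "not initialized" warning, is consistent with a pipeline that only creates its orientation model when the feature is enabled at construction time, and then fails when the feature is switched on per call. The following is a toy sketch of that pattern only. `ToyDocPreprocessor` and `try_predict` are hypothetical names made up for illustration and are not PaddleX or PaddleOCR code.

```python
class ToyDocPreprocessor:
    """Hypothetical stand-in for a pipeline with an optional sub-model."""

    def __init__(self, use_doc_orientation_classify=False):
        self.use_doc_orientation_classify = use_doc_orientation_classify
        if use_doc_orientation_classify:
            # The sub-model attribute exists only if the flag was set here.
            # This dummy "model" labels every image with class 0.
            self.doc_ori_classify_model = lambda images: [0 for _ in images]

    def predict(self, images, use_doc_orientation_classify=None):
        # A per-call flag overrides the constructor setting.
        if use_doc_orientation_classify is None:
            use_doc_orientation_classify = self.use_doc_orientation_classify
        if not use_doc_orientation_classify:
            return [None] * len(images)
        # Raises AttributeError when the model was never created in __init__.
        return list(self.doc_ori_classify_model(images))


def try_predict(pipeline, images, **kwargs):
    """Run predict and return the error text instead of raising."""
    try:
        return pipeline.predict(images, **kwargs)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    lazy = ToyDocPreprocessor()  # flag off at construction
    print(try_predict(lazy, ["page"], use_doc_orientation_classify=True))

    eager = ToyDocPreprocessor(use_doc_orientation_classify=True)
    print(try_predict(eager, ["page"]))
```

If the real pipeline behaves like this toy, passing `use_doc_orientation_classify=True` to the `PPStructureV3(...)` constructor, rather than only to `predict()`, may avoid the error. That is an assumption to check against the PaddleOCR documentation for the installed version.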
Try the following code; it should work:
from paddleocr import PaddleOCRVL  # import added; not shown in the original reply

pipeline = PaddleOCRVL(
    vl_rec_backend="vllm-server",
    vl_rec_server_url="http://127.0.0.1:2024/v1",
    use_layout_detection=False,
    layout_detection_model_name="PP-DocLayoutV2",
    layout_detection_model_dir="/models/PaddleOCR-VL/PP-DocLayoutV2",
)