
PPStructureV3 raises an error on some images

Open • chop2 opened this issue 1 month ago • 1 comment

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (problem description)

File "/data1/workspace/LLM/file_parser/app/parser_core/paddleocr_adapter.py", line 81, in do_parse_paddle
    output = pipeline.predict(input=tmp_path)
             │        │             └ '/tmp/tmp4ojyw0ic.pdf'
             │        └ <function PPStructureV3.predict at 0x7fbb225367a0>
             └ <paddleocr._pipelines.pp_structurev3.PPStructureV3 object at 0x7fbce0a8a530>

File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddleocr/_pipelines/pp_structurev3.py", line 250, in predict
    return list(
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/_parallel.py", line 129, in predict
    yield from self._pipeline.predict(
               │    │         └ <function _LayoutParsingPipelineV2.predict at 0x7fbb237f0940>
               │    └ <paddlex.inference.pipelines.layout_parsing.pipeline_v2._LayoutParsingPipelineV2 object at 0x7fba10f31ff0>
               └ <paddlex.inference.pipelines.layout_parsing.pipeline_v2.LayoutParsingPipelineV2 object at 0x7fba10f33400>
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 1245, in predict
    parsing_res_list = self.get_layout_parsing_res(
                       │    └ <function _LayoutParsingPipelineV2.get_layout_parsing_res at 0x7fbb237f0820>
                       └ <paddlex.inference.pipelines.layout_parsing.pipeline_v2._LayoutParsingPipelineV2 object at 0x7fba10f31ff0>
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 811, in get_layout_parsing_res
    self.standardized_data(
    │    └ <function _LayoutParsingPipelineV2.standardized_data at 0x7fbb237f0670>
    └ <paddlex.inference.pipelines.layout_parsing.pipeline_v2._LayoutParsingPipelineV2 object at 0x7fba10f31ff0>
File "/data1/workspace/LLM/file_parser/venv/lib/python3.10/site-packages/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py", line 496, in standardized_data
    layout_det_res["boxes"].append(
    └ {'input_path': None, 'page_index': None, 'input_img': array([[[221, 228, 222], [221, 228, 222], [221, 229, 22...

AttributeError: 'numpy.ndarray' object has no attribute 'append'

The call fails with the error above.
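For readers hitting the same traceback: `numpy.ndarray` has no `append` method, so the call at `standardized_data` fails whenever `layout_det_res["boxes"]` holds an array instead of a list. The sketch below only illustrates that failure mode and one way to normalize such a value. `boxes_as_list` is a hypothetical helper name; it is not part of PaddleX and is not claimed to match the actual fix.

```python
def boxes_as_list(boxes):
    """Return `boxes` as a plain Python list so that .append() works.

    Hypothetical helper for illustration only; not a PaddleX API.
    """
    if isinstance(boxes, list):
        # Already a list: return it unchanged (same object).
        return boxes
    if hasattr(boxes, "tolist"):
        # Array-like objects such as numpy.ndarray expose tolist().
        return boxes.tolist()
    # Any other iterable (tuple, generator, ...) is copied into a list.
    return list(boxes)
```

Under these assumptions, a caller would normalize the value before appending, for example `res["boxes"] = boxes_as_list(res["boxes"])` followed by `res["boxes"].append(new_box)`. The helper is duck-typed on `tolist()` so the block itself does not need to import NumPy.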

🏃‍♂️ Environment

paddleocr                 3.3.1
paddlepaddle-gpu          3.2.1
paddlex                   3.3.9
numpy                     2.2.6

🌰 Minimal Reproducible Example

def test_paddleocr_structure():
    from pathlib import Path
    from paddleocr import PPStructureV3
    from paddlex.inference.pipelines.layout_parsing.result_v2 import LayoutParsingResultV2

    pipeline = PPStructureV3(
        use_doc_orientation_classify=False,
        use_doc_unwarping=False
    )

    # For Image
    output = pipeline.predict(
        input="https://ofasys-multimodal-wlcb-3-toshanghai.oss-accelerate.aliyuncs.com/wpf272043/keepme/image/receipt.png",
    )

    # Visualize the results and save the JSON output
    for res in output:
        res: LayoutParsingResultV2 = res
        print(res)
        # res.print()
        res.save_to_json(save_path="output")
        res.save_to_markdown(save_path="output")

chop2 • Nov 13 '25 08:11

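Thanks for the feedback. I have opened a PR to fix this issue, PaddlePaddle/PaddleX/pull/4731. Once it is merged and a new version is released, the error should no longer occur.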

scyyh11 • Nov 15 '25 09:11