PaddleOCR [ppStructure]InvalidArgumentError: The 'shape' in ReshapeOp is invalid. The input tensor X'size must be equal to the capacity of 'shape'.

Open icemeowzhi opened this issue 1 year ago • 1 comments

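I am trying to convert an English PDF into one image per page and then run layout analysis plus table recognition on each image.

I am using the default models. Because my user folder path contains Chinese characters, I copied the PaddleOCR and PaddleClas models involved into the project root and pointed the model paths there.

The default layout_dict causes an indexOutOfBound error, so I set layout_dict_path explicitly.

I am not familiar with this area, so please bear with me.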

Please provide the following information to quickly locate the problem

System Environment: Windows 11, CUDA 12.0
Version: Paddle: 2.5.2.post120, PaddleOCR: 2.7.0.3
Related components: ppStructure
Command Code:

import cv2
import fitz  # PyMuPDF
import numpy as np
from PIL import Image
from paddleocr import PPStructure, save_structure_res


def process(file_path, output_path):
    # Layout analysis + table recognition engine; every model is loaded from a local copy.
    table_engine = PPStructure(show_log=True, image_orientation=True,
                               det_model_dir="./model/whl/det/en/en_PP-OCRv4_det_infer",
                               rec_model_dir="./model/whl/rec/en/en_PP-OCRv4_rec_infer",
                               cls_model_dir="./model/whl/cls/ch_ppocr_mobile_v2.0_cls_infer",
                               layout_model_dir="./picodet_lcnet_x1_0_fgd_layout_cdla_infer",
                               table_model_dir="./model/whl/table/ch_ppstructure_mobile_v2.0_SLANet_infer",
                               lang="en",
                               layout_dict_path='G:\\project\\python\\paddle-imagepdf-ocr\\venv\\lib\\site-packages\\paddleocr\\ppocr\\utils\\dict\\layout_dict\\layout_cdla_dict.txt')

    with fitz.open(file_path) as pdf:
        for page in pdf:
            # Render the page to an RGB bitmap and keep a PNG copy of it.
            pix = page.get_pixmap()
            img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
            img.save(fp=("./" + str(output_path) + "/PIL_" + str(page.number) + ".png"), format="png")
            # Convert RGB to BGR, the channel order OpenCV uses.
            img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
            result = table_engine(img)
            save_structure_res(result, output_path, 'result_page_{}.jpg'.format(page.number))

            # Drop the cropped image array from each region before printing it.
            for line in result:
                line.pop('img')
                print(line)
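
Since every model directory above was copied by hand, a quick check that each one exists and holds model files can rule out simple path mistakes before the engine is built. The following is a minimal sketch, not part of PaddleOCR: `missing_model_files` is a hypothetical helper, and the file names `inference.pdmodel` and `inference.pdiparams` are an assumption about how the exported inference models are laid out on disk. Adjust them if your copies differ.

```python
from pathlib import Path

# Assumed file names of an exported Paddle inference model (an assumption, see above).
EXPECTED_FILES = ("inference.pdmodel", "inference.pdiparams")


def missing_model_files(model_dirs):
    """Return {argument name: [missing file names]} for directories lacking expected files."""
    problems = {}
    for name, directory in model_dirs.items():
        missing = [f for f in EXPECTED_FILES if not (Path(directory) / f).is_file()]
        if missing:
            problems[name] = missing
    return problems


# Directories exactly as passed to PPStructure in the snippet above.
model_dirs = {
    "det_model_dir": "./model/whl/det/en/en_PP-OCRv4_det_infer",
    "rec_model_dir": "./model/whl/rec/en/en_PP-OCRv4_rec_infer",
    "cls_model_dir": "./model/whl/cls/ch_ppocr_mobile_v2.0_cls_infer",
    "layout_model_dir": "./picodet_lcnet_x1_0_fgd_layout_cdla_infer",
    "table_model_dir": "./model/whl/table/ch_ppstructure_mobile_v2.0_SLANet_infer",
}

# An empty dict means every directory contains both expected files.
print(missing_model_files(model_dirs))
```

This only checks that the files are present. It cannot tell whether a directory holds the right kind of model (for example, a detection model rather than a recognition model).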

完整报错/Complete Error Message：

Traceback (most recent call last):
  File "G:\project\python\paddle-imagepdf-ocr\main.py", line 24, in <module>
    ocr_main.start(config)
  File "G:\project\python\paddle-imagepdf-ocr\ocr\ocr_main.py", line 19, in start
    process(file, output_path)
  File "G:\project\python\paddle-imagepdf-ocr\ocr\ocr_main.py", line 40, in process
    result = table_engine(img)
  File "G:\project\python\paddle-imagepdf-ocr\venv\lib\site-packages\paddleocr\paddleocr.py", line 759, in __call__
    res, _ = super().__call__(
  File "G:\project\python\paddle-imagepdf-ocr\venv\lib\site-packages\paddleocr\ppstructure\predict_system.py", line 144, in __call__
    filter_boxes, filter_rec_res, ocr_time_dict = self.text_system(
  File "G:\project\python\paddle-imagepdf-ocr\venv\lib\site-packages\paddleocr\tools\infer\predict_system.py", line 76, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "G:\project\python\paddle-imagepdf-ocr\venv\lib\site-packages\paddleocr\tools\infer\predict_det.py", line 245, in __call__
    self.predictor.run()
ValueError: In user code:

    File "tools/export_model.py", line 289, in <module>
      main()
    File "tools/export_model.py", line 285, in main
      model, arch_config, save_path, logger, input_shape=input_shape)
    File "tools/export_model.py", line 198, in export_single_model
      paddle.jit.save(model, save_path)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/decorator.py", line 232, in fun
      return caller(func, *(extras + args), **kw)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 26, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/jit.py", line 649, in wrapper
      func(layer, path, input_spec, **configs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/decorator.py", line 232, in fun
      return caller(func, *(extras + args), **kw)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 26, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 52, in __impl__
      return func(*args, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/jit.py", line 928, in save
      inner_input_spec, with_hook=with_hook)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 577, in concrete_program_specify_input_spec
      is_train=self._is_train_mode())
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 482, in get_concrete_program
      concrete_program, partial_program_layer = self._program_cache[cache_key]
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 952, in __getitem__
      self._caches[item_id] = self._build_once(item)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 941, in _build_once
      **cache_key.kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/decorator.py", line 232, in fun
      return caller(func, *(extras + args), **kw)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 26, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 52, in __impl__
      return func(*args, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 885, in from_func_spec
      outputs = static_func(*inputs)
    File "/workspace/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 99, in forward
      if self.use_head:
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 324, in convert_ifelse
      return_name_ids)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 380, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/workspace/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 100, in forward
      x = self.head(x, targets=data)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 950, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 935, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/workspace/PaddleOCR/ppocr/modeling/heads/rec_multi_head.py", line 92, in forward
      ctc_encoder = self.ctc_encoder(x)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 950, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 935, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/workspace/PaddleOCR/ppocr/modeling/necks/rnn.py", line 255, in forward
      if self.encoder_type != 'svtr':
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 324, in convert_ifelse
      return_name_ids)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 380, in _run_py_ifelse
      py_outs = true_fn() if pred else false_fn()
    File "/workspace/PaddleOCR/ppocr/modeling/necks/rnn.py", line 261, in forward
      x = self.encoder(x)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 950, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 935, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/workspace/PaddleOCR/ppocr/modeling/necks/rnn.py", line 217, in forward
      z = z.reshape([0, H, W, C]).transpose([0, 3, 1, 2])
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/tensor/manipulation.py", line 3395, in reshape
      "XShape": x_shape
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 45, in append_op
      return self.main_program.current_block().append_op(*args, **kwargs)
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3828, in append_op
      attrs=kwargs.get("attrs", None))
    File "/root/anaconda3/envs/py37/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2736, in __init__
      for frame in traceback.extract_stack():

    InvalidArgumentError: The 'shape' in ReshapeOp is invalid. The input tensor X'size must be equal to the capacity of 'shape'. But received X's shape = [1, 360, 120], X's size = 43200, 'shape' is [0, 1, 120, 120], the capacity of 'shape' is 14400.
      [Hint: Expected capacity == in_size, but received capacity:14400 != in_size:43200.] (at ..\paddle\phi\infermeta\unary.cc:1781)
      [operator < reshape2 > error]
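
The numbers in the error message can be reproduced by hand. The sketch below recomputes the element count of the input tensor and the capacity of the requested shape, assuming the `paddle.reshape` convention that a `0` entry copies the input dimension at the same index. `reshape_capacity` is a hypothetical helper written for illustration, and it does not handle `-1` entries.

```python
from math import prod


def reshape_capacity(in_shape, target_shape):
    """Resolve 0 entries (copy the input dim at that index) and count the elements."""
    resolved = [in_shape[i] if d == 0 else d for i, d in enumerate(target_shape)]
    return resolved, prod(resolved)


x_shape = [1, 360, 120]    # "X's shape" from the error message
target = [0, 1, 120, 120]  # "'shape'" from the error message

in_size = prod(x_shape)
resolved, capacity = reshape_capacity(x_shape, target)

print(in_size)   # 43200
print(resolved)  # [1, 1, 120, 120]
print(capacity)  # 14400
```

The input holds three times as many elements as the target shape allows (43200 vs. 14400), so the H and W values baked into the exported graph do not match the tensor it received at run time. Note also that the failing call is `self.text_detector(img)` in `predict_det.py`, while the exported-graph frames point at `rec_multi_head.py` and `rnn.py`, which are recognition-model code. This may mean the directory passed as `det_model_dir` contains a recognition model, but that is an inference from the traceback and is not confirmed anywhere in this thread.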

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no): yes

Please try not to include images in the issue.

Nov 18 '23 17:11 icemeowzhi

Is there a download link for the en_PP-OCRv4_det model? I can't find one.

Jun 09 '24 14:06 nissansz
