PaddleOCR: model trained and exported with ch_PP-OCRv4_rec_svtr_large.yml raises (InvalidArgument) Broadcast dimension mismatch when predict_rec.py predicts wide images

Open LMR2018 opened this issue 9 months ago • 9 comments

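I trained an OCR recognition model with ch_PP-OCRv4_rec_svtr_large.yml. Training runs normally, evaluation with python tools/eval.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_svtr_large.yml also works, and python tools/export_model.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_svtr_large.yml exports the model successfully.

With the model exported by export_model.py, python tools/infer/predict_rec.py correctly predicts single-line images that are not too wide. On wide images, however, predict_rec.py fails with the error (InvalidArgument) Broadcast dimension mismatch. The same error occurs even for the wide images that were used for training and validation. How can this be solved?

Training configuration: image_shape: [3, 48, 320], max_text_length: &max_text_length 50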

`[2024/05/25 14:48:36] ppocr INFO: Traceback (most recent call last):
  File "tools/infer/predict_rec.py", line 728, in main
    rec_res, _ = text_recognizer(img_list)
  File "tools/infer/predict_rec.py", line 675, in __call__
    self.predictor.run()
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 2148, 192] and the shape of Y = [1, 960, 192]. Received [2148] in X is not equal to [960] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
  [operator < elementwise_add > error]

[2024/05/25 14:48:36] ppocr INFO: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 2148, 192] and the shape of Y = [1, 960, 192]. Received [2148] in X is not equal to [960] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
  [operator < elementwise_add > error]`

May 25 '24 09:05 LMR2018
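
One way to read the shapes in the error above, offered as a sketch and not a confirmed diagnosis: the first function restates the per-dimension check quoted in the `[Hint: ...]` text, and the second tests a guess about where 960 and 2148 come from. The guess, which is an assumption and is not stated anywhere in the log, is that the backbone reduces height and width by a factor of 4 each and adds a fixed-size tensor built for the configured image_shape of [3, 48, 320]. The names `broadcastable` and `token_count` and the width 716 are hypothetical, chosen only for illustration.

```python
def broadcastable(x_shape, y_shape):
    """Restate the check from the error hint for two shapes of equal rank:
    each pair of dimensions must be equal, or one of them must be <= 1."""
    return all(x == y or x <= 1 or y <= 1 for x, y in zip(x_shape, y_shape))


def token_count(height, width, stride=4):
    """Number of feature positions for an input of the given size.
    ASSUMPTION: overall downsampling by `stride` in both height and width."""
    return (height // stride) * (width // stride)


# Shapes copied from the error message.
x_shape = [1, 2148, 192]
y_shape = [1, 960, 192]
print(broadcastable(x_shape, y_shape))  # False: 2148 != 960 and neither is <= 1

# Under the stride-4 assumption, the configured 48 x 320 input gives 960 positions.
print(token_count(48, 320))  # 960

# A hypothetical 48 x 716 input would give 2148 positions, matching X.
print(token_count(48, 716))  # 2148
```

If that reading holds, the exported model expects the input width it was configured for, and wider inputs produce more positions than the fixed tensor can be added to. Whether this is the actual cause, and which inference option controls the resize width, should be checked against the PaddleOCR source and documentation.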
