PaddleOCR 2.5: use_tensorrt=True fails to run

Open shihaitao118 opened this issue 2 years ago • 40 comments

Please provide the following information to quickly locate the problem

  • System Environment: ubuntu18.04 cuda 10.2 cudnn8 python3.7 tensorrt 7.2.3.4
  • paddle2onnx 0.5 paddlehub 1.8.3 paddleocr 2.5.0.3 paddlepaddle-gpu 2.3.0 paddleslim 1.1.1 paddlex 1.3.7
  • Version: Paddle: 2.3.0 PaddleOCR: 2.5.0.3 Related components: tensorrt
  • Command Code: --use_tensorrt=true
  • Complete Error Message: [2022/06/21 13:07:18] ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/ocr_model/cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/ocr_model/v3_det_infer/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_image_shape='3, 48, 320', rec_model_dir='/home/ocr_model/v3_rec_infer/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', 
use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=True, vis_font_path='./doc/fonts/simfang.ttf', warmup=False) W0621 13:07:21.006245 894 analysis_predictor.cc:1086] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect. I0621 13:07:21.030115 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:21.065781 894 fuse_pass_base.cc:57] --- detected 10 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:21.130373 894 fuse_pass_base.cc:57] --- detected 185 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:21.150663 894 fuse_pass_base.cc:57] --- detected 24 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] --- Running IR pass 
[trt_map_matmul_v2_to_matmul_pass] --- Running IR pass [trt_map_matmul_to_mul_pass] I0621 13:07:21.155673 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [fc_fuse_pass] I0621 13:07:21.157024 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:21.167703 894 fuse_pass_base.cc:57] --- detected 42 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:21.179683 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 133 nodes I0621 13:07:21.196204 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:21.737630 894 engine.cc:203] Run Paddle-TRT Dynamic Shape mode. I0621 13:07:42.235085 894 engine.cc:424] Inspector needs TensorRT version 8.2 and after. --- Running IR pass [conv_bn_fuse_pass] --- Running IR pass [conv_elementwise_add_act_fuse_pass] --- Running IR pass [conv_elementwise_add2_act_fuse_pass] --- Running IR pass [transpose_flatten_concat_fuse_pass] --- Running analysis [ir_params_sync_among_devices_pass] I0621 13:07:42.255204 894 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU --- Running analysis [adjust_cudnn_workspace_size_pass] --- Running analysis [inference_op_replace_pass] --- Running analysis [memory_optimize_pass] I0621 13:07:42.259418 894 memory_optimize_pass.cc:216] Cluster name : shape_1.tmp_0_slice_0 size: 4 I0621 13:07:42.259449 894 memory_optimize_pass.cc:216] Cluster name : shape_0.tmp_0 size: 16 I0621 13:07:42.259457 894 memory_optimize_pass.cc:216] Cluster name : reshape2_0.tmp_1 size: 0 I0621 13:07:42.259477 894 memory_optimize_pass.cc:216] Cluster name : linear_1.tmp_1 size: 8 --- Running analysis [ir_graph_to_program_pass] I0621 13:07:42.308853 894 analysis_predictor.cc:1007] ======= optimize end ======= I0621 13:07:42.312048 894 naive_executor.cc:102] --- skip [feed], feed -> x I0621 13:07:42.312656 894 
naive_executor.cc:102] --- skip [save_infer_model/scale_0.tmp_1], fetch -> fetch I0621 13:07:42.422236 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:42.465929 894 fuse_pass_base.cc:57] --- detected 2 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:42.529808 894 fuse_pass_base.cc:57] --- detected 184 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] I0621 13:07:42.538357 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:42.548557 894 fuse_pass_base.cc:57] --- detected 19 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] I0621 13:07:42.553280 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [trt_map_matmul_v2_to_matmul_pass] I0621 13:07:42.554414 894 fuse_pass_base.cc:57] --- detected 4 subgraphs --- Running IR pass [trt_map_matmul_to_mul_pass] --- Running IR pass [fc_fuse_pass] I0621 
13:07:42.557754 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:42.563249 894 fuse_pass_base.cc:57] --- detected 23 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:42.572366 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 57 nodes I0621 13:07:42.577783 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:42.580443 894 op_converter.h:253] trt input [pool2d_5.tmp_0_clone_0] dynamic shape info not set, please check and retry.
Traceback (most recent call last):
  File "3.py", line 18, in <module>
    ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch')
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/paddleocr.py", line 437, in __init__
    super().__init__(params)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_system.py", line 47, in __init__
    self.text_recognizer = predict_rec.TextRecognizer(args)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_rec.py", line 74, in __init__
    utility.create_predictor(args, 'rec', logger)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/utility.py", line 313, in create_predictor
    predictor = inference.create_predictor(config)
ValueError: (InvalidArgument) some trt inputs dynamic shape info not set, check the INFO log above for more details.
  [Hint: Expected all_dynamic_shape_set == true, but received all_dynamic_shape_set:0 != true:1.] (at /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:287)

Code:

import os
import sys
import time

import cv2

sys.path.insert(0, '/usr/local/python3.7/lib/python3.7/site-packages/paddleocr')

from paddleocr import PaddleOCR

PKG_PATTERN = r'PKG.*:'

root_path = os.path.join('/home', 'ocr_model')
cls_model_dir = os.path.join(root_path, 'cls_infer')
det_model_dir = os.path.join(root_path, 'v3_det_infer/ch_PP-OCRv3_det_infer')
rec_model_dir = os.path.join(root_path, 'v3_rec_infer/ch_PP-OCRv3_rec_infer')
addleOCR = PaddleOCR(cls_model_dir=cls_model_dir, det_model_dir=det_model_dir, rec_model_dir=rec_model_dir,
                     ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch')

frame = cv2.imread('/home/095.png')
print(frame.shape)
# Run the same image repeatedly and print the time of each call.
while True:
    s1 = time.time()
    result = addleOCR.ocr(frame, cls=False)
    print('exec time:' + str(time.time() - s1))
    print(result)

shihaitao118 avatar Jun 21 '22 13:06 shihaitao118

The error appears to come from the recognition stage. First check whether your code contains this line: https://github.com/PaddlePaddle/PaddleOCR/blob/0b34ad5b93f19cf2324308e52bee59daeb164c4d/tools/infer/utility.py#L279

If the error persists after adding it, follow the FAQ to fix it: https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.5/doc/doc_ch/FAQ.md#q-trt%E9%A2%84%E6%B5%8B%E6%8A%A5%E9%94%99invalidargumenterror-some-trt-inputs-dynamic-shape-info-not-set-check-the-info-log-above-for-more-details

LDOUBLEV avatar Jun 22 '22 08:06 LDOUBLEV
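The error in the log names the tensor whose dynamic shape is missing (`pool2d_5.tmp_0_clone_0`). A minimal sketch of how min/max/opt shapes for such a tensor could be collected before handing them to Paddle Inference is shown below. The helper `add_dynamic_shape` is hypothetical, and the shape values are placeholders rather than the real dimensions of this tensor; the commented call assumes a `paddle.inference.Config` object named `config` and its `set_trt_dynamic_shape_info(min, max, opt)` method.

```python
def add_dynamic_shape(shapes, tensor_name, min_shape, max_shape, opt_shape):
    """Register min/max/opt shapes for one tensor (hypothetical helper)."""
    if not (len(min_shape) == len(max_shape) == len(opt_shape)):
        raise ValueError("min/max/opt shapes must have the same rank")
    for lo, opt, hi in zip(min_shape, opt_shape, max_shape):
        if not lo <= opt <= hi:
            raise ValueError("each dimension needs min <= opt <= max")
    shapes["min"][tensor_name] = list(min_shape)
    shapes["max"][tensor_name] = list(max_shape)
    shapes["opt"][tensor_name] = list(opt_shape)
    return shapes


shapes = {"min": {}, "max": {}, "opt": {}}
# Tensor name copied from the log line "trt input [...] dynamic shape info not set".
# The shape values are PLACEHOLDERS; read the real ones from the model (e.g. in Netron).
add_dynamic_shape(
    shapes,
    "pool2d_5.tmp_0_clone_0",
    min_shape=[1, 512, 1, 1],
    max_shape=[6, 512, 1, 80],
    opt_shape=[6, 512, 1, 40],
)

# With a paddle.inference.Config object named `config` (not created here):
# config.set_trt_dynamic_shape_info(shapes["min"], shapes["max"], shapes["opt"])
print(shapes["opt"])
```

Keeping the three dictionaries together makes it harder to register a tensor in one of them and forget the other two.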

After adding that line of code, the inference step fails. The error is as follows:
[2022/06/22 09:16:28] ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/ocr_model/cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/ocr_model/v3_det_infer/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_image_shape='3, 48, 320', rec_model_dir='/home/ocr_model/v3_rec_infer/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=True, vis_font_path='./doc/fonts/simfang.ttf', warmup=False)
(1080, 1440, 3)
[2022/06/22 09:17:05] ppocr WARNING: Since the angle classifier is not initialized, the angle classifier will not be uesd during the forward process
Traceback (most recent call last):
  File "1.py", line 24, in <module>
    result = addleOCR.ocr(frame)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/paddleocr.py", line 474, in ocr
    dt_boxes, rec_res = self.__call__(img, cls)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_system.py", line 69, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_det.py", line 242, in __call__
    post_result = self.postprocess_op(preds, shape_list)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/postprocess/db_postprocess.py", line 176, in __call__
    pred = pred[:, 0, :, :]
IndexError: too many indices for array: array is 2-dimensional, but 4 were indexed

shihaitao118 avatar Jun 22 '22 09:06 shihaitao118

The complete log is as follows:
[2022/06/22 09:21:28] ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/ocr_model/cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/ocr_model/v3_det_infer/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_image_shape='3, 48, 320', rec_model_dir='/home/ocr_model/v3_rec_infer/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=False, 
use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=True, vis_font_path='./doc/fonts/simfang.ttf', warmup=False) W0622 09:21:30.310395 269 analysis_predictor.cc:1086] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect. I0622 09:21:30.334146 269 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0622 09:21:30.370388 269 fuse_pass_base.cc:57] --- detected 10 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0622 09:21:30.435196 269 fuse_pass_base.cc:57] --- detected 185 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0622 09:21:30.456609 269 fuse_pass_base.cc:57] --- detected 24 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] --- Running IR pass [trt_map_matmul_v2_to_matmul_pass] --- Running IR 
pass [trt_map_matmul_to_mul_pass] I0622 09:21:30.462234 269 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [fc_fuse_pass] I0622 09:21:30.463685 269 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0622 09:21:30.474581 269 fuse_pass_base.cc:57] --- detected 42 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0622 09:21:30.486963 269 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 133 nodes I0622 09:21:30.503855 269 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0622 09:21:31.007433 269 engine.cc:203] Run Paddle-TRT Dynamic Shape mode. I0622 09:21:51.494590 269 engine.cc:424] Inspector needs TensorRT version 8.2 and after. --- Running IR pass [conv_bn_fuse_pass] --- Running IR pass [conv_elementwise_add_act_fuse_pass] --- Running IR pass [conv_elementwise_add2_act_fuse_pass] --- Running IR pass [transpose_flatten_concat_fuse_pass] --- Running analysis [ir_params_sync_among_devices_pass] I0622 09:21:51.514413 269 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU --- Running analysis [adjust_cudnn_workspace_size_pass] --- Running analysis [inference_op_replace_pass] --- Running analysis [memory_optimize_pass] I0622 09:21:51.518471 269 memory_optimize_pass.cc:216] Cluster name : shape_1.tmp_0_slice_0 size: 4 I0622 09:21:51.518503 269 memory_optimize_pass.cc:216] Cluster name : shape_0.tmp_0 size: 16 I0622 09:21:51.518512 269 memory_optimize_pass.cc:216] Cluster name : reshape2_0.tmp_1 size: 0 I0622 09:21:51.518535 269 memory_optimize_pass.cc:216] Cluster name : linear_1.tmp_1 size: 8 --- Running analysis [ir_graph_to_program_pass] I0622 09:21:51.569360 269 analysis_predictor.cc:1007] ======= optimize end ======= I0622 09:21:51.572525 269 naive_executor.cc:102] --- skip [feed], feed -> x I0622 09:21:51.573186 269 naive_executor.cc:102] --- skip 
[save_infer_model/scale_0.tmp_1], fetch -> fetch I0622 09:21:51.682315 269 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0622 09:21:51.725394 269 fuse_pass_base.cc:57] --- detected 2 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0622 09:21:51.788177 269 fuse_pass_base.cc:57] --- detected 184 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] I0622 09:21:51.797163 269 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0622 09:21:51.807322 269 fuse_pass_base.cc:57] --- detected 19 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] I0622 09:21:51.811800 269 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [trt_map_matmul_v2_to_matmul_pass] I0622 09:21:51.812944 269 fuse_pass_base.cc:57] --- detected 4 subgraphs --- Running IR pass [trt_map_matmul_to_mul_pass] --- Running IR pass [fc_fuse_pass] I0622 09:21:51.816274 269 
fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0622 09:21:51.821717 269 fuse_pass_base.cc:57] --- detected 23 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0622 09:21:51.829370 269 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 80 nodes I0622 09:21:51.839471 269 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0622 09:21:51.854321 269 engine.cc:203] Run Paddle-TRT Dynamic Shape mode. I0622 09:22:04.607172 269 engine.cc:424] Inspector needs TensorRT version 8.2 and after. --- Running IR pass [conv_bn_fuse_pass] --- Running IR pass [conv_elementwise_add_act_fuse_pass] --- Running IR pass [conv_elementwise_add2_act_fuse_pass] --- Running IR pass [transpose_flatten_concat_fuse_pass] --- Running analysis [ir_params_sync_among_devices_pass] I0622 09:22:04.626668 269 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU --- Running analysis [adjust_cudnn_workspace_size_pass] --- Running analysis [inference_op_replace_pass] --- Running analysis [memory_optimize_pass] I0622 09:22:04.636153 269 memory_optimize_pass.cc:216] Cluster name : transpose_11.tmp_1 size: 0 I0622 09:22:04.636188 269 memory_optimize_pass.cc:216] Cluster name : pool2d_5.tmp_0_clone_0 size: 2048 I0622 09:22:04.636200 269 memory_optimize_pass.cc:216] Cluster name : transpose_10.tmp_0_slice_2 size: 480 I0622 09:22:04.636214 269 memory_optimize_pass.cc:216] Cluster name : linear_43.tmp_1 size: 26500 I0622 09:22:04.636233 269 memory_optimize_pass.cc:216] Cluster name : swish_25.tmp_0 size: 2048 I0622 09:22:04.636251 269 memory_optimize_pass.cc:216] Cluster name : tmp_8 size: 480 I0622 09:22:04.636267 269 memory_optimize_pass.cc:216] Cluster name : tmp_10 size: 480 --- Running analysis [ir_graph_to_program_pass] I0622 09:22:04.690970 269 analysis_predictor.cc:1007] ======= optimize end ======= I0622 09:22:04.695168 269 
naive_executor.cc:102] --- skip [feed], feed -> x I0622 09:22:04.696712 269 naive_executor.cc:102] --- skip [softmax_5.tmp_0], fetch -> fetch
(1080, 1440, 3)
[2022/06/22 09:22:04] ppocr WARNING: Since the angle classifier is not initialized, the angle classifier will not be uesd during the forward process
W0622 09:22:04.775489 269 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.2, Runtime API Version: 10.2
W0622 09:22:04.776046 269 gpu_context.cc:306] device: 0, cuDNN Version: 8.4.
Traceback (most recent call last):
  File "1.py", line 24, in <module>
    result = addleOCR.ocr(frame)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/paddleocr.py", line 474, in ocr
    dt_boxes, rec_res = self.__call__(img, cls)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_system.py", line 69, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_det.py", line 242, in __call__
    post_result = self.postprocess_op(preds, shape_list)
  File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/postprocess/db_postprocess.py", line 176, in __call__
    pred = pred[:, 0, :, :]
IndexError: too many indices for array: array is 2-dimensional, but 4 were indexed

shihaitao118 avatar Jun 22 '22 09:06 shihaitao118

pred here should be a four-dimensional array; print its shape and check: https://github.com/PaddlePaddle/PaddleOCR/blob/0b34ad5b93f19cf2324308e52bee59daeb164c4d/ppocr/postprocess/db_postprocess.py#L173

Would it be convenient to send your test image?

LDOUBLEV avatar Jun 22 '22 09:06 LDOUBLEV

My company email cannot send images externally, and I am not sure whether there is another way to get the image to you.

shihaitao118 avatar Jun 22 '22 09:06 shihaitao118

image

shihaitao118 avatar Jun 22 '22 09:06 shihaitao118

image

image

shihaitao118 avatar Jun 22 '22 09:06 shihaitao118

I printed the value of pred:

image

[[0.05592113 0.94407886]] (1, 2)

shihaitao118 avatar Jun 22 '22 09:06 shihaitao118
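The printed array has shape (1, 2), while the failing statement `pred[:, 0, :, :]` needs four dimensions. A small guard such as the one below turns the opaque IndexError into a message that points at the real mismatch. This is only a sketch: `check_pred_shape` is a hypothetical helper, not part of PaddleOCR, and the 4-D sizes in the second call are illustrative.

```python
def check_pred_shape(shape, expected_ndim=4):
    """Fail early with a readable message if the prediction map has the wrong rank."""
    shape = tuple(shape)
    if len(shape) != expected_ndim:
        raise ValueError(
            f"expected a {expected_ndim}-D prediction map, got shape {shape}; "
            "check which model produced this output"
        )
    return shape


# Shape reported in this thread: a (1, 2) array cannot be indexed as pred[:, 0, :, :].
try:
    check_pred_shape((1, 2))
except ValueError as err:
    message = str(err)
    print(message)

# A 4-D shape passes (the sizes here are illustrative only).
ok_shape = check_pred_shape((1, 1, 736, 960))
```

In real code the argument would be `pred.shape`, checked just before the indexing statement.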

Did you modify the code? The test result is fine on my side.

image

LDOUBLEV avatar Jun 22 '22 11:06 LDOUBLEV

Yes, the code was modified; I am using TensorRT.

image

shihaitao118 avatar Jun 22 '22 11:06 shihaitao118

image

shihaitao118 avatar Jun 22 '22 11:06 shihaitao118

Does prediction work correctly with TensorRT disabled?

LDOUBLEV avatar Jun 22 '22 11:06 LDOUBLEV

image

With TensorRT disabled it no longer works either, although it worked before.

shihaitao118 avatar Jun 22 '22 11:06 shihaitao118

Reinstall paddleocr and add this line: https://github.com/PaddlePaddle/PaddleOCR/blob/0b34ad5b93f19cf2324308e52bee59daeb164c4d/tools/infer/utility.py#L279

With it, prediction with TRT enabled works fine on my side as well.

image

LDOUBLEV avatar Jun 22 '22 11:06 LDOUBLEV

Could the paddlepaddle-gpu 2.3.0 package be the problem? I downloaded it from https://www.paddlepaddle.org.cn/inference/v2.3/user_guides/download_lib.html#python

image

shihaitao118 avatar Jun 22 '22 11:06 shihaitao118

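It is not a Paddle problem. I installed the same Paddle build you are using and prediction still works:

image

Uninstall paddleocr first, then reinstall it.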

LDOUBLEV avatar Jun 22 '22 11:06 LDOUBLEV

Reinstalling still did not help.

I set up a fresh Paddle environment: paddle-bfloat 0.1.2, paddle2onnx 0.5, paddlehub 1.8.3, paddleocr 2.5, paddlepaddle-gpu 2.3.0, paddleslim 1.1.1, paddlex 1.3.7, tensorrt 7.0.0.11.

After adding the line you mentioned, TensorRT now runs. However, when I measured the time, the speed had not improved, and I am not sure why.

shihaitao118 avatar Jun 22 '22 11:06 shihaitao118

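TRT speeds up the detection model by 20-30%; the speedup for the recognition model is not noticeable.

The first image takes longer because TRT has to initialize. Once the predictor is initialized, consecutive predictions are much faster. You can set --image_dir to a directory of images and observe the time of consecutive predictions.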

LDOUBLEV avatar Jun 22 '22 11:06 LDOUBLEV
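Because the first call includes the TRT initialization cost, a fair comparison should discard a few warm-up calls before timing. The sketch below shows one way to do that; `benchmark` is a hypothetical helper, and the stand-in workload exists only so the example runs without PaddleOCR installed.

```python
import time


def benchmark(predict, warmup=3, runs=10):
    """Return the mean seconds per call, measured after `warmup` untimed calls."""
    for _ in range(warmup):
        predict()  # untimed: absorbs one-off initialization cost
    start = time.perf_counter()
    for _ in range(runs):
        predict()
    return (time.perf_counter() - start) / runs


# Stand-in workload so the sketch runs anywhere; for a real measurement replace it
# with something like `lambda: ocr_engine.ocr(frame, cls=False)`.
calls = []
mean_seconds = benchmark(lambda: calls.append(1), warmup=2, runs=5)
print(f"mean time per call: {mean_seconds:.6f}s over 5 runs")
```

Running the same helper once with `use_tensorrt=True` and once without gives two directly comparable averages.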

image

I ran inference on a single image in a loop and printed the time; it is essentially the same as without acceleration.

image

Which models do you mean by detection and recognition? How does the code show which model I am using?

shihaitao118 avatar Jun 22 '22 11:06 shihaitao118

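Also, as long as prediction with TRT enabled runs without errors, setting min_subgraph_size as small as possible gives a slight additional speedup: https://github.com/PaddlePaddle/PaddleOCR/blob/0b34ad5b93f19cf2324308e52bee59daeb164c4d/tools/infer/utility.py#L40

"Which models do you mean by detection and recognition? How does the code show which model I am using?"

det_model_dir points to the text detection model; rec_model_dir points to the text recognition model.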

LDOUBLEV avatar Jun 22 '22 11:06 LDOUBLEV

With min_subgraph_size set below 15, TensorRT acceleration always fails with an error. My OCR pipeline uses both of these models (detection and recognition), so in principle the speed should improve by 20%~30%?

I would also like to ask: if I want to run these two models separately, is there sample code? I just tested it, and if I remove det_model_dir or rec_model_dir, ocr automatically downloads the missing model.

shihaitao118 avatar Jun 22 '22 12:06 shihaitao118

See this document: https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.5/doc/doc_ch/whl.md and this code: https://github.com/PaddlePaddle/PaddleOCR/blob/0b34ad5b93f19cf2324308e52bee59daeb164c4d/paddleocr.py#L445 To use a model on its own, set only the det_model_dir or rec_model_dir path. If you only need detection, call ocr(self, img, det=True, rec=False, cls=False).
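The detection-only call described above can be wrapped as in the sketch below. `detect_only` and `FakeEngine` are hypothetical names introduced for illustration; `FakeEngine` merely stands in for a PaddleOCR instance so the example runs without Paddle installed, and only the keyword arguments `det`, `rec` and `cls` come from the thread.

```python
def detect_only(ocr_engine, img):
    """Run only text detection, using the ocr() keyword arguments quoted above."""
    return ocr_engine.ocr(img, det=True, rec=False, cls=False)


class FakeEngine:
    """Stand-in for a PaddleOCR instance so the sketch runs without Paddle installed."""

    def ocr(self, img, det=True, rec=True, cls=True):
        # Record the flags so the caller can verify what was requested.
        self.last_call = {"det": det, "rec": rec, "cls": cls}
        return []  # placeholder result; a real engine returns its detection output


engine = FakeEngine()
boxes = detect_only(engine, img="placeholder image")
print(engine.last_call)
```

With a real engine, `img` would be the array returned by `cv2.imread`, as in the script at the top of the issue.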

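"My OCR pipeline uses both of these models, so in principle the speed should improve by 20%~30%?"

No. Even when both models are used, only the detection model's inference step is accelerated. Detection preprocessing and postprocessing, visualization of the detection boxes, and recognition inference are not accelerated, so the overall speedup is not noticeable.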

LDOUBLEV avatar Jun 22 '22 12:06 LDOUBLEV
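The reason the end-to-end gain is small when only one stage gets faster follows from Amdahl's law. The sketch below works through it with assumed numbers: the 30% share of detection inference and the 1.25x stage speedup are illustrative, not measurements from this thread.

```python
def overall_speedup(accelerated_fraction, stage_speedup):
    """Amdahl's law: whole-pipeline speedup when one stage gets faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / stage_speedup)


# ASSUMED numbers for illustration: detection inference takes 30% of the
# end-to-end time and runs 1.25x faster with TRT.
speedup = overall_speedup(accelerated_fraction=0.30, stage_speedup=1.25)
print(f"overall speedup: {speedup:.3f}x")
```

Under these assumptions the pipeline as a whole is only about 1.06x faster, which is easy to lose in timing noise.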

OK, thank you for the explanation.

shihaitao118 avatar Jun 22 '22 12:06 shihaitao118

One last question: when TensorRT acceleration is used, does Paddle generate engine files (the converted TRT models) for the three models det_model, rec_model and cls_model? If such temporary files exist, which directory are they in? During earlier tests I saw the log print an engine model path, but I do not know whether that was the converted TRT model file.

shihaitao118 avatar Jun 22 '22 12:06 shihaitao118

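image

Regarding the dynamic shape mentioned here: it says to open the network structure in Netron and search for the corresponding parameter. Where exactly can I see this dynamic shape?

image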

shihaitao118 avatar Jun 22 '22 12:06 shihaitao118

image

shihaitao118 avatar Jun 22 '22 12:06 shihaitao118

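@LDOUBLEV Hello. For a local C++ deployment of PaddleOCR with VS2019, is it true that only the dygraph branch supports TensorRT acceleration? I ran into this problem as well when deploying from the release/2.5 branch. Any guidance would be appreciated, thanks.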

W0624 09:49:44.383039  8016 helper.h:107] TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.3.1
W0624 09:49:44.445314  8016 helper.h:107] TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.3.1
I0624 09:49:44.447310  8016 engine.cc:424] Inspector needs TensorRT version 8.2 and after.
I0624 09:49:44.453294  8016 tensorrt_subgraph_pass.cc:141] ---  detect a sub-graph with 5 nodes
I0624 09:49:44.453294  8016 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I0624 09:49:44.453294  8016 op_converter.h:253] trt input [lstm_1.tmp_0] dynamic shape info not set, please check and retry.


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
Not support stack backtrace yet.

----------------------
Error Message Summary:
----------------------
InvalidArgumentError: some trt inputs dynamic shape info not set, check the INFO log above for more details.
  [Hint: Expected all_dynamic_shape_set == true, but received all_dynamic_shape_set:0 != true:1.] (at ..\paddle/fluid/inference/tensorrt/convert/op_converter.h:287)

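The models are ch_PP-OCRv2_det_infer and ch_PP-OCRv2_rec_infer. I looked at issue 3890 and inspected the inference model structure, but a search did not turn up the shape of the parameter conv2d_124.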

chccc1994 avatar Jun 24 '22 00:06 chccc1994

image

Based on the error message, configure it as described in the FAQ: https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.5/doc/doc_ch/FAQ.md#q-trt%E9%A2%84%E6%B5%8B%E6%8A%A5%E9%94%99invalidargumenterror-some-trt-inputs-dynamic-shape-info-not-set-check-the-info-log-above-for-more-details

LDOUBLEV avatar Jun 24 '22 03:06 LDOUBLEV

@shihaitao118 After visualizing the model in netron.app, you can see the shape information:

image

LDOUBLEV avatar Jun 24 '22 03:06 LDOUBLEV