PaddleOCR

Many lines of text are not recognized: is fine-tuning needed, or is it enough to adjust a few parameters?

Open sticktoFE opened this issue 2 years ago • 1 comments

Please provide the following information to quickly locate the problem

  • System Environment: Windows 11
  • Version: Paddle: latest GPU version; PaddleOCR: latest version
  • Command Code:

    1. Initialization parameters:

        self.ocr_structure_sys = PPStructure(
            # 1. Layout analysis model path
            layout_model_dir=str(dirName.joinpath("ocr", "inference", 'picodet_lcnet_x1_0_fgd_layout_cdla_infer')),
            layout_dict_path=str(dirName.joinpath("ocr", "inference", 'layout_cdla_dict.txt')),
            # 2. OCR text-region detection model path
            det_model_dir=str(dirName.joinpath("ocr", "inference", "ch_PP-OCRv3_det_infer")),
            # 2.1 OCR text-region orientation detection and correction; not sure whether PPStructure needs this??
            # use_angle_cls=True,
            # image_orientation=True,  # enabling this raises an error; cause still to be investigated
            # cls_model_dir=str(dirName.joinpath("ocr", "inference", 'ch_ppocr_mobile_v2.0_cls_infer')),
            # 3. OCR text-region recognition model path
            rec_model_dir=str(dirName.joinpath("ocr", "inference", 'ch_PP-OCRv3_rec_infer')),
            rec_char_dict_path=str(dirName.joinpath("ocr", "inference", 'ppocr_keys_v1.txt')),
            # 4. Table recognition model path (pick one of the two below; the first seems to perform worse??)
            # table_algorithm='TableMaster',
            # table_model_dir=str(dirName.joinpath("ocr", "inference", 'table_structure_tablemaster_infer')),
            table_model_dir=str(dirName.joinpath("ocr", "inference", 'ch_ppstructure_mobile_v2.0_SLANet_infer')),
            # Table recognition dictionary; when switching to the Chinese model the dictionary does not need to change
            # (I have not yet worked out whether the two below need to be swapped???)
            # table_char_dict_path=str(dirName.joinpath("ocr", "inference", 'table_structure_dict.txt')),
            table_char_dict_path=str(dirName.joinpath("ocr", "inference", 'table_structure_dict_ch.txt')),
            # 5. Layout recovery
            recovery=True,  # whether to perform layout recovery, default False
            vis_font_path=str(dirName.joinpath("ocr", "inference", 'simfang.ttf')),
            lang='ch',
            mode='structure',
            # Performance settings
            enable_mkldnn=True,
            use_gpu=True,
            gpu_mem=6000,
            use_mp=True,
            total_process_num=2,
            use_trt=True,
            # use_tensorrt=True,  # raises an error, parked here for now
            use_onnx=False,
            use_space_char=True,
            layout_label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
            # det_max_side_len=10000,
            # max_text_length=20000,  # maximum text length the recognition algorithm can handle
            # The defaults are det_limit_type='max', det_limit_side_len=960, i.e. the long side is at most 960.
            # det_limit_type='min', det_limit_side_len=32,  # short side at least 64.
            show_log=True)

    2. Run-time processing: cut the long image into short segments, recognize each one, then stitch the results back together:

        h, w, d = self.img_cv2.shape
        if h >= 640:
            # Split the long image into smaller images in units of 640 rows
            seg_num = math.ceil(h/640)
            for i in range(seg_num):
                img_cv2_seg = self.img_cv2[i*640:(i+1)*640, :, :]
                # Recognize one segment at a time
                resize_img_cv2, out_info, html_content = self.ocr_structure_sys(img_cv2_seg)
                ......
  • Complete Error Message:
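
One thing worth checking in step 2 above: cutting at exact multiples of 640 rows with no overlap can slice a text line in half at every seam, and a half-height line is easy for the detector to miss. A minimal sketch of computing overlapping row ranges instead is shown here. The helper name `segment_ranges` and the 64-row overlap are assumptions made for illustration; they are not part of PaddleOCR or of the code posted in this issue.

```python
def segment_ranges(height, seg_height=640, overlap=64):
    """Return (start, end) row ranges that cover [0, height) with overlapping seams.

    Hypothetical helper: the name and the default overlap are illustrative only.
    """
    if seg_height <= overlap:
        raise ValueError("seg_height must be larger than overlap")
    if height <= seg_height:
        # Short image: a single segment covers everything
        return [(0, height)]
    step = seg_height - overlap  # how far each new segment advances
    ranges = []
    start = 0
    while True:
        end = min(start + seg_height, height)
        ranges.append((start, end))
        if end == height:
            break
        start += step
    return ranges


print(segment_ranges(1500))
```

Each range would then be used as `self.img_cv2[start:end, :, :]` in place of the fixed `i*640:(i+1)*640` slice. Text that falls inside an overlap may be returned twice, so the stitching step would need to drop duplicates.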

sticktoFE, Sep 21 '22 04:09

The detection boxes are probably too small. There is a parameter called det_db_unclip_ratio; try increasing it.
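
For reference, a rough sketch of why a larger det_db_unclip_ratio gives bigger boxes: in DB post-processing, each detected (shrunken) text region is expanded outward by a distance of area × unclip_ratio / perimeter. The code below applies that formula to a plain axis-aligned rectangle. The function names are hypothetical, and treating the expanded box as simply growing by that distance on every side is a simplification of the polygon offset the real post-processing performs.

```python
def unclip_distance(width, height, unclip_ratio):
    """Offset distance used to expand a detected region: area * ratio / perimeter."""
    area = width * height
    perimeter = 2 * (width + height)
    return area * unclip_ratio / perimeter


def expanded_size(width, height, unclip_ratio):
    """Approximate size of the box after growing by the offset on every side."""
    d = unclip_distance(width, height, unclip_ratio)
    return width + 2 * d, height + 2 * d


# A long, thin text line of 200 x 20 pixels at a few ratios
for ratio in (1.5, 2.0, 2.5):
    w, h = expanded_size(200, 20, ratio)
    print(f"unclip_ratio={ratio}: about {w:.1f} x {h:.1f}")
```

In the initialization call from the issue, the value would presumably be passed as a keyword argument alongside the other detection settings, for example `det_db_unclip_ratio=2.0`; the right value depends on the images and needs to be found by trial.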

wangyake, Sep 23 '22 05:09

Adjust the DB thresholds in utility; the detector will then pick up the text more completely.
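
The DB thresholds in question are presumably det_db_thresh (the pixel-level binarization threshold) and det_db_box_thresh (the minimum score a candidate box needs to be kept). A simplified, pure-Python model of how the two interact is sketched below. The function `keep_region` is hypothetical, the default values 0.3 and 0.6 should be checked against your PaddleOCR version, and the real post-processing scores boxes over a 2-D probability map rather than a flat list.

```python
def keep_region(prob_pixels, det_db_thresh=0.3, det_db_box_thresh=0.6):
    """Decide whether a candidate text region survives both DB thresholds (simplified)."""
    # Step 1: binarize. Only pixels above det_db_thresh count as text.
    text_pixels = [p for p in prob_pixels if p > det_db_thresh]
    if not text_pixels:
        return False
    # Step 2: score the region by its mean probability and
    # discard it if the score is below det_db_box_thresh.
    score = sum(text_pixels) / len(text_pixels)
    return score >= det_db_box_thresh


# A faint text line: every pixel passes binarization, but the mean score is only 0.5
faint_line = [0.5, 0.55, 0.45, 0.5]
print(keep_region(faint_line))                         # dropped with the assumed defaults
print(keep_region(faint_line, det_db_box_thresh=0.4))  # kept once the box threshold is lowered
```

Lowering either threshold lets fainter lines through, at the cost of more false detections.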

Anjou-YES, Dec 01 '22 09:12

Tested it myself; it works.

Anjou-YES, Dec 01 '22 09:12