
Error when training the ppocrv4 distillation model

Open QQQTAO opened this issue 1 year ago • 9 comments

Please provide the following information to quickly locate the problem

  • System Environment: paddlepaddle-gpu 2.3.2.post112
  • Version: Paddle: PaddleOCR: Related components: paddleocr2.7
  • Command Code: python -m paddle.distributed.launch --log_dir=./debug --gpus '0,1,3,4' tools/train.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_distill.yml
  • Complete Error Message:
INFO 2023-08-11 10:10:52,331 launch_utils.py:563] details about PADDLE_TRAINER_ENDPOINTS can be found in ./debug/centerloss/endpoints.log, and detail running logs maybe found in ./debug/centerloss/workerlog.0
INFO 2023-08-11 10:10:52,331 launch_utils.py:563] details about PADDLE_TRAINER_ENDPOINTS can be found in ./debug/centerloss/endpoints.log, and detail running logs maybe found in ./debug/centerloss/workerlog.0
launch proc_id:22211 idx:0
launch proc_id:22216 idx:1
launch proc_id:22221 idx:2
launch proc_id:22229 idx:3
[2023/08/11 10:10:54] ppocr INFO: Architecture :
[2023/08/11 10:10:54] ppocr INFO: Models :
[2023/08/11 10:10:54] ppocr INFO: Student :
[2023/08/11 10:10:54] ppocr INFO: Backbone :
[2023/08/11 10:10:54] ppocr INFO: name : PPLCNetV3
[2023/08/11 10:10:54] ppocr INFO: scale : 0.95
[2023/08/11 10:10:54] ppocr INFO: Head :
[2023/08/11 10:10:54] ppocr INFO: head_list :
[2023/08/11 10:10:54] ppocr INFO: CTCHead :
[2023/08/11 10:10:54] ppocr INFO: Head :
[2023/08/11 10:10:54] ppocr INFO: fc_decay : 1e-05
[2023/08/11 10:10:54] ppocr INFO: Neck :
[2023/08/11 10:10:54] ppocr INFO: depth : 2
[2023/08/11 10:10:54] ppocr INFO: dims : 120
[2023/08/11 10:10:54] ppocr INFO: hidden_dims : 120
[2023/08/11 10:10:54] ppocr INFO: kernel_size : [1, 3]
[2023/08/11 10:10:54] ppocr INFO: name : svtr
[2023/08/11 10:10:54] ppocr INFO: use_guide : True
[2023/08/11 10:10:54] ppocr INFO: NRTRHead :
[2023/08/11 10:10:54] ppocr INFO: max_text_length : 25
[2023/08/11 10:10:54] ppocr INFO: nrtr_dim : 384
[2023/08/11 10:10:54] ppocr INFO: name : MultiHead
[2023/08/11 10:10:54] ppocr INFO: Transform : None
[2023/08/11 10:10:54] ppocr INFO: algorithm : SVTR
[2023/08/11 10:10:54] ppocr INFO: freeze_params : False
[2023/08/11 10:10:54] ppocr INFO: model_type : rec
[2023/08/11 10:10:54] ppocr INFO: pretrained : None
[2023/08/11 10:10:54] ppocr INFO: return_all_feats : True
[2023/08/11 10:10:54] ppocr INFO: Teacher :
[2023/08/11 10:10:54] ppocr INFO: Backbone :
[2023/08/11 10:10:54] ppocr INFO: depth : [3, 6, 3]
[2023/08/11 10:10:54] ppocr INFO: embed_dim : [64, 128, 256]
[2023/08/11 10:10:54] ppocr INFO: img_size : [48, 320]
[2023/08/11 10:10:54] ppocr INFO: last_stage : False
[2023/08/11 10:10:54] ppocr INFO: local_mixer : [[5, 5], [5, 5], [5, 5]]
[2023/08/11 10:10:54] ppocr INFO: mixer : ['Conv', 'Conv', 'Conv', 'Conv', 'Conv', 'Conv', 'Global', 'Global', 'Global', 'Global', 'Global', 'Global']
[2023/08/11 10:10:54] ppocr INFO: name : SVTRNet
[2023/08/11 10:10:54] ppocr INFO: num_heads : [2, 4, 8]
[2023/08/11 10:10:54] ppocr INFO: out_channels : 192
[2023/08/11 10:10:54] ppocr INFO: out_char_num : 40
[2023/08/11 10:10:54] ppocr INFO: patch_merging : Conv
[2023/08/11 10:10:54] ppocr INFO: prenorm : True
[2023/08/11 10:10:54] ppocr INFO: Head :
[2023/08/11 10:10:54] ppocr INFO: head_list :
[2023/08/11 10:10:54] ppocr INFO: CTCHead :
[2023/08/11 10:10:54] ppocr INFO: Head :
[2023/08/11 10:10:54] ppocr INFO: fc_decay : 1e-05
[2023/08/11 10:10:54] ppocr INFO: Neck :
[2023/08/11 10:10:54] ppocr INFO: depth : 2
[2023/08/11 10:10:54] ppocr INFO: dims : 120
[2023/08/11 10:10:54] ppocr INFO: hidden_dims : 120
[2023/08/11 10:10:54] ppocr INFO: kernel_size : [1, 3]
[2023/08/11 10:10:54] ppocr INFO: name : svtr
[2023/08/11 10:10:54] ppocr INFO: use_guide : True
[2023/08/11 10:10:54] ppocr INFO: NRTRHead :
[2023/08/11 10:10:54] ppocr INFO: max_text_length : 25
[2023/08/11 10:10:54] ppocr INFO: nrtr_dim : 384
[2023/08/11 10:10:54] ppocr INFO: name : MultiHead
[2023/08/11 10:10:54] ppocr INFO: Transform : None
[2023/08/11 10:10:54] ppocr INFO: algorithm : SVTR
[2023/08/11 10:10:54] ppocr INFO: freeze_params : True
[2023/08/11 10:10:54] ppocr INFO: model_type : rec
[2023/08/11 10:10:54] ppocr INFO: pretrained : None
[2023/08/11 10:10:54] ppocr INFO: return_all_feats : True
[2023/08/11 10:10:54] ppocr INFO: algorithm : Distillation
[2023/08/11 10:10:54] ppocr INFO: model_type : rec
[2023/08/11 10:10:54] ppocr INFO: name : DistillationModel
[2023/08/11 10:10:54] ppocr INFO: Eval :
[2023/08/11 10:10:54] ppocr INFO: dataset :
[2023/08/11 10:10:54] ppocr INFO: data_dir : ./train_data
[2023/08/11 10:10:54] ppocr INFO: label_file_list : ['/home/qintao/ocr_data/real_data/new_label_test.txt']
[2023/08/11 10:10:54] ppocr INFO: name : SimpleDataSet
[2023/08/11 10:10:54] ppocr INFO: transforms :
[2023/08/11 10:10:54] ppocr INFO: DecodeImage :
[2023/08/11 10:10:54] ppocr INFO: channel_first : False
[2023/08/11 10:10:54] ppocr INFO: img_mode : BGR
[2023/08/11 10:10:54] ppocr INFO: MultiLabelEncode :
[2023/08/11 10:10:54] ppocr INFO: gtc_encode : NRTRLabelEncode
[2023/08/11 10:10:54] ppocr INFO: RecResizeImg :
[2023/08/11 10:10:54] ppocr INFO: image_shape : [3, 48, 320]
[2023/08/11 10:10:54] ppocr INFO: KeepKeys :
[2023/08/11 10:10:54] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_gtc', 'length', 'valid_ratio']
[2023/08/11 10:10:54] ppocr INFO: loader :
[2023/08/11 10:10:54] ppocr INFO: batch_size_per_card : 128
[2023/08/11 10:10:54] ppocr INFO: drop_last : False
[2023/08/11 10:10:54] ppocr INFO: num_workers : 4
[2023/08/11 10:10:54] ppocr INFO: shuffle : False
[2023/08/11 10:10:54] ppocr INFO: Global :
[2023/08/11 10:10:54] ppocr INFO: cal_metric_during_train : True
[2023/08/11 10:10:54] ppocr INFO: character_dict_path : ppocr/utils/ppocr_keys_v1.txt
[2023/08/11 10:10:54] ppocr INFO: checkpoints : ./pretrain_model/ch_PP-OCRv4_rec_train/student.pdparams
[2023/08/11 10:10:54] ppocr INFO: debug : False
[2023/08/11 10:10:54] ppocr INFO: distributed : True
[2023/08/11 10:10:54] ppocr INFO: epoch_num : 200
[2023/08/11 10:10:54] ppocr INFO: eval_batch_step : [0, 2000]
[2023/08/11 10:10:54] ppocr INFO: infer_img : doc/imgs_words/ch/word_1.jpg
[2023/08/11 10:10:54] ppocr INFO: infer_mode : False
[2023/08/11 10:10:54] ppocr INFO: log_smooth_window : 20
[2023/08/11 10:10:54] ppocr INFO: max_text_length : 25
[2023/08/11 10:10:54] ppocr INFO: pretrained_model : None
[2023/08/11 10:10:54] ppocr INFO: print_batch_step : 10
[2023/08/11 10:10:54] ppocr INFO: save_epoch_step : 40
[2023/08/11 10:10:54] ppocr INFO: save_inference_dir : None
[2023/08/11 10:10:54] ppocr INFO: save_model_dir : ./output/rec_dkd_400w_svtr_ctc_lcnet_blank_dkd0.1/
[2023/08/11 10:10:54] ppocr INFO: save_res_path : ./output/rec/predicts_ppocrv3.txt
[2023/08/11 10:10:54] ppocr INFO: use_gpu : True
[2023/08/11 10:10:54] ppocr INFO: use_space_char : True
[2023/08/11 10:10:54] ppocr INFO: use_visualdl : False
[2023/08/11 10:10:54] ppocr INFO: Loss :
[2023/08/11 10:10:54] ppocr INFO: loss_config_list :
[2023/08/11 10:10:54] ppocr INFO: DistillationDKDLoss :
[2023/08/11 10:10:54] ppocr INFO: alpha : 1.0
[2023/08/11 10:10:54] ppocr INFO: beta : 2.0
[2023/08/11 10:10:54] ppocr INFO: dis_head : gtc
[2023/08/11 10:10:54] ppocr INFO: key : head_out
[2023/08/11 10:10:54] ppocr INFO: model_name_pairs : [['Student', 'Teacher']]
[2023/08/11 10:10:54] ppocr INFO: multi_head : True
[2023/08/11 10:10:54] ppocr INFO: name : dkd
[2023/08/11 10:10:54] ppocr INFO: weight : 0.1
[2023/08/11 10:10:54] ppocr INFO: DistillationCTCLoss :
[2023/08/11 10:10:54] ppocr INFO: key : head_out
[2023/08/11 10:10:54] ppocr INFO: model_name_list : ['Student']
[2023/08/11 10:10:54] ppocr INFO: multi_head : True
[2023/08/11 10:10:54] ppocr INFO: weight : 1.0
[2023/08/11 10:10:54] ppocr INFO: DistillationNRTRLoss :
[2023/08/11 10:10:54] ppocr INFO: key : head_out
[2023/08/11 10:10:54] ppocr INFO: model_name_list : ['Student']
[2023/08/11 10:10:54] ppocr INFO: multi_head : True
[2023/08/11 10:10:54] ppocr INFO: smoothing : False
[2023/08/11 10:10:54] ppocr INFO: weight : 1.0
[2023/08/11 10:10:54] ppocr INFO: DistillCTCLogits :
[2023/08/11 10:10:54] ppocr INFO: key : head_out
[2023/08/11 10:10:54] ppocr INFO: model_name_pairs : [['Student', 'Teacher']]
[2023/08/11 10:10:54] ppocr INFO: reduction : mean
[2023/08/11 10:10:54] ppocr INFO: weight : 1.0
[2023/08/11 10:10:54] ppocr INFO: name : CombinedLoss
[2023/08/11 10:10:54] ppocr INFO: Metric :
[2023/08/11 10:10:54] ppocr INFO: base_metric_name : RecMetric
[2023/08/11 10:10:54] ppocr INFO: ignore_space : False
[2023/08/11 10:10:54] ppocr INFO: key : Student
[2023/08/11 10:10:54] ppocr INFO: main_indicator : acc
[2023/08/11 10:10:54] ppocr INFO: name : DistillationMetric
[2023/08/11 10:10:54] ppocr INFO: Optimizer :
[2023/08/11 10:10:54] ppocr INFO: beta1 : 0.9
[2023/08/11 10:10:54] ppocr INFO: beta2 : 0.999
[2023/08/11 10:10:54] ppocr INFO: lr :
[2023/08/11 10:10:54] ppocr INFO: learning_rate : 0.0005
[2023/08/11 10:10:54] ppocr INFO: name : Cosine
[2023/08/11 10:10:54] ppocr INFO: warmup_epoch : 2
[2023/08/11 10:10:54] ppocr INFO: name : Adam
[2023/08/11 10:10:54] ppocr INFO: regularizer :
[2023/08/11 10:10:54] ppocr INFO: factor : 3e-05
[2023/08/11 10:10:54] ppocr INFO: name : L2
[2023/08/11 10:10:54] ppocr INFO: PostProcess :
[2023/08/11 10:10:54] ppocr INFO: key : head_out
[2023/08/11 10:10:54] ppocr INFO: model_name : ['Student']
[2023/08/11 10:10:54] ppocr INFO: multi_head : True
[2023/08/11 10:10:54] ppocr INFO: name : DistillationCTCLabelDecode
[2023/08/11 10:10:54] ppocr INFO: Train :
[2023/08/11 10:10:54] ppocr INFO: dataset :
[2023/08/11 10:10:54] ppocr INFO: data_dir : ./train_data/
[2023/08/11 10:10:54] ppocr INFO: label_file_list : ['/home/qintao/ocr_data/real_data/new_label_train.txt', '/home/qintao/ocr_data/RecData2/new_labels_gen.txt', '/home/qintao/ocr_data/RecData2/hor_white/image_labels.txt']
[2023/08/11 10:10:54] ppocr INFO: name : SimpleDataSet
[2023/08/11 10:10:54] ppocr INFO: ratio_list : [1.0, 0.1, 1.0]
[2023/08/11 10:10:54] ppocr INFO: transforms :
[2023/08/11 10:10:54] ppocr INFO: DecodeImage :
[2023/08/11 10:10:54] ppocr INFO: channel_first : False
[2023/08/11 10:10:54] ppocr INFO: img_mode : BGR
[2023/08/11 10:10:54] ppocr INFO: RecAug : None
[2023/08/11 10:10:54] ppocr INFO: MultiLabelEncode :
[2023/08/11 10:10:54] ppocr INFO: gtc_encode : NRTRLabelEncode
[2023/08/11 10:10:54] ppocr INFO: KeepKeys :
[2023/08/11 10:10:54] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_gtc', 'length', 'valid_ratio']
[2023/08/11 10:10:54] ppocr INFO: loader :
[2023/08/11 10:10:54] ppocr INFO: batch_size_per_card : 128
[2023/08/11 10:10:54] ppocr INFO: drop_last : True
[2023/08/11 10:10:54] ppocr INFO: num_workers : 4
[2023/08/11 10:10:54] ppocr INFO: shuffle : True
[2023/08/11 10:10:54] ppocr INFO: use_shared_memory : True
[2023/08/11 10:10:54] ppocr INFO: profiler_options : None
[2023/08/11 10:10:54] ppocr INFO: train with paddle 2.3.2 and device Place(gpu:0)
I0811 10:10:54.980728 22211 nccl_context.cc:83] init nccl context nranks: 4 local rank: 0 gpu id: 0 ring id: 0
W0811 10:10:56.348552 22211 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.2, Runtime API Version: 11.2
W0811 10:10:56.353502 22211 gpu_resources.cc:91] device: 0, cuDNN Version: 8.1.
[2023/08/11 10:10:57] ppocr INFO: Initialize indexs of datasets:['/home/qintao/ocr_data/real_data/new_label_train.txt', '/home/qintao/ocr_data/RecData2/new_labels_gen.txt', '/home/qintao/ocr_data/RecData2/hor_white/image_labels.txt']
list index out of range
[2023/08/11 10:10:59] ppocr INFO: Initialize indexs of datasets:['/home/qintao/ocr_data/real_data/new_label_test.txt']
Traceback (most recent call last):
  File "tools/train.py", line 227, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 135, in main
    model = build_model(config['Architecture'])
  File "/home/qintao/PaddleOCRV4/ppocr/modeling/architectures/__init__.py", line 34, in build_model
    arch = getattr(mod, name)(config)
  File "/home/qintao/PaddleOCRV4/ppocr/modeling/architectures/distillation_model.py", line 47, in __init__
    model = BaseModel(model_config)
  File "/home/qintao/PaddleOCRV4/ppocr/modeling/architectures/base_model.py", line 76, in __init__
    self.head = build_head(config["Head"])
  File "/home/qintao/PaddleOCRV4/ppocr/modeling/heads/__init__.py", line 71, in build_head
    module_class = eval(module_name)(**config)
  File "/home/qintao/PaddleOCRV4/ppocr/modeling/heads/rec_multi_head.py", line 74, in __init__
    out_channels=out_channels_list['NRTRLabelDecode'])
KeyError: 'NRTRLabelDecode'

I used the config file that ships with the code and changed only the dataset settings and the maximum text length, yet this error occurs. While debugging I also found the config option that raises the error.

QQQTAO, Aug 11 '23 02:08
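
For readers trying to narrow this down: the traceback ends in a plain dictionary lookup, `out_channels_list['NRTRLabelDecode']`, so the dictionary that reaches the multi-head module has no entry under that key. Below is a minimal sketch of a check that could be placed just before such a lookup to show which keys are actually present. The helper name `describe_lookup` and the sample dictionary (its key `CTCLabelDecode` and the value 100) are hypothetical placeholders for illustration, not taken from PaddleOCR or from the log.

```python
def describe_lookup(mapping, wanted="NRTRLabelDecode"):
    """Report whether `wanted` is a key of `mapping`, listing the keys that are."""
    if wanted in mapping:
        return f"{wanted} -> {mapping[wanted]}"
    # Show the keys that do exist so the mismatch is visible at a glance.
    available = ", ".join(sorted(mapping)) or "(none)"
    return f"{wanted} is missing; available keys: {available}"


# Hypothetical dictionary standing in for out_channels_list; the key and
# value are placeholders, not values from the reported run.
sample = {"CTCLabelDecode": 100}
print(describe_lookup(sample))
```

Printing the available keys this way only shows what the head receives; it does not say which config entry was supposed to populate the missing key.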