PaddleOCR
Loss is 0 when fine-tuning the v4 en-rec model
Please provide the following information to quickly locate the problem
- System Environment: win10
- Version: Paddle: PaddleOCR: Related components: ppocr2.7
- Command Code: D:\PYTHON\Anaconda3\envs\pd\python.exe D:\PYTHON\codes\PaddleOCR-release-2.7\train-muhao.py -c .\configs\rec\PP-OCRv4\en_PP-OCRv4_rec.yml -o Global.pretrained_model=./pre_train_models/en_PP-OCRv4_rec_train/best_accuracy
- Complete Error Message: After a few epochs of training, acc drops to 0 and the other values become nanxx
We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no): yes
Please try to not include the image in the issue.
D:\PYTHON\Anaconda3\envs\pd\python.exe D:\PYTHON\codes\PaddleOCR-release-2.7\train.py -c .\configs\rec\PP-OCRv4\en_PP-OCRv4_rec.yml -o Global.pretrained_model=./pre_train_models/en_PP-OCRv4_rec_train/best_accuracy
[2024/02/26 22:10:06] ppocr INFO: Architecture :
[2024/02/26 22:10:06] ppocr INFO:     Backbone :
[2024/02/26 22:10:06] ppocr INFO:         name : PPLCNetV3
[2024/02/26 22:10:06] ppocr INFO:         scale : 0.95
[2024/02/26 22:10:06] ppocr INFO:     Head :
[2024/02/26 22:10:06] ppocr INFO:         head_list :
[2024/02/26 22:10:06] ppocr INFO:             CTCHead :
[2024/02/26 22:10:06] ppocr INFO:                 Head :
[2024/02/26 22:10:06] ppocr INFO:                     fc_decay : 1e-05
[2024/02/26 22:10:06] ppocr INFO:                 Neck :
[2024/02/26 22:10:06] ppocr INFO:                     depth : 2
[2024/02/26 22:10:06] ppocr INFO:                     dims : 120
[2024/02/26 22:10:06] ppocr INFO:                     hidden_dims : 120
[2024/02/26 22:10:06] ppocr INFO:                     kernel_size : [1, 3]
[2024/02/26 22:10:06] ppocr INFO:                     name : svtr
[2024/02/26 22:10:06] ppocr INFO:                     use_guide : True
[2024/02/26 22:10:06] ppocr INFO:             NRTRHead :
[2024/02/26 22:10:06] ppocr INFO:                 max_text_length : 10
[2024/02/26 22:10:06] ppocr INFO:                 nrtr_dim : 384
[2024/02/26 22:10:06] ppocr INFO:         name : MultiHead
[2024/02/26 22:10:06] ppocr INFO:     Transform : None
[2024/02/26 22:10:06] ppocr INFO:     algorithm : SVTR_LCNet
[2024/02/26 22:10:06] ppocr INFO:     model_type : rec
[2024/02/26 22:10:06] ppocr INFO: Eval :
[2024/02/26 22:10:06] ppocr INFO:     dataset :
[2024/02/26 22:10:06] ppocr INFO:         data_dir : D:\PYTHON\pictures
[2024/02/26 22:10:06] ppocr INFO:         label_file_list : ['img\rec_eval.txt']
[2024/02/26 22:10:06] ppocr INFO:         name : SimpleDataSet
[2024/02/26 22:10:06] ppocr INFO:         transforms :
[2024/02/26 22:10:06] ppocr INFO:             DecodeImage :
[2024/02/26 22:10:06] ppocr INFO:                 channel_first : False
[2024/02/26 22:10:06] ppocr INFO:                 img_mode : BGR
[2024/02/26 22:10:06] ppocr INFO:             MultiLabelEncode :
[2024/02/26 22:10:06] ppocr INFO:                 gtc_encode : NRTRLabelEncode
[2024/02/26 22:10:06] ppocr INFO:             RecResizeImg :
[2024/02/26 22:10:06] ppocr INFO:                 image_shape : [3, 48, 96]
[2024/02/26 22:10:06] ppocr INFO:             KeepKeys :
[2024/02/26 22:10:06] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_gtc', 'length', 'valid_ratio']
[2024/02/26 22:10:06] ppocr INFO:     loader :
[2024/02/26 22:10:06] ppocr INFO:         batch_size_per_card : 128
[2024/02/26 22:10:06] ppocr INFO:         drop_last : False
[2024/02/26 22:10:06] ppocr INFO:         num_workers : 1
[2024/02/26 22:10:06] ppocr INFO:         shuffle : False
[2024/02/26 22:10:06] ppocr INFO: Global :
[2024/02/26 22:10:06] ppocr INFO:     cal_metric_during_train : True
[2024/02/26 22:10:06] ppocr INFO:     character_dict_path : ppocr/utils/en_dict.txt
[2024/02/26 22:10:06] ppocr INFO:     checkpoints : None
[2024/02/26 22:10:06] ppocr INFO:     debug : False
[2024/02/26 22:10:06] ppocr INFO:     distributed : False
[2024/02/26 22:10:06] ppocr INFO:     epoch_num : 50
[2024/02/26 22:10:06] ppocr INFO:     eval_batch_step : [0, 2000]
[2024/02/26 22:10:06] ppocr INFO:     infer_img : doc/imgs_words/ch/word_1.jpg
[2024/02/26 22:10:06] ppocr INFO:     infer_mode : False
[2024/02/26 22:10:06] ppocr INFO:     log_smooth_window : 20
[2024/02/26 22:10:06] ppocr INFO:     max_text_length : 10
[2024/02/26 22:10:06] ppocr INFO:     pretrained_model : ./pre_train_models/en_PP-OCRv4_rec_train/best_accuracy
[2024/02/26 22:10:06] ppocr INFO:     print_batch_step : 10
[2024/02/26 22:10:06] ppocr INFO:     save_epoch_step : 10
[2024/02/26 22:10:06] ppocr INFO:     save_inference_dir : None
[2024/02/26 22:10:06] ppocr INFO:     save_model_dir : ./output/rec_ppocr_v4
[2024/02/26 22:10:06] ppocr INFO:     save_res_path : ./output/rec/predicts_ppocrv3.txt
[2024/02/26 22:10:06] ppocr INFO:     use_gpu : True
[2024/02/26 22:10:06] ppocr INFO:     use_space_char : True
[2024/02/26 22:10:06] ppocr INFO:     use_visualdl : False
[2024/02/26 22:10:06] ppocr INFO: Loss :
[2024/02/26 22:10:06] ppocr INFO:     loss_config_list :
[2024/02/26 22:10:06] ppocr INFO:         CTCLoss : None
[2024/02/26 22:10:06] ppocr INFO:         NRTRLoss : None
[2024/02/26 22:10:06] ppocr INFO:     name : MultiLoss
[2024/02/26 22:10:06] ppocr INFO: Metric :
[2024/02/26 22:10:06] ppocr INFO:     ignore_space : False
[2024/02/26 22:10:06] ppocr INFO:     main_indicator : acc
[2024/02/26 22:10:06] ppocr INFO:     name : RecMetric
[2024/02/26 22:10:06] ppocr INFO: Optimizer :
[2024/02/26 22:10:06] ppocr INFO:     beta1 : 0.9
[2024/02/26 22:10:06] ppocr INFO:     beta2 : 0.999
[2024/02/26 22:10:06] ppocr INFO:     lr :
[2024/02/26 22:10:06] ppocr INFO:         learning_rate : 0.0005
[2024/02/26 22:10:06] ppocr INFO:         name : Cosine
[2024/02/26 22:10:06] ppocr INFO:         warmup_epoch : 5
[2024/02/26 22:10:06] ppocr INFO:     name : Adam
[2024/02/26 22:10:06] ppocr INFO:     regularizer :
[2024/02/26 22:10:06] ppocr INFO:         factor : 3e-05
[2024/02/26 22:10:06] ppocr INFO:         name : L2
[2024/02/26 22:10:06] ppocr INFO: PostProcess :
[2024/02/26 22:10:06] ppocr INFO:     name : CTCLabelDecode
[2024/02/26 22:10:06] ppocr INFO: Train :
[2024/02/26 22:10:06] ppocr INFO:     dataset :
[2024/02/26 22:10:06] ppocr INFO:         data_dir : D:\PYTHON\pictures
[2024/02/26 22:10:06] ppocr INFO:         ds_width : False
[2024/02/26 22:10:06] ppocr INFO:         ext_op_transform_idx : 1
[2024/02/26 22:10:06] ppocr INFO:         label_file_list : ['img\rec_train.txt']
[2024/02/26 22:10:06] ppocr INFO:         name : MultiScaleDataSet
[2024/02/26 22:10:06] ppocr INFO:         transforms :
[2024/02/26 22:10:06] ppocr INFO:             DecodeImage :
[2024/02/26 22:10:06] ppocr INFO:                 channel_first : False
[2024/02/26 22:10:06] ppocr INFO:                 img_mode : BGR
[2024/02/26 22:10:06] ppocr INFO:             RecConAug :
[2024/02/26 22:10:06] ppocr INFO:                 ext_data_num : 2
[2024/02/26 22:10:06] ppocr INFO:                 image_shape : [48, 96, 3]
[2024/02/26 22:10:06] ppocr INFO:                 max_text_length : 10
[2024/02/26 22:10:06] ppocr INFO:                 prob : 0.5
[2024/02/26 22:10:06] ppocr INFO:             RecAug : None
[2024/02/26 22:10:06] ppocr INFO:             MultiLabelEncode :
[2024/02/26 22:10:06] ppocr INFO:                 gtc_encode : NRTRLabelEncode
[2024/02/26 22:10:06] ppocr INFO:             KeepKeys :
[2024/02/26 22:10:06] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_gtc', 'length', 'valid_ratio']
[2024/02/26 22:10:06] ppocr INFO:     loader :
[2024/02/26 22:10:06] ppocr INFO:         batch_size_per_card : 96
[2024/02/26 22:10:06] ppocr INFO:         drop_last : True
[2024/02/26 22:10:06] ppocr INFO:         num_workers : 1
[2024/02/26 22:10:06] ppocr INFO:         shuffle : True
[2024/02/26 22:10:06] ppocr INFO:     sampler :
[2024/02/26 22:10:06] ppocr INFO:         divided_factor : [8, 16]
[2024/02/26 22:10:06] ppocr INFO:         first_bs : 96
[2024/02/26 22:10:06] ppocr INFO:         fix_bs : False
[2024/02/26 22:10:06] ppocr INFO:         is_training : True
[2024/02/26 22:10:06] ppocr INFO:         name : MultiScaleSampler
[2024/02/26 22:10:06] ppocr INFO:         scales : [[96, 32], [96, 48], [96, 64]]
[2024/02/26 22:10:06] ppocr INFO: profiler_options : None
[2024/02/26 22:10:06] ppocr INFO: train with paddle 2.3.2 and device Place(gpu:0)
[2024/02/26 22:10:06] ppocr INFO: Initialize indexs of datasets:['img\rec_train.txt']
list index out of range
[2024/02/26 22:10:06] ppocr INFO: Initialize indexs of datasets:['img\rec_eval.txt']
W0226 22:10:07.103623 23948 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.2
W0226 22:10:07.163653 23948 gpu_resources.cc:91] device: 0, cuDNN Version: 8.5.
INFO 2024-02-26 22:10:19,364 optimizer.py:162] If regularizer of a Parameter has been set by 'paddle.ParamAttr' or 'static.WeightNormParamAttr' already. The weight_decay[3e-05] in Optimizer will not take effect, and it will only be applied to other Parameters!
[2024/02/26 22:10:19] ppocr INFO: train dataloader has 72 iters
[2024/02/26 22:10:19] ppocr INFO: valid dataloader has 10 iters
[2024/02/26 22:10:20] ppocr INFO: load pretrain successful from ./pre_train_models/en_PP-OCRv4_rec_train/best_accuracy
[2024/02/26 22:10:20] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
[2024/02/26 22:10:43] ppocr INFO: epoch: [1/50], global_step: 10, lr: 0.000006, acc: 0.041667, norm_edit_dis: 0.264400, CTCLoss: 22.112082, NRTRLoss: 0.000000, loss: 21.547119, avg_reader_cost: 1.32500 s, avg_batch_cost: 2.37332 s, avg_samples: 76.8, ips: 32.35970 samples/s, eta: 2:22:00
[2024/02/26 22:10:56] ppocr INFO: epoch: [1/50], global_step: 20, lr: 0.000013, acc: 0.044271, norm_edit_dis: 0.263883, CTCLoss: 21.502228, NRTRLoss: 0.000000, loss: 21.449989, avg_reader_cost: 0.87371 s, avg_batch_cost: 1.30937 s, avg_samples: 72.0, ips: 54.98843 samples/s, eta: 1:49:52
[2024/02/26 22:11:06] ppocr INFO: epoch: [1/50], global_step: 30, lr: 0.000027, acc: 0.052083, norm_edit_dis: 0.269422, CTCLoss: 20.095638, NRTRLoss: 0.000000, loss: 20.095638, avg_reader_cost: 0.54818 s, avg_batch_cost: 0.96542 s, avg_samples: 70.4, ips: 72.92178 samples/s, eta: 1:32:11
[2024/02/26 22:11:13] ppocr INFO: epoch: [1/50], global_step: 40, lr: 0.000041, acc: 0.052083, norm_edit_dis: 0.286861, CTCLoss: 15.908986, NRTRLoss: 0.000000, loss: 16.115395, avg_reader_cost: 0.24486 s, avg_batch_cost: 0.67578 s, avg_samples: 65.6, ips: 97.07327 samples/s, eta: 1:18:58
[2024/02/26 22:11:18] ppocr INFO: epoch: [1/50], global_step: 50, lr: 0.000055, acc: 0.083333, norm_edit_dis: 0.339697, CTCLoss: 12.769976, NRTRLoss: 0.000000, loss: 12.738501, avg_reader_cost: 0.08555 s, avg_batch_cost: 0.54633 s, avg_samples: 60.8, ips: 111.28837 samples/s, eta: 1:09:27
[2024/02/26 22:11:23] ppocr INFO: epoch: [1/50], global_step: 60, lr: 0.000069, acc: 0.151042, norm_edit_dis: 0.416667, CTCLoss: 10.439315, NRTRLoss: 0.000000, loss: 10.450922, avg_reader_cost: 0.02620 s, avg_batch_cost: 0.48454 s, avg_samples: 64.0, ips: 132.08500 samples/s, eta: 1:02:29
[2024/02/26 22:11:28] ppocr INFO: epoch: [1/50], global_step: 70, lr: 0.000083, acc: 0.182292, norm_edit_dis: 0.438015, CTCLoss: 8.423864, NRTRLoss: 0.000000, loss: 8.423864, avg_reader_cost: 0.00480 s, avg_batch_cost: 0.47775 s, avg_samples: 73.6, ips: 154.05539 samples/s, eta: 0:57:25
[2024/02/26 22:11:29] ppocr INFO: save model in ./output/rec_ppocr_v4\latest
[2024/02/26 22:11:34] ppocr INFO: epoch: [2/50], global_step: 80, lr: 0.000097, acc: 0.231771, norm_edit_dis: 0.452865, CTCLoss: 7.541358, NRTRLoss: 0.000000, loss: 7.541358, avg_reader_cost: 0.06546 s, avg_batch_cost: 0.57578 s, avg_samples: 64.0, ips: 111.15327 samples/s, eta: 0:54:18
[2024/02/26 22:11:39] ppocr INFO: epoch: [2/50], global_step: 90, lr: 0.000110, acc: 0.234375, norm_edit_dis: 0.441927, CTCLoss: 7.055272, NRTRLoss: 0.000000, loss: 7.055272, avg_reader_cost: 0.00020 s, avg_batch_cost: 0.48734 s, avg_samples: 67.2, ips: 137.89159 samples/s, eta: 0:51:18
[2024/02/26 22:11:43] ppocr INFO: epoch: [2/50], global_step: 100, lr: 0.000124, acc: 0.250000, norm_edit_dis: 0.442758, CTCLoss: 6.601116, NRTRLoss: 0.000000, loss: 6.580836, avg_reader_cost: 0.00016 s, avg_batch_cost: 0.47722 s, avg_samples: 68.8, ips: 144.16936 samples/s, eta: 0:48:49
[2024/02/26 22:11:48] ppocr INFO: epoch: [2/50], global_step: 110, lr: 0.000138, acc: 0.250000, norm_edit_dis: 0.447743, CTCLoss: 6.323363, NRTRLoss: 0.000000, loss: 6.296344, avg_reader_cost: 0.00010 s, avg_batch_cost: 0.48471 s, avg_samples: 73.6, ips: 151.84265 samples/s, eta: 0:46:49
[2024/02/26 22:11:53] ppocr INFO: epoch: [2/50], global_step: 120, lr: 0.000152, acc: 0.250000, norm_edit_dis: 0.437240, CTCLoss: 6.173958, NRTRLoss: 0.000000, loss: 6.173958, avg_reader_cost: 0.00045 s, avg_batch_cost: 0.47134 s, avg_samples: 65.6, ips: 139.17896 samples/s, eta: 0:45:04
[2024/02/26 22:11:58] ppocr INFO: epoch: [2/50], global_step: 130, lr: 0.000166, acc: 0.276042, norm_edit_dis: 0.462761, CTCLoss: 5.208588, NRTRLoss: 0.000000, loss: 5.204882, avg_reader_cost: 0.00050 s, avg_batch_cost: 0.49596 s, avg_samples: 75.2, ips: 151.62433 samples/s, eta: 0:43:41
[2024/02/26 22:12:03] ppocr INFO: epoch: [2/50], global_step: 140, lr: 0.000180, acc: 0.286458, norm_edit_dis: 0.454948, CTCLoss: 5.117970, NRTRLoss: -0.000000, loss: 5.114107, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.45830 s, avg_samples: 73.6, ips: 160.59267 samples/s, eta: 0:42:20
[2024/02/26 22:12:04] ppocr INFO: save model in ./output/rec_ppocr_v4\latest
[2024/02/26 22:12:08] ppocr INFO: epoch: [3/50], global_step: 150, lr: 0.000194, acc: 0.229167, norm_edit_dis: 0.407379, CTCLoss: 5.243576, NRTRLoss: 0.000000, loss: 5.243576, avg_reader_cost: 0.06199 s, avg_batch_cost: 0.57789 s, avg_samples: 73.6, ips: 127.36036 samples/s, eta: 0:41:36
[2024/02/26 22:12:13] ppocr INFO: epoch: [3/50], global_step: 160, lr: 0.000208, acc: 0.229167, norm_edit_dis: 0.414497, CTCLoss: 4.979708, NRTRLoss: 0.000000, loss: 4.993014, avg_reader_cost: 0.00000 s, avg_batch_cost: 0.46680 s, avg_samples: 68.8, ips: 147.38596 samples/s, eta: 0:40:34
[2024/02/26 22:12:18] ppocr INFO: epoch: [3/50], global_step: 170, lr: 0.000222, acc: 0.171875, norm_edit_dis: 0.334722, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00010 s, avg_batch_cost: 0.48857 s, avg_samples: 68.8, ips: 140.82051 samples/s, eta: 0:39:42
[2024/02/26 22:12:23] ppocr INFO: epoch: [3/50], global_step: 180, lr: 0.000235, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00010 s, avg_batch_cost: 0.48262 s, avg_samples: 70.4, ips: 145.87084 samples/s, eta: 0:38:55
[2024/02/26 22:12:27] ppocr INFO: epoch: [3/50], global_step: 190, lr: 0.000249, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00025 s, avg_batch_cost: 0.47050 s, avg_samples: 76.8, ips: 163.23136 samples/s, eta: 0:38:10
[2024/02/26 22:12:32] ppocr INFO: epoch: [3/50], global_step: 200, lr: 0.000263, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00025 s, avg_batch_cost: 0.48621 s, avg_samples: 64.0, ips: 131.63061 samples/s, eta: 0:37:32
[2024/02/26 22:12:37] ppocr INFO: epoch: [3/50], global_step: 210, lr: 0.000277, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00025 s, avg_batch_cost: 0.46303 s, avg_samples: 68.8, ips: 148.58720 samples/s, eta: 0:36:53
[2024/02/26 22:12:38] ppocr INFO: save model in ./output/rec_ppocr_v4\latest
[2024/02/26 22:12:42] ppocr INFO: epoch: [4/50], global_step: 220, lr: 0.000291, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.05028 s, avg_batch_cost: 0.52260 s, avg_samples: 59.2, ips: 113.27984 samples/s, eta: 0:36:26
[2024/02/26 22:12:47] ppocr INFO: epoch: [4/50], global_step: 230, lr: 0.000305, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00030 s, avg_batch_cost: 0.48810 s, avg_samples: 68.8, ips: 140.95372 samples/s, eta: 0:35:56
[2024/02/26 22:12:52] ppocr INFO: epoch: [4/50], global_step: 240, lr: 0.000319, acc: 0.000000, norm_edit_dis: 0.000000, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00036 s, avg_batch_cost: 0.47478 s, avg_samples: 67.2, ips: 141.53904 samples/s, eta: 0:35:26
Process finished with exit code -1073741510 (0xC000013A: interrupted by Ctrl+C)
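To see where a run like the one above goes wrong, it can help to pull the first step whose total loss is reported as NaN out of the log, together with the learning rate at that point. The following is a minimal sketch, not part of PaddleOCR: the function names and the regular expression are illustrative, and it assumes one log entry per line in the `global_step: ..., lr: ..., loss: ...` layout shown in the log.

```python
import math
import re

# Matches the per-step training entries, e.g.
# "global_step: 170, lr: 0.000222, ..., loss: nanxxx, ..."
# The lowercase "loss:" field is the total loss; "CTCLoss:" and "NRTRLoss:"
# do not match because the pattern is case-sensitive.
STEP_RE = re.compile(r"global_step: (\d+), lr: ([0-9.eE+-]+),.*?\bloss: (\S+?),")


def parse_loss(token):
    """Convert a logged loss to float; tokens such as 'nanxxx' count as NaN."""
    if token.lower().startswith("nan"):
        return math.nan
    return float(token)


def first_nan_step(log_text):
    """Return (global_step, lr) of the first entry with a NaN loss, or None."""
    for match in STEP_RE.finditer(log_text):
        step = int(match.group(1))
        lr = float(match.group(2))
        if math.isnan(parse_loss(match.group(3))):
            return step, lr
    return None


# Two entries copied from the log, with the timing fields trimmed.
SAMPLE = """\
epoch: [3/50], global_step: 160, lr: 0.000208, acc: 0.229167, CTCLoss: 4.979708, NRTRLoss: 0.000000, loss: 4.993014, avg_reader_cost: 0.00000 s
epoch: [3/50], global_step: 170, lr: 0.000222, acc: 0.171875, CTCLoss: nanxxx, NRTRLoss: nanxxx, loss: nanxxx, avg_reader_cost: 0.00010 s
"""

print(first_nan_step(SAMPLE))  # (170, 0.000222)
```

In this run the loss turns NaN by step 170, while the learning rate is still in warmup (0.000222, below the configured 0.0005). Because the log prints every 10 steps (`print_batch_step : 10`) with smoothing (`log_smooth_window : 20`), the first bad batch most likely falls somewhere between steps 161 and 170 rather than exactly at 170.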
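Roughly how many training samples do you have, and how many GPUs are you training on? You could try reducing the loss by another factor of 10 (presumably this refers to the learning rate).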
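If the suggestion above is read as lowering the learning rate tenfold (an interpretation, since the comment literally says "loss"), one way to try it without editing the YAML is another `-o` override next to the `Global.pretrained_model` one already used in the command. The sketch below only builds the override string; `lr_override` is a hypothetical helper, and the key path `Optimizer.lr.learning_rate` is taken from the config dump in the log.

```python
def lr_override(base_lr, factor=10.0):
    """Build a `-o` override that divides the learning rate by `factor`."""
    new_lr = base_lr / factor
    # Plain decimal notation with trailing zeros trimmed
    # (0.0000500000 -> 0.00005), so the value is unambiguous when parsed.
    text = f"{new_lr:.10f}".rstrip("0")
    if text.endswith("."):
        text += "0"
    return f"Optimizer.lr.learning_rate={text}"


# 0.0005 is the learning_rate shown in the config dump of the failing run.
print(lr_override(0.0005))  # Optimizer.lr.learning_rate=0.00005
```

The resulting string would be appended after `-o` in the training command, alongside the existing `Global.pretrained_model=...` entry.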
Hello, I'm training on a single GPU with a dataset of 6,000 images, and the images look like this. Training works fine with v3, but not with v4.
The text is quite short, and some of the augmentation strategies in V4 may not be suitable for very short text. You can try turning them off:
Delete the sampler: https://github.com/PaddlePaddle/PaddleOCR/blob/0525f6bb01bfed401f767894619f6a25ee750892/configs/rec/PP-OCRv4/en_PP-OCRv4_rec.yml#L102-L115
Change: https://github.com/PaddlePaddle/PaddleOCR/blob/0525f6bb01bfed401f767894619f6a25ee750892/configs/rec/PP-OCRv4/en_PP-OCRv4_rec.yml#L73-L78
to: https://github.com/PaddlePaddle/PaddleOCR/blob/0525f6bb01bfed401f767894619f6a25ee750892/configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml#L79-L83
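The two changes suggested above can be sketched as a transformation on the loaded config dictionary. This is an illustration rather than PaddleOCR code: `disable_multiscale` is a hypothetical helper, and the details (dropping `ds_width`, adding a `RecResizeImg` step before `KeepKeys`) are assumptions modelled on the modified config the reporter posts in the follow-up, not on the linked lines themselves.

```python
import copy


def disable_multiscale(cfg, image_shape=(3, 48, 96)):
    """Return a copy of `cfg` with multi-scale training switched off."""
    cfg = copy.deepcopy(cfg)  # leave the caller's config untouched
    train = cfg["Train"]

    # 1. Delete the sampler block (MultiScaleSampler).
    train.pop("sampler", None)

    # 2. Switch the dataset to the v3-style SimpleDataSet.
    dataset = train["dataset"]
    dataset["name"] = "SimpleDataSet"
    dataset.pop("ds_width", None)  # assumption: only used by MultiScaleDataSet

    # Assumption: without multi-scale loading, images need an explicit
    # resize step, placed just before KeepKeys as in the reporter's config.
    transforms = dataset["transforms"]
    if not any("RecResizeImg" in t for t in transforms):
        idx = next(
            (i for i, t in enumerate(transforms) if "KeepKeys" in t),
            len(transforms),
        )
        transforms.insert(idx, {"RecResizeImg": {"image_shape": list(image_shape)}})
    return cfg


# Trimmed-down version of the Train section from the config dump.
original = {
    "Train": {
        "dataset": {
            "name": "MultiScaleDataSet",
            "ds_width": False,
            "transforms": [
                {"DecodeImage": {"img_mode": "BGR"}},
                {"KeepKeys": {"keep_keys": ["image", "label_ctc"]}},
            ],
        },
        "sampler": {"name": "MultiScaleSampler"},
        "loader": {"batch_size_per_card": 96},
    }
}
updated = disable_multiscale(original)
print(updated["Train"]["dataset"]["name"])  # SimpleDataSet
```

Working on a deep copy keeps the original dictionary available for comparison, which is convenient when checking exactly what changed between the v4 default and the modified config.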
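Thanks for the reply. Following your suggestion, I removed the sampler and changed the dataset in the config file to SimpleDataSet, but after a few training steps acc still drops to 0 and the loss becomes nanxx. Is there anything else that needs to be adjusted?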
Global:
  debug: false
  use_gpu: true
  epoch_num: 50
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v4
  save_epoch_step: 10
  eval_batch_step:
  - 0
  - 2000
  cal_metric_during_train: true
  pretrained_model: refactor
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: &max_text_length 10
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform: null
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
    - CTCHead:
        Neck:
          name: svtr
          dims: 120
          depth: 2
          hidden_dims: 120
          kernel_size:
          - 1
          - 3
          use_guide: true
        Head:
          fc_decay: 1.0e-05
    - NRTRHead:
        nrtr_dim: 384
        max_text_length: *max_text_length
Loss:
  name: MultiLoss
  loss_config_list:
  - CTCLoss: null
  - NRTRLoss: null
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: false
Train:
  dataset:
    name: SimpleDataSet
    data_dir: D:\PYTHON\pictures
    ext_op_transform_idx: 1
    label_file_list:
    - D:\PYTHON\pictures\img\rec_train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape:
        - 48
        - 96
        - 3
        max_text_length: *max_text_length
    - RecAug: null
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [ 3, 48, 96 ]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 16
    drop_last: true
    num_workers: 1
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: D:\PYTHON\pictures
    label_file_list:
    - D:\PYTHON\pictures\img\rec_eval.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape:
        - 3
        - 48
        - 96
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 16
    num_workers: 1
profiler_options: null