
No matter what I do, I get acc: 0.000000

Open MohieEldinMuhammad opened this issue 2 years ago • 9 comments

This is my config.yml:

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 5
  save_model_dir: ./output/v3_arabic_mobile
  save_epoch_step: 1
  eval_batch_step: [0, 100]
  cal_metric_during_train: true
  pretrained_model:
  checkpoints: arabic_PP-OCRv3_rec_train/best_accuracy
  save_inference_dir:
  use_visualdl: false
  infer_img:
  character_dict_path: ppocr/utils/dict/arabic_dict.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3_arabic.txt


Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05


Architecture:
  model_type: rec
  algorithm: SVTR
  Transform:
  Backbone:
    name: MobileNetV1Enhance
    scale: 0.5
    last_conv_stride: [1, 2]
    last_pool_type: avg
  Head:
    name: MultiHead
    head_list:
      - CTCHead:
          Neck:
            name: svtr
            dims: 64
            depth: 2
            hidden_dims: 120
            use_guide: True
          Head:
            fc_decay: 0.00001
      - SARHead:
          enc_dim: 512
          max_text_length: *max_text_length

Loss:
  name: MultiLoss
  loss_config_list:
    - CTCLoss:
    - SARLoss:

PostProcess:  
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: False

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./
    ext_op_transform_idx: 1
    label_file_list: ./train_data/rec_gt_train.txt
    
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
    - RecAug:
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: true
    batch_size_per_card: 8
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./
    label_file_list: ./train_data/rec_gt_test.txt
    
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_sar
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 8
    num_workers: 4

This is a sample from the data: [image]

The training set is 800 images and the test set is 200 images. I'm getting 0 acc. I even added the same images to both train and test, to overfit and watch the accuracy increase, and it still gave me 0 acc.
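For reference, SimpleDataSet reads each line of the label file as an image path and its transcription separated by a tab, with the path resolved relative to data_dir. A quick sanity check along these lines (a minimal sketch using the paths from the config above) can rule out label-format or path problems:

```python
import os

# Paths taken from Train.dataset in the config above; adjust if they differ.
data_dir = "./"
label_file = "./train_data/rec_gt_train.txt"

bad_lines, missing_images = 0, 0
with open(label_file, "r", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or not parts[1]:
            bad_lines += 1          # line is not "image_path<TAB>label"
            continue
        if not os.path.exists(os.path.join(data_dir, parts[0])):
            missing_images += 1     # path does not resolve relative to data_dir

print(f"malformed lines: {bad_lines}, missing images: {missing_images}")
```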

[2022/10/21 10:14:42] ppocr INFO:     Backbone : 
[2022/10/21 10:14:42] ppocr INFO:         last_conv_stride : [1, 2]
[2022/10/21 10:14:42] ppocr INFO:         last_pool_type : avg
[2022/10/21 10:14:42] ppocr INFO:         name : MobileNetV1Enhance
[2022/10/21 10:14:42] ppocr INFO:         scale : 0.5
[2022/10/21 10:14:42] ppocr INFO:     Head : 
[2022/10/21 10:14:42] ppocr INFO:         head_list : 
[2022/10/21 10:14:42] ppocr INFO:             CTCHead : 
[2022/10/21 10:14:42] ppocr INFO:                 Head : 
[2022/10/21 10:14:42] ppocr INFO:                     fc_decay : 1e-05
[2022/10/21 10:14:42] ppocr INFO:                 Neck : 
[2022/10/21 10:14:42] ppocr INFO:                     depth : 2
[2022/10/21 10:14:42] ppocr INFO:                     dims : 64
[2022/10/21 10:14:42] ppocr INFO:                     hidden_dims : 120
[2022/10/21 10:14:42] ppocr INFO:                     name : svtr
[2022/10/21 10:14:42] ppocr INFO:                     use_guide : True
[2022/10/21 10:14:42] ppocr INFO:             SARHead : 
[2022/10/21 10:14:42] ppocr INFO:                 enc_dim : 512
[2022/10/21 10:14:42] ppocr INFO:                 max_text_length : 100
[2022/10/21 10:14:42] ppocr INFO:         name : MultiHead
[2022/10/21 10:14:42] ppocr INFO:     Transform : None
[2022/10/21 10:14:42] ppocr INFO:     algorithm : SVTR
[2022/10/21 10:14:42] ppocr INFO:     model_type : rec
[2022/10/21 10:14:42] ppocr INFO: Eval : 
[2022/10/21 10:14:42] ppocr INFO:     dataset : 
[2022/10/21 10:14:42] ppocr INFO:         data_dir : ./train_data
[2022/10/21 10:14:42] ppocr INFO:         label_file_list : ./train_data/rec_gt_test.txt
[2022/10/21 10:14:42] ppocr INFO:         name : SimpleDataSet
[2022/10/21 10:14:42] ppocr INFO:         transforms : 
[2022/10/21 10:14:42] ppocr INFO:             DecodeImage : 
[2022/10/21 10:14:42] ppocr INFO:                 channel_first : False
[2022/10/21 10:14:42] ppocr INFO:                 img_mode : BGR
[2022/10/21 10:14:42] ppocr INFO:             MultiLabelEncode : None
[2022/10/21 10:14:42] ppocr INFO:             RecResizeImg : 
[2022/10/21 10:14:42] ppocr INFO:                 image_shape : [3, 48, 320]
[2022/10/21 10:14:42] ppocr INFO:             KeepKeys : 
[2022/10/21 10:14:42] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2022/10/21 10:14:42] ppocr INFO:     loader : 
[2022/10/21 10:14:42] ppocr INFO:         batch_size_per_card : 2
[2022/10/21 10:14:42] ppocr INFO:         drop_last : False
[2022/10/21 10:14:42] ppocr INFO:         num_workers : 4
[2022/10/21 10:14:42] ppocr INFO:         shuffle : False
[2022/10/21 10:14:42] ppocr INFO: Global : 
[2022/10/21 10:14:42] ppocr INFO:     cal_metric_during_train : True
[2022/10/21 10:14:42] ppocr INFO:     character_dict_path : ppocr/utils/dict/arabic_dict.txt
[2022/10/21 10:14:42] ppocr INFO:     checkpoints : arabic_PP-OCRv3_rec_train/best_accuracy
[2022/10/21 10:14:42] ppocr INFO:     debug : True
[2022/10/21 10:14:42] ppocr INFO:     distributed : False
[2022/10/21 10:14:42] ppocr INFO:     epoch_num : 200
[2022/10/21 10:14:42] ppocr INFO:     eval_batch_step : [0, 100]
[2022/10/21 10:14:42] ppocr INFO:     infer_img : None
[2022/10/21 10:14:42] ppocr INFO:     infer_mode : False
[2022/10/21 10:14:42] ppocr INFO:     log_smooth_window : 20
[2022/10/21 10:14:42] ppocr INFO:     max_text_length : 100
[2022/10/21 10:14:42] ppocr INFO:     pretrained_model : None
[2022/10/21 10:14:42] ppocr INFO:     print_batch_step : 1
[2022/10/21 10:14:42] ppocr INFO:     save_epoch_step : 1
[2022/10/21 10:14:42] ppocr INFO:     save_inference_dir : None
[2022/10/21 10:14:42] ppocr INFO:     save_model_dir : ./output/v3_arabic_mobile
[2022/10/21 10:14:42] ppocr INFO:     save_res_path : ./output/rec/predicts_ppocrv3_arabic.txt
[2022/10/21 10:14:42] ppocr INFO:     use_gpu : True
[2022/10/21 10:14:42] ppocr INFO:     use_space_char : True
[2022/10/21 10:14:42] ppocr INFO:     use_visualdl : False
[2022/10/21 10:14:42] ppocr INFO: Loss : 
[2022/10/21 10:14:42] ppocr INFO:     loss_config_list : 
[2022/10/21 10:14:42] ppocr INFO:         CTCLoss : None
[2022/10/21 10:14:42] ppocr INFO:         SARLoss : None
[2022/10/21 10:14:42] ppocr INFO:     name : MultiLoss
[2022/10/21 10:14:42] ppocr INFO: Metric : 
[2022/10/21 10:14:42] ppocr INFO:     ignore_space : False
[2022/10/21 10:14:42] ppocr INFO:     main_indicator : acc
[2022/10/21 10:14:42] ppocr INFO:     name : RecMetric
[2022/10/21 10:14:42] ppocr INFO: Optimizer : 
[2022/10/21 10:14:42] ppocr INFO:     beta1 : 0.9
[2022/10/21 10:14:42] ppocr INFO:     beta2 : 0.999
[2022/10/21 10:14:42] ppocr INFO:     lr : 
[2022/10/21 10:14:42] ppocr INFO:         learning_rate : 0.001
[2022/10/21 10:14:42] ppocr INFO:         name : Cosine
[2022/10/21 10:14:42] ppocr INFO:         warmup_epoch : 5
[2022/10/21 10:14:42] ppocr INFO:     name : Adam
[2022/10/21 10:14:42] ppocr INFO:     regularizer : 
[2022/10/21 10:14:42] ppocr INFO:         factor : 3e-05
[2022/10/21 10:14:42] ppocr INFO:         name : L2
[2022/10/21 10:14:42] ppocr INFO: PostProcess : 
[2022/10/21 10:14:42] ppocr INFO:     name : CTCLabelDecode
[2022/10/21 10:14:42] ppocr INFO: Train : 
[2022/10/21 10:14:42] ppocr INFO:     dataset : 
[2022/10/21 10:14:42] ppocr INFO:         data_dir : ./train_data
[2022/10/21 10:14:42] ppocr INFO:         ext_op_transform_idx : 1
[2022/10/21 10:14:42] ppocr INFO:         label_file_list : ./train_data/rec_gt_train.txt
[2022/10/21 10:14:42] ppocr INFO:         name : SimpleDataSet
[2022/10/21 10:14:42] ppocr INFO:         transforms : 
[2022/10/21 10:14:42] ppocr INFO:             DecodeImage : 
[2022/10/21 10:14:42] ppocr INFO:                 channel_first : False
[2022/10/21 10:14:42] ppocr INFO:                 img_mode : BGR
[2022/10/21 10:14:42] ppocr INFO:             RecConAug : 
[2022/10/21 10:14:42] ppocr INFO:                 ext_data_num : 2
[2022/10/21 10:14:42] ppocr INFO:                 image_shape : [48, 320, 3]
[2022/10/21 10:14:42] ppocr INFO:                 prob : 0.5
[2022/10/21 10:14:42] ppocr INFO:             RecAug : None
[2022/10/21 10:14:42] ppocr INFO:             MultiLabelEncode : None
[2022/10/21 10:14:42] ppocr INFO:             RecResizeImg : 
[2022/10/21 10:14:42] ppocr INFO:                 image_shape : [3, 48, 320]
[2022/10/21 10:14:42] ppocr INFO:             KeepKeys : 
[2022/10/21 10:14:42] ppocr INFO:                 keep_keys : ['image', 'label_ctc', 'label_sar', 'length', 'valid_ratio']
[2022/10/21 10:14:42] ppocr INFO:     loader : 
[2022/10/21 10:14:42] ppocr INFO:         batch_size_per_card : 2
[2022/10/21 10:14:42] ppocr INFO:         drop_last : True
[2022/10/21 10:14:42] ppocr INFO:         num_workers : 4
[2022/10/21 10:14:42] ppocr INFO:         shuffle : True
[2022/10/21 10:14:42] ppocr INFO: profiler_options : None
[2022/10/21 10:14:42] ppocr INFO: train with paddle 2.3.2 and device Place(gpu:0)
[2022/10/21 10:14:42] ppocr INFO: Initialize indexs of datasets:./train_data/rec_gt_train.txt
[2022/10/21 10:14:42] ppocr INFO: Initialize indexs of datasets:./train_data/rec_gt_test.txt
W1021 10:14:42.292580  5755 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.2, Runtime API Version: 10.2
W1021 10:14:42.299518  5755 gpu_resources.cc:91] device: 0, cuDNN Version: 8.1.
[2022/10/21 10:14:43] ppocr INFO: train dataloader has 2 iters
[2022/10/21 10:14:43] ppocr INFO: valid dataloader has 2 iters
[2022/10/21 10:14:43] ppocr INFO: resume from arabic_PP-OCRv3_rec_train/best_accuracy
[2022/10/21 10:14:43] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 100 iterations
[2022/10/21 10:14:46] ppocr INFO: epoch: [98/200], global_step: 1, lr: 0.000018, acc: 0.000000, norm_edit_dis: 0.183337, CTCLoss: 394.141571, SARLoss: 6.391978, loss: 400.533539, avg_reader_cost: 0.22181 s, avg_batch_cost: 2.87783 s, avg_samples: 2.0, ips: 0.69497 samples/s, eta: 0:09:49
[2022/10/21 10:14:46] ppocr INFO: epoch: [98/200], global_step: 2, lr: 0.000483, acc: 0.000000, norm_edit_dis: 0.129766, CTCLoss: 335.271667, SARLoss: 6.686296, loss: 341.957947, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.10519 s, avg_samples: 2.0, ips: 19.01316 samples/s, eta: 0:05:04
[2022/10/21 10:14:47] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:14:47] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_98
[2022/10/21 10:14:49] ppocr INFO: epoch: [99/200], global_step: 3, lr: 0.000949, acc: 0.000000, norm_edit_dis: 0.183337, CTCLoss: 276.401733, SARLoss: 6.391978, loss: 283.382355, avg_reader_cost: 2.92971 s, avg_batch_cost: 3.04973 s, avg_samples: 2.0, ips: 0.65580 samples/s, eta: 0:06:48
[2022/10/21 10:14:49] ppocr INFO: epoch: [99/200], global_step: 4, lr: 0.000950, acc: 0.000000, norm_edit_dis: 0.143628, CTCLoss: 335.271667, SARLoss: 6.125015, loss: 341.957947, avg_reader_cost: 0.00012 s, avg_batch_cost: 0.10430 s, avg_samples: 2.0, ips: 19.17620 samples/s, eta: 0:05:09
[2022/10/21 10:14:50] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:14:50] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_99
[2022/10/21 10:14:50] ppocr INFO: epoch: [100/200], global_step: 5, lr: 0.000951, acc: 0.000000, norm_edit_dis: 0.106065, CTCLoss: 276.401733, SARLoss: 5.858052, loss: 283.382355, avg_reader_cost: 1.07871 s, avg_batch_cost: 1.18832 s, avg_samples: 2.0, ips: 1.68305 samples/s, eta: 0:04:54
[2022/10/21 10:14:50] ppocr INFO: epoch: [100/200], global_step: 6, lr: 0.000952, acc: 0.000000, norm_edit_dis: 0.142429, CTCLoss: 268.454895, SARLoss: 5.320616, loss: 274.336761, avg_reader_cost: 0.01492 s, avg_batch_cost: 0.10398 s, avg_samples: 2.0, ips: 19.23419 samples/s, eta: 0:04:07
[2022/10/21 10:14:51] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:14:51] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_100
[2022/10/21 10:14:52] ppocr INFO: epoch: [101/200], global_step: 7, lr: 0.000952, acc: 0.000000, norm_edit_dis: 0.178792, CTCLoss: 260.507996, SARLoss: 4.783182, loss: 265.291168, avg_reader_cost: 1.34266 s, avg_batch_cost: 1.44850 s, avg_samples: 2.0, ips: 1.38074 samples/s, eta: 0:04:12
[2022/10/21 10:14:52] ppocr INFO: epoch: [101/200], global_step: 8, lr: 0.000953, acc: 0.000000, norm_edit_dis: 0.181065, CTCLoss: 217.588837, SARLoss: 4.735386, loss: 221.863815, avg_reader_cost: 0.00070 s, avg_batch_cost: 0.08536 s, avg_samples: 2.0, ips: 23.42912 samples/s, eta: 0:03:41
[2022/10/21 10:14:53] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:14:53] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_101
[2022/10/21 10:14:53] ppocr INFO: epoch: [102/200], global_step: 9, lr: 0.000954, acc: 0.000000, norm_edit_dis: 0.183337, CTCLoss: 174.669678, SARLoss: 4.687589, loss: 178.436462, avg_reader_cost: 1.16604 s, avg_batch_cost: 1.26441 s, avg_samples: 2.0, ips: 1.58176 samples/s, eta: 0:03:43
[2022/10/21 10:14:53] ppocr INFO: epoch: [102/200], global_step: 10, lr: 0.000955, acc: 0.000000, norm_edit_dis: 0.186700, CTCLoss: 217.588837, SARLoss: 4.361302, loss: 221.863815, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.08058 s, avg_samples: 2.0, ips: 24.82050 samples/s, eta: 0:03:22
[2022/10/21 10:14:54] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:14:56] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_102
[2022/10/21 10:14:56] ppocr INFO: epoch: [103/200], global_step: 11, lr: 0.000956, acc: 0.000000, norm_edit_dis: 0.183337, CTCLoss: 198.927856, SARLoss: 4.035015, loss: 201.810928, avg_reader_cost: 2.50583 s, avg_batch_cost: 2.62582 s, avg_samples: 2.0, ips: 0.76167 samples/s, eta: 0:03:49
[2022/10/21 10:14:56] ppocr INFO: epoch: [103/200], global_step: 12, lr: 0.000957, acc: 0.000000, norm_edit_dis: 0.186700, CTCLoss: 186.798767, SARLoss: 3.900904, loss: 190.123672, avg_reader_cost: 0.00020 s, avg_batch_cost: 0.10338 s, avg_samples: 2.0, ips: 19.34553 samples/s, eta: 0:03:30
[2022/10/21 10:14:57] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:14:57] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_103
[2022/10/21 10:14:59] ppocr INFO: epoch: [104/200], global_step: 13, lr: 0.000957, acc: 0.000000, norm_edit_dis: 0.190063, CTCLoss: 174.669678, SARLoss: 3.766792, loss: 178.436462, avg_reader_cost: 2.35733 s, avg_batch_cost: 2.47507 s, avg_samples: 2.0, ips: 0.80806 samples/s, eta: 0:03:50
[2022/10/21 10:14:59] ppocr INFO: epoch: [104/200], global_step: 14, lr: 0.000958, acc: 0.000000, norm_edit_dis: 0.190271, CTCLoss: 157.876434, SARLoss: 3.673026, loss: 162.103638, avg_reader_cost: 0.00013 s, avg_batch_cost: 0.10350 s, avg_samples: 2.0, ips: 19.32343 samples/s, eta: 0:03:34
[2022/10/21 10:14:59] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:00] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_104
[2022/10/21 10:15:01] ppocr INFO: epoch: [105/200], global_step: 15, lr: 0.000959, acc: 0.000000, norm_edit_dis: 0.190480, CTCLoss: 141.083206, SARLoss: 3.579259, loss: 145.770798, avg_reader_cost: 1.74843 s, avg_batch_cost: 1.89828 s, avg_samples: 2.0, ips: 1.05358 samples/s, eta: 0:03:43
[2022/10/21 10:15:02] ppocr INFO: epoch: [105/200], global_step: 16, lr: 0.000960, acc: 0.000000, norm_edit_dis: 0.191075, CTCLoss: 120.322723, SARLoss: 3.520793, loss: 124.007881, avg_reader_cost: 1.66592 s, avg_batch_cost: 1.77326 s, avg_samples: 2.0, ips: 1.12786 samples/s, eta: 0:03:49
[2022/10/21 10:15:03] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:03] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_105
[2022/10/21 10:15:04] ppocr INFO: epoch: [106/200], global_step: 17, lr: 0.000960, acc: 0.000000, norm_edit_dis: 0.191671, CTCLoss: 99.562241, SARLoss: 3.462327, loss: 102.244965, avg_reader_cost: 1.47188 s, avg_batch_cost: 1.59000 s, avg_samples: 2.0, ips: 1.25786 samples/s, eta: 0:03:52
[2022/10/21 10:15:04] ppocr INFO: epoch: [106/200], global_step: 18, lr: 0.000961, acc: 0.000000, norm_edit_dis: 0.192763, CTCLoss: 94.925621, SARLoss: 3.398798, loss: 97.778091, avg_reader_cost: 0.00013 s, avg_batch_cost: 0.10358 s, avg_samples: 2.0, ips: 19.30804 samples/s, eta: 0:03:39
[2022/10/21 10:15:05] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:05] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_106
[2022/10/21 10:15:06] ppocr INFO: epoch: [107/200], global_step: 19, lr: 0.000962, acc: 0.000000, norm_edit_dis: 0.193854, CTCLoss: 90.289001, SARLoss: 3.335268, loss: 93.311218, avg_reader_cost: 1.52029 s, avg_batch_cost: 1.63275 s, avg_samples: 2.0, ips: 1.22492 samples/s, eta: 0:03:42
[2022/10/21 10:15:06] ppocr INFO: epoch: [107/200], global_step: 20, lr: 0.000963, acc: 0.000000, norm_edit_dis: 0.198852, CTCLoss: 84.507553, SARLoss: 3.307106, loss: 87.446266, avg_reader_cost: 0.16469 s, avg_batch_cost: 0.26266 s, avg_samples: 2.0, ips: 7.61447 samples/s, eta: 0:03:32
[2022/10/21 10:15:06] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:07] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_107
[2022/10/21 10:15:08] ppocr INFO: epoch: [108/200], global_step: 21, lr: 0.000964, acc: 0.000000, norm_edit_dis: 0.205870, CTCLoss: 72.377457, SARLoss: 3.150580, loss: 75.053223, avg_reader_cost: 1.50790 s, avg_batch_cost: 1.62894 s, avg_samples: 2.0, ips: 1.22779 samples/s, eta: 0:03:35
[2022/10/21 10:15:08] ppocr INFO: epoch: [108/200], global_step: 22, lr: 0.000966, acc: 0.000000, norm_edit_dis: 0.208745, CTCLoss: 64.772110, SARLoss: 2.952643, loss: 67.659752, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.10482 s, avg_samples: 2.0, ips: 19.07989 samples/s, eta: 0:03:25
[2022/10/21 10:15:08] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:09] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_108
[2022/10/21 10:15:09] ppocr INFO: epoch: [109/200], global_step: 23, lr: 0.000967, acc: 0.000000, norm_edit_dis: 0.211620, CTCLoss: 64.772110, SARLoss: 2.869136, loss: 67.659752, avg_reader_cost: 1.64637 s, avg_batch_cost: 1.75551 s, avg_samples: 2.0, ips: 1.13927 samples/s, eta: 0:03:29
[2022/10/21 10:15:10] ppocr INFO: epoch: [109/200], global_step: 24, lr: 0.000968, acc: 0.000000, norm_edit_dis: 0.219844, CTCLoss: 64.772110, SARLoss: 2.828163, loss: 67.659752, avg_reader_cost: 0.00015 s, avg_batch_cost: 0.09609 s, avg_samples: 2.0, ips: 20.81365 samples/s, eta: 0:03:20
[2022/10/21 10:15:10] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:11] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_109
[2022/10/21 10:15:11] ppocr INFO: epoch: [110/200], global_step: 25, lr: 0.000970, acc: 0.000000, norm_edit_dis: 0.228815, CTCLoss: 61.830845, SARLoss: 2.769616, loss: 65.259949, avg_reader_cost: 1.56045 s, avg_batch_cost: 1.66414 s, avg_samples: 2.0, ips: 1.20182 samples/s, eta: 0:03:23
[2022/10/21 10:15:11] ppocr INFO: epoch: [110/200], global_step: 26, lr: 0.000971, acc: 0.000000, norm_edit_dis: 0.238916, CTCLoss: 64.772110, SARLoss: 2.710417, loss: 67.659752, avg_reader_cost: 0.04279 s, avg_batch_cost: 0.13024 s, avg_samples: 2.0, ips: 15.35599 samples/s, eta: 0:03:15
[2022/10/21 10:15:12] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:12] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_110
[2022/10/21 10:15:13] ppocr INFO: epoch: [111/200], global_step: 27, lr: 0.000972, acc: 0.000000, norm_edit_dis: 0.238029, CTCLoss: 66.473892, SARLoss: 2.664080, loss: 68.694649, avg_reader_cost: 1.04785 s, avg_batch_cost: 1.17008 s, avg_samples: 2.0, ips: 1.70929 samples/s, eta: 0:03:15
[2022/10/21 10:15:13] ppocr INFO: epoch: [111/200], global_step: 28, lr: 0.000974, acc: 0.000000, norm_edit_dis: 0.238029, CTCLoss: 64.772110, SARLoss: 2.635515, loss: 67.659752, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.09467 s, avg_samples: 2.0, ips: 21.12696 samples/s, eta: 0:03:07
[2022/10/21 10:15:13] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:14] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_111
[2022/10/21 10:15:14] ppocr INFO: epoch: [112/200], global_step: 29, lr: 0.000975, acc: 0.000000, norm_edit_dis: 0.238029, CTCLoss: 65.063431, SARLoss: 2.583038, loss: 67.659752, avg_reader_cost: 1.22625 s, avg_batch_cost: 1.33414 s, avg_samples: 2.0, ips: 1.49910 samples/s, eta: 0:03:08
[2022/10/21 10:15:15] ppocr INFO: epoch: [112/200], global_step: 30, lr: 0.000976, acc: 0.000000, norm_edit_dis: 0.238029, CTCLoss: 63.806744, SARLoss: 2.518409, loss: 66.475739, avg_reader_cost: 0.84602 s, avg_batch_cost: 0.95220 s, avg_samples: 2.0, ips: 2.10041 samples/s, eta: 0:03:06
[2022/10/21 10:15:15] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:16] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_112
[2022/10/21 10:15:16] ppocr INFO: epoch: [113/200], global_step: 31, lr: 0.000977, acc: 0.000000, norm_edit_dis: 0.245363, CTCLoss: 63.806744, SARLoss: 2.362904, loss: 66.475739, avg_reader_cost: 1.16328 s, avg_batch_cost: 1.28003 s, avg_samples: 2.0, ips: 1.56246 samples/s, eta: 0:03:06
[2022/10/21 10:15:16] ppocr INFO: epoch: [113/200], global_step: 32, lr: 0.000978, acc: 0.000000, norm_edit_dis: 0.252248, CTCLoss: 59.601883, SARLoss: 2.191519, loss: 61.891373, avg_reader_cost: 0.00016 s, avg_batch_cost: 0.10387 s, avg_samples: 2.0, ips: 19.25521 samples/s, eta: 0:03:00
[2022/10/21 10:15:17] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:17] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_113
[2022/10/21 10:15:18] ppocr INFO: epoch: [114/200], global_step: 33, lr: 0.000980, acc: 0.000000, norm_edit_dis: 0.245363, CTCLoss: 63.806744, SARLoss: 2.106306, loss: 66.475739, avg_reader_cost: 1.15275 s, avg_batch_cost: 1.26946 s, avg_samples: 2.0, ips: 1.57547 samples/s, eta: 0:03:00
[2022/10/21 10:15:20] ppocr INFO: epoch: [114/200], global_step: 34, lr: 0.000981, acc: 0.000000, norm_edit_dis: 0.252248, CTCLoss: 59.601883, SARLoss: 2.054793, loss: 61.891373, avg_reader_cost: 2.01648 s, avg_batch_cost: 2.12311 s, avg_samples: 2.0, ips: 0.94202 samples/s, eta: 0:03:05
[2022/10/21 10:15:20] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:21] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_114
[2022/10/21 10:15:22] ppocr INFO: epoch: [115/200], global_step: 35, lr: 0.000982, acc: 0.000000, norm_edit_dis: 0.252248, CTCLoss: 62.577076, SARLoss: 2.044794, loss: 64.666428, avg_reader_cost: 1.77667 s, avg_batch_cost: 1.89515 s, avg_samples: 2.0, ips: 1.05533 samples/s, eta: 0:03:07
[2022/10/21 10:15:23] ppocr INFO: epoch: [115/200], global_step: 36, lr: 0.000983, acc: 0.000000, norm_edit_dis: 0.259619, CTCLoss: 62.868393, SARLoss: 1.998415, loss: 64.666428, avg_reader_cost: 1.58293 s, avg_batch_cost: 1.68790 s, avg_samples: 2.0, ips: 1.18490 samples/s, eta: 0:03:09
[2022/10/21 10:15:24] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:24] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_115
[2022/10/21 10:15:25] ppocr INFO: epoch: [116/200], global_step: 37, lr: 0.000984, acc: 0.000000, norm_edit_dis: 0.276331, CTCLoss: 62.868393, SARLoss: 1.951480, loss: 64.666428, avg_reader_cost: 1.33913 s, avg_batch_cost: 1.45781 s, avg_samples: 2.0, ips: 1.37192 samples/s, eta: 0:03:10
[2022/10/21 10:15:25] ppocr INFO: epoch: [116/200], global_step: 38, lr: 0.000985, acc: 0.000000, norm_edit_dis: 0.259619, CTCLoss: 61.491783, SARLoss: 1.941236, loss: 63.062988, avg_reader_cost: 0.00014 s, avg_batch_cost: 0.10557 s, avg_samples: 2.0, ips: 18.94412 samples/s, eta: 0:03:04
[2022/10/21 10:15:26] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:26] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_116
[2022/10/21 10:15:27] ppocr INFO: epoch: [117/200], global_step: 39, lr: 0.000986, acc: 0.000000, norm_edit_dis: 0.259619, CTCLoss: 61.491783, SARLoss: 1.807949, loss: 63.062988, avg_reader_cost: 1.76529 s, avg_batch_cost: 1.87486 s, avg_samples: 2.0, ips: 1.06675 samples/s, eta: 0:03:06
[2022/10/21 10:15:27] ppocr INFO: epoch: [117/200], global_step: 40, lr: 0.000987, acc: 0.000000, norm_edit_dis: 0.259619, CTCLoss: 61.491783, SARLoss: 1.807949, loss: 63.062988, avg_reader_cost: 0.20065 s, avg_batch_cost: 0.30993 s, avg_samples: 2.0, ips: 6.45307 samples/s, eta: 0:03:02
[2022/10/21 10:15:28] ppocr INFO: save model in ./output/v3_arabic_mobile/latest
[2022/10/21 10:15:29] ppocr INFO: save model in ./output/v3_arabic_mobile/iter_epoch_117
[2022/10/21 10:15:29] ppocr INFO: epoch: [118/200], global_step: 41, lr: 0.000988, acc: 0.000000, norm_edit_dis: 0.252248, CTCLoss: 62.868393, SARLoss: 1.641998, loss: 64.666428, avg_reader_cost: 1.94799 s, avg_batch_cost: 2.08293 s, avg_samples: 2.0, ips: 0.96019 samples/s, eta: 0:03:05
[2022/10/21 10:15:29] ppocr INFO: epoch: [118/200], global_step: 42, lr: 0.000988, acc: 0.000000, norm_edit_dis: 0.252248, 

MohieEldinMuhammad (Oct 21 '22)

Any help?

MohieEldinMuhammad (Oct 24 '22)

Hi, you can try the following suggestions: 1) Check whether character_dict_path contains the same characters as your training labels. 2) Some of your pictures have long text; check whether the image size is too large (images will be scaled to 320*48). If you have any further questions, please contact us again.
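As an illustration of suggestion 1), a short script like the one below (a sketch; the file names are taken from the config in this issue) lists any label characters that are missing from the dictionary. Characters that are not in the dictionary can never be predicted, so labels containing them cannot be matched exactly.

```python
# Sketch: report label characters that are absent from the recognition dictionary.
dict_path = "ppocr/utils/dict/arabic_dict.txt"   # Global.character_dict_path
label_file = "./train_data/rec_gt_train.txt"     # Train.dataset.label_file_list

with open(dict_path, "r", encoding="utf-8") as f:
    charset = {line.rstrip("\n") for line in f if line.rstrip("\n")}
charset.add(" ")  # use_space_char: true adds the space character

missing = {}
with open(label_file, "r", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue
        for ch in parts[1]:
            if ch not in charset:
                missing[ch] = missing.get(ch, 0) + 1

for ch, count in sorted(missing.items(), key=lambda kv: -kv[1]):
    print(repr(ch), count)
print("distinct characters missing from the dictionary:", len(missing))
```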

an1018 (Oct 25 '22)

@an1018 Thanks for your help.

  1. The character_dict contains Arabic and English letters plus numbers and special characters. I used it with the same data, but split into separate words, and the accuracy increased normally, so the character_dict is good.
  2. Should I increase the max number of characters beyond 25?

mohiiieldin (Oct 25 '22)

1) The character_dict file should be consistent with the annotation information. 2) If the image is large, the text will be compressed when it is resized to 320*48, and the accuracy may not be good.

an1018 (Oct 25 '22)

1) The character_dict file should be consistent with the annotation information. 2) If the image is large, the text will be compressed when it is resized to 320*48, and the accuracy may not be good.

  1. I didn't get what you mean by annotation information.
  2. So can I increase the image size, or are there other solutions to this point?

mohiiieldin (Oct 25 '22)

@an1018

MohieEldinMuhammad (Oct 26 '22)

1) character_dict_path is the dictionary file, and label_file_list is the image annotation information; you need to ensure that every character that appears in the annotations also appears in the dictionary file.
2) For the long text, you can either decrease the length of the text, or, if you still use such long data for training and prediction, increase the shape of the training data accordingly, e.g. [3, 32, 640].
3) It's also possible that the dataset is too small; you can try increasing the number of images.

You can refer to #6632
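To decide between shortening the text and enlarging the input shape, it may help to measure the data first. The sketch below (assuming Pillow is installed; paths follow the config in this issue) reports label lengths and image aspect ratios so they can be compared against max_text_length: 25 and the [3, 48, 320] input shape (a width/height ratio of 320/48, roughly 6.7):

```python
# Sketch: summarise label lengths and image aspect ratios for the training set.
import os
from PIL import Image  # assumes Pillow is available

data_dir = "./"
label_file = "./train_data/rec_gt_train.txt"

lengths, ratios = [], []
with open(label_file, "r", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue
        path, text = parts
        lengths.append(len(text))
        img_path = os.path.join(data_dir, path)
        if os.path.exists(img_path):
            w, h = Image.open(img_path).size
            ratios.append(w / h)

print("max label length:", max(lengths),
      "- labels longer than 25 chars:", sum(n > 25 for n in lengths))
if ratios:
    print("max width/height ratio:", round(max(ratios), 2),
          "- images wider than 320/48:", sum(r > 320 / 48 for r in ratios))
```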

an1018 (Oct 27 '22)

@an1018 Can I change the input shape without needing to start the model training from scratch? Also, how many images would be a good starting point? If you run a small experiment with one image in train and the same image in test for n epochs, you will notice that the accuracy is zero as well!

mohiiieldin (Oct 27 '22)

I have also noticed that running !python tools/infer_rec.py -c arabic_PP-OCRv3_rec_train/config.yml on this image:

[image]

gives the same results as running the same image through the library:

from paddleocr import PaddleOCR

ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=True, unclip_ratio=3)
result = ocr.ocr(image_path)

[image]

Why is this happening? Shouldn't the best_accuracy model that we use for training be the same one used in production by the paddleocr package?
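For context, the PaddleOCR() constructor also accepts explicit model directories, so a fine-tuned recognizer can be used once its checkpoint has been exported to an inference model (e.g. with tools/export_model.py). A minimal sketch, where the model directory is an assumption:

```python
# Sketch: point the pip package at a locally exported recognition model instead of
# the default one it downloads. rec_model_dir below is a placeholder; it should
# contain the inference model exported from the fine-tuned checkpoint.
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    lang="arabic",
    use_angle_cls=True,
    show_log=False,
    rec_model_dir="./inference/v3_arabic_mobile",           # assumed export directory
    rec_char_dict_path="ppocr/utils/dict/arabic_dict.txt",  # same dict used for training
)
result = ocr.ocr("path/to/image.jpg")
```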

MohieEldinMuhammad (Oct 27 '22)