PaddleOCR icon indicating copy to clipboard operation
PaddleOCR copied to clipboard

训练英文手写数据集,acc一直很低(37%)

Open chenggm opened this issue 6 months ago • 0 comments

🔎 Search before asking

  • [X] I have searched the PaddleOCR Docs and found no similar bug report.
  • [X] I have searched the PaddleOCR Issues and found no similar bug report.
  • [X] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (问题描述)

数据集地址:https://aistudio.baidu.com/datasetdetail/128823 训练模型地址:https://paddleocr.bj.bcebos.com/PP-OCRv4/english/en_PP-OCRv4_rec_train.tar 配置文件:en_PP-OCRv4_rec.yml

Global:
  debug: false
  use_gpu: true
  epoch_num: 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v4
  save_epoch_step: 10
  eval_batch_step:
  - 0
  - 100
  cal_metric_during_train: true
  pretrained_model: ./pretrained_model/en_PP-OCRv4_rec_train/best_accuracy.pdparams
  checkpoints: 
  save_inference_dir: 
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform: null
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
    - CTCHead:
        Neck:
          name: svtr
          dims: 120
          depth: 2
          hidden_dims: 120
          kernel_size:
          - 1
          - 3
          use_guide: true
        Head:
          fc_decay: 1.0e-05
    - NRTRHead:
        nrtr_dim: 384
        max_text_length: 25
Loss:
  name: MultiLoss
  loss_config_list:
  - CTCLoss: null
  - NRTRLoss: null
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: false
Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: ./train_data/tal/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/tal/train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape:
        - 48
        - 320
        - 3
        max_text_length: 25
    - RecAug: null
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales:
    - - 320
      - 32
    - - 320
      - 48
    - - 320
      - 64
    first_bs: 96
    fix_bs: false
    divided_factor:
    - 8
    - 16
    is_training: true
  loader:
    shuffle: true
    batch_size_per_card: 96
    drop_last: true
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/tal
    label_file_list:
    - ./train_data/tal/test.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape:
        - 3
        - 48
        - 320
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
profiler_options: null

1724900692247

🏃‍♂️ Environment (运行环境)

GPU: V100 16G 4卡 python 3.10.14 paddle 2.6.1 cuDNN Version: 8.4.

🌰 Minimal Reproducible Example (最小可复现问题的Demo)

python3 -m paddle.distributed.launch --log_dir=./debug/ --devices=0,1,2,3,4,5,6,7 tools/train.py -c train_data/en_PP-CRv4_rec.yml

chenggm avatar Aug 29 '24 03:08 chenggm