0.00000 combineloss when using ch_PP-OCRv4_det_cml.yaml

Open Sundragon1993 opened this issue 1 year ago • 3 comments

Please provide the following information to quickly locate the problem

  • System Environment: Ubuntu 20.04
  • Version: Paddle: PaddleOCR: 2.7.0
  • Related components:
  • Command Code: Training phase
  • Complete Error Message: db_Student_loss_cbn: 0.000000, db_Student2_loss_cbn: 0.000000

We provide AceIssueSolver to help answer your question. Do you want it to answer? (Please write yes/no):

Dear team, I'm trying to train the CML config ch_PP-OCRv4_det_cml.yaml on my custom dataset, but the combined loss is always 0. Training works fine with ch_PP-OCRv4_det_teacher.yaml and ch_PP-OCRv4_det_student.yaml. Here is my config; I only modified the dataset path:

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 20
  save_model_dir: ./output/ch_PP-OCRv4-cml-v1
  save_epoch_step: 100
  eval_batch_step:
  - 0
  - 1000
  cal_metric_during_train: False
  checkpoints: null
  pretrained_model: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true
Architecture:
  name: DistillationModel
  algorithm: Distillation
  model_type: det
  Models:
    Student:
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.75
        det: True
        pretrained: false
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: true
      Head:
        name: DBHead
        k: 50
    Student2:
      pretrained: null
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.75
        det: True
        pretrained: true
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: true
      Head:
        name: DBHead
        k: 50
    Teacher:
      pretrained: /home/hoangdc/workspace/PaddleOCR/.paddleocr/models/teacher.pdparams
      freeze_params: true
      return_all_feats: false
      model_type: det
      algorithm: DB
      Backbone:
        name: ResNet_vd
        in_channels: 3
        layers: 50
      Neck:
        name: LKPAN
        out_channels: 256
      Head:
        name: DBHead
        kernel_list:
        - 7
        - 2
        - 2
        k: 50
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDilaDBLoss:
      weight: 1.0
      model_name_pairs:
      - - Student
        - Teacher
      - - Student2
        - Teacher
      key: maps
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
  - DistillationDMLLoss:
      model_name_pairs:
      - Student
      - Student2
      maps_name: thrink_maps
      weight: 1.0
      key: maps
  - DistillationDBLoss:
      weight: 1.0
      model_name_list:
      - Student
      - Student2
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 5.0e-05
PostProcess:
  name: DistillationDBPostProcess
  model_name:
  - Student
  key: head_out
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.0
Metric:
  name: DistillationMetric
  base_metric_name: DetMetric
  main_indicator: hmean
  key: Student
Train:
  dataset:
    name: SimpleDataSet
    data_dir: data/detDataYOLOLabel/INVOICE_IMEI
    label_file_list:
      - data/detDataYOLOLabel/paddle_annotations_vSBT/Revised/dataset1.txt
      - data/detDataYOLOLabel/paddle_annotations_vSBT/Revised/dataset2.txt
      - data/detDataYOLOLabel/paddle_annotations_vSBT/Revised/dataset3.txt
    ratio_list: [ 1.0,1.0,0.75]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 640
        - 640
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: 500
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: 500
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 30
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/hoangdc/workspace/hoangdc/data/detDataYOLOLabel
    label_file_list:
      - data/detDataYOLOLabel/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest:
          limit_side_len: 320
          limit_type: min
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 8
profiler_options: null


Sundragon1993 • Jan 18 '24 03:01

I have also tried with:

- DetResizeForTest:
          limit_side_len: 960
          limit_type: max

But the error still persists.
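
For context, here is a rough sketch of how the two `limit_type` modes pick a target size. It assumes the transform scales the image by its limiting side and then rounds each side to a multiple of 32; that is my reading of the behavior, not code copied from PaddleOCR, and `det_resize_target` is a hypothetical name. Assuming the snippet above was edited in the Eval section, as in the full config, it changes only the evaluation inputs and would not be expected to change the training loss.

```python
def det_resize_target(h, w, limit_side_len, limit_type):
    """Return (resize_h, resize_w) for an h x w image (illustrative only)."""
    if limit_type == "max":
        # Shrink only when the longer side exceeds the limit.
        longer = max(h, w)
        ratio = limit_side_len / longer if longer > limit_side_len else 1.0
    elif limit_type == "min":
        # Enlarge only when the shorter side is below the limit.
        shorter = min(h, w)
        ratio = limit_side_len / shorter if shorter < limit_side_len else 1.0
    else:
        raise ValueError(f"unsupported limit_type: {limit_type}")

    # Assumption: each side is snapped to a multiple of 32, with a floor of 32.
    resize_h = max(int(round(int(h * ratio) / 32) * 32), 32)
    resize_w = max(int(round(int(w * ratio) / 32) * 32), 32)
    return resize_h, resize_w


if __name__ == "__main__":
    # Compare the two settings tried in this issue on a 720 x 1280 image.
    print(det_resize_target(720, 1280, 320, "min"))
    print(det_resize_target(720, 1280, 960, "max"))
```

Under these assumptions, `limit_type: min` with `limit_side_len: 320` leaves most images near their original size, while `limit_type: max` with `limit_side_len: 960` shrinks large images.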

Sundragon1993 • Jan 18 '24 03:01

ch_PP-OCRv4_det_cml still has unresolved bugs. Please use another config for now.

tink2123 • Jan 18 '24 09:01

@tink2123 det_cml is just teacher-student (x2) distillation (with a KL loss that makes the smaller students mimic the teacher's output), so you can simply use det_student as a baseline.
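
To illustrate the KL idea mentioned above, here is a minimal pure-Python sketch of a symmetric KL term between two per-pixel probability maps. It is only an illustration of the concept, not PaddleOCR's DistillationDMLLoss or DistillationDilaDBLoss; the names `bernoulli_kl` and `mutual_kl` are hypothetical, and treating each pixel as a Bernoulli probability is my assumption.

```python
import math


def bernoulli_kl(p, q, eps=1e-7):
    """KL(p || q) for one pixel, treating p and q as Bernoulli probabilities."""
    # Clamp away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def mutual_kl(map_a, map_b):
    """Symmetric KL averaged over all pixels of two same-sized 2D maps."""
    pairs = [
        (a, b)
        for row_a, row_b in zip(map_a, map_b)
        for a, b in zip(row_a, row_b)
    ]
    total = sum(bernoulli_kl(a, b) + bernoulli_kl(b, a) for a, b in pairs)
    return total / (2 * len(pairs))


if __name__ == "__main__":
    student = [[0.9, 0.1], [0.2, 0.8]]
    student2 = [[0.7, 0.3], [0.4, 0.6]]
    print(mutual_kl(student, student2))  # positive: the maps disagree
    print(mutual_kl(student, student))   # zero: the maps are identical
```

In this sketch the term is zero only when the two maps agree at every pixel, and it grows as their predictions diverge.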

The bug has been fixed in PR #11646.