Seeing `ValueError: not enough values to unpack (expected 4, got 3)` in `ppocr/losses/det_db_loss.py` when fine-tuning detection model `ch_PP-OCRv3` with `ch_PP-OCRv3_det_student.yml`

Open nicolaskodak opened this issue 4 months ago • 0 comments

🔎 Search before asking

  • [X] I have searched the PaddleOCR Docs and found no similar bug report.
  • [X] I have searched the PaddleOCR Issues and found no similar bug report.
  • [X] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (问题描述)

To fine-tune the detection model, I created a config file (modified from ch_PP-OCRv3_det_student.yml) and downloaded ICDAR2015 following this readme.

  • a config file has been created: configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.v2.yml
  • data and annotations have been placed accordingly: train_data/icdar2015/train_icdar2015_label.txt, train_data/icdar2015/ch4_training_images, train_data/icdar2015/test_icdar2015_label.txt, train_data/icdar2015/ch4_test_images.

The content of the config file is shown below:

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/ch_PP-OCR_V3_det_v2/
  save_epoch_step: 10
  eval_batch_step:
  - 0
  - 40 
  cal_metric_during_train: false
  pretrained_model: /data1/image/models/paddle/ch_PP-OCRv3_det_distill_train/student.pdparams
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true

Architecture:
  model_type: det
  algorithm: DB
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
    disable_se: True
  Neck:
    name: RSEFPN
    out_channels: 96
    shortcut: True
  Head:
    name: DBHead
    k: 50

Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0001 ### 0.001 ### edited by kota
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 5.0e-05
PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5
Metric:
  name: DetMetric
  main_indicator: hmean
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/
    label_file_list:
      - ./train_data/icdar2015/train_icdar2015_label.txt
    ratio_list: [1.0]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 960
        - 960
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 4 # 8
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/
    label_file_list:
      - ./train_data/icdar2015/test_icdar2015_label.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest: null
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image 
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1 # 1
    num_workers: 2
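The Train and Eval pipelines in this config end with different KeepKeys lists. As a rough sketch (the lists are copied by hand from the config above, and `only_in` is a hypothetical helper, not part of PaddleOCR), the difference can be made explicit like this:

```python
# Keys kept by the KeepKeys transform, copied by hand from the config above.
TRAIN_KEEP_KEYS = ["image", "threshold_map", "threshold_mask",
                   "shrink_map", "shrink_mask"]
EVAL_KEEP_KEYS = ["image", "shape", "polys", "ignore_tags"]


def only_in(first, second):
    """Return the keys of `first` that are absent from `second`, in order."""
    return [key for key in first if key not in second]


# Number of elements each dataloader would yield per batch.
print(len(TRAIN_KEEP_KEYS), len(EVAL_KEEP_KEYS))

# Keys that exist only on the training side, then only on the eval side.
print(only_in(TRAIN_KEEP_KEYS, EVAL_KEEP_KEYS))
print(only_in(EVAL_KEEP_KEYS, TRAIN_KEEP_KEYS))
```

Under this config, a training batch carries five elements and an evaluation batch carries four, with only `image` in common.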

To start fine-tuning, I ran

python tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.v2.yml

During fine-tuning, when training reached the 40th step, an evaluation on data from valid_dataloader was run; however, an exception was thrown and the process terminated, as shown below:

Traceback (most recent call last):
  File "tools/train.py", line 208, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 180, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "/home/kota/ocr/PaddleOCR/tools/program.py", line 387, in train
    cur_metric = eval(
  File "/home/kota/ocr/PaddleOCR/tools/program.py", line 548, in eval
    eval_loss = loss_class( preds, batch)['loss']
  File "/home/kota/py38ocr/lib/python3.8/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/kota/ocr/PaddleOCR/ppocr/losses/det_db_loss.py", line 58, in forward
    label_threshold_map, label_threshold_mask, label_shrink_map, label_shrink_mask = labels[
ValueError: not enough values to unpack (expected 4, got 3)

As far as I can tell, valid_dataloader doesn't yield batches of 5 elements. Each batch has 4 elements instead, and their shapes don't look like those of label_threshold_map, label_threshold_mask, label_shrink_map, and label_shrink_mask.
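The error message itself can be reproduced without Paddle. The sketch below assumes that the loss drops the first batch element (the image) before unpacking, which would be consistent with "expected 4, got 3" for a 4-element batch; the traceback line is truncated, so that slicing is an assumption. `unpack_db_labels` is a hypothetical stand-in, and strings take the place of tensors:

```python
def unpack_db_labels(batch):
    """Mimic the four-way unpacking reported at det_db_loss.py line 58.

    Assumption: the first element (the image) is sliced off before the
    remaining elements are unpacked into the four label maps.
    """
    threshold_map, threshold_mask, shrink_map, shrink_mask = batch[1:]
    return threshold_map, threshold_mask, shrink_map, shrink_mask


# Stand-ins for batches; the element names follow the KeepKeys lists in the config.
train_batch = ["image", "threshold_map", "threshold_mask",
               "shrink_map", "shrink_mask"]
eval_batch = ["image", "shape", "polys", "ignore_tags"]

# A 5-element training batch leaves exactly four labels to unpack.
unpack_db_labels(train_batch)

# A 4-element eval batch leaves only three, so the unpacking raises.
try:
    unpack_db_labels(eval_batch)
except ValueError as err:
    message = str(err)
    print(message)  # not enough values to unpack (expected 4, got 3)
```

If that assumption holds, the loss is being called on an evaluation batch that never contained the threshold and shrink maps, which would also explain why the shapes look unfamiliar.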

Could anyone shed some light on this? Many thanks.

🏃‍♂️ Environment (运行环境)

OS

Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

Device

device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.2, Runtime API Version: 11.7, cuDNN Version: 8.5.

paddle-related packages are shown below

numpy==1.22.0
paddle-bfloat==0.1.7
paddle-serving-app @ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.8.3-py3-none-any.whl
paddle-serving-client @ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.8.3-cp38-none-any.whl
paddle-serving-server @ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server-0.8.3-py3-none-any.whl
paddle2onnx==1.0.9
paddlefsl==1.1.0
paddlenlp==2.5.2
paddleocr==2.7.0.3
paddlepaddle==2.5.0
paddlepaddle-gpu==2.5.0

🌰 Minimal Reproducible Example (最小可复现问题的Demo)

python tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.v2.yml

nicolaskodak, Oct 14 '24 10:10