Chinese layout analysis on CDLA: the validation bbox AP of my self-trained model is several points lower than the official picodet_lcnet_x1_0_fgd_layout_cdla

Problem Description
Runtime Environment
- OS: Linux, 2 × V100 (32 GB)
- Paddle: 2.5.0
- PaddleOCR: 2.7.0.3
I trained with the official demo config but without FGD distillation, and scaled the learning rate proportionally to my batch setup (see the sketch below). The bbox AP on the validation set is 0.806. Please help me check whether there is a problem in my config. I also tried num_classes: 10, which scored lower than 11 classes, and I tried FGD distillation training, which was also low. I don't know what is going wrong, or what causes a gap of several AP points. Was picodet_lcnet_x1_0_fgd_layout_cdla trained with a different config or different data?
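For clarity, this is the proportional (linear) scaling rule I mean, as a minimal Python sketch. The "official" reference values below are hypothetical placeholders chosen only so the arithmetic lands on my base_lr of 0.1; substitute the real values from the released config.

```python
# Linear LR scaling rule sketch: lr scales with the global batch size.
# The "official" values below are hypothetical placeholders, not values
# confirmed from the released picodet layout config.
def scale_lr(official_base_lr: float,
             official_global_batch: int,
             my_global_batch: int) -> float:
    return official_base_lr * my_global_batch / official_global_batch

# My setup: 2 x V100, batch_size 24 per card -> global batch 48.
print(scale_lr(official_base_lr=0.2,      # hypothetical
               official_global_batch=96,  # hypothetical
               my_global_batch=2 * 24))   # -> 0.1, my base_lr
```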
At epoch 100:
picodet_lcnet_x1_0_layout_cdla.yml:
_BASE_: [
  '../../../../runtime.yml',
  '../../_base_/picodet_esnet.yml',
  '../../_base_/optimizer_100e.yml',
  '../../_base_/picodet_640_reader.yml',
]
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_0_pretrained.pdparams
weights: output/picodet_lcnet_x1_0_layout/model_final
find_unused_parameters: True
use_ema: true
cycle_epoch: 10
snapshot_epoch: 1
epoch: 100
PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHead
  nms_cpu: True

LCNet:
  scale: 1.0
  feature_maps: [3, 4, 5]
metric: COCO
num_classes: 11
TrainDataset:
  name: COCODataSet
  image_dir: ./
  anno_path: cdla_train_data/annotations.json
  dataset_dir: /home/liaolinchun/dpython_work/PaddleDetection-release-2.6/tools
  data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']

EvalDataset:
  name: COCODataSet
  image_dir: ./
  anno_path: cdla_train_data/annotations.json
  dataset_dir: /home/liaolinchun/dpython_work/PaddleDetection-release-2.6/tools

#EvalDataset:
#  name: COCODataSet
#  image_dir: ./
#  anno_path: cdla_val_data/annotations.json
#  dataset_dir: /home/liaolinchun/dpython_work/PaddleDetection-release-2.6/tools/

TestDataset:
  !ImageFolder
    anno_path: /home/liaolinchun/dpython_work/PaddleDetection-release-2.6/tools/cdla_val_data/annotations.json
worker_num: 8
eval_height: &eval_height 800
eval_width: &eval_width 608
eval_size: &eval_size [*eval_height, *eval_width]
TrainReader:
  sample_transforms:
    - Decode: {}
    - RandomCrop: {}
    - RandomFlip: {prob: 0.5}
    - RandomDistort: {}
  batch_transforms:
    - BatchRandomResize: {target_size: [[768, 576], [800, 608], [832, 640]], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
    - Permute: {}
  batch_size: 24
  shuffle: true
  drop_last: true
  collate_batch: false

EvalReader:
  sample_transforms:
    - Decode: {}
    - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
    - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
    - Permute: {}
  batch_transforms:
    - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false

TestReader:
  inputs_def:
    image_shape: [1, 3, 800, 608]
  sample_transforms:
    - Decode: {}
    - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
    - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
    - Permute: {}
  batch_transforms:
    - PadBatch: {pad_to_stride: 32}
  batch_size: 1
  shuffle: false
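On the num_classes: 10 vs. 11 question, one quick sanity check is counting the categories actually defined and used in the COCO annotation file. A generic sketch (the path matches my config above; this is just a standalone check, not part of PaddleDetection):

```python
# Count the categories in the CDLA COCO annotations to sanity-check
# the num_classes setting (10 labels vs. an extra slot).
import json

with open("cdla_train_data/annotations.json") as f:
    coco = json.load(f)

cat_ids = sorted(c["id"] for c in coco["categories"])
used_ids = sorted({a["category_id"] for a in coco["annotations"]})
print("categories:", len(cat_ids), cat_ids)
print("ids used in annotations:", used_ids)
```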
optimizer_100e.yml:
epoch: 100

LearningRate:
  base_lr: 0.1
  schedulers:
    - name: CosineDecay
      max_epochs: 100
    - name: LinearWarmup
      start_factor: 0.1
      steps: 300

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.00004
    type: L2
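For reference, this is roughly the LR curve the two schedulers above combine to produce: a 300-step linear warmup from 0.1 × base_lr, then cosine decay over 100 epochs. A simplified sketch only; steps_per_epoch is a made-up value, and PaddleDetection's exact CosineDecay implementation may differ in detail.

```python
import math

base_lr = 0.1
warmup_steps = 300
warmup_start_factor = 0.1
max_epochs = 100
steps_per_epoch = 200  # hypothetical; depends on dataset size and batch size

def lr_at(step: int) -> float:
    # LinearWarmup: ramp from start_factor * base_lr up to base_lr.
    if step < warmup_steps:
        frac = step / warmup_steps
        return base_lr * (warmup_start_factor + (1.0 - warmup_start_factor) * frac)
    # CosineDecay: cosine anneal from base_lr toward 0 over the full run.
    total_steps = max_epochs * steps_per_epoch
    progress = min(step / total_steps, 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

for s in (0, 150, 300, 10_000, 20_000):
    print(s, f"{lr_at(s):.5f}")
```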