PaddleOCR
best metric, acc: 0.0 on recognition
The training seems to be OK. The final epoch shows acc: 0.988281, but the best metric is acc: 0.0. What's wrong? I used my custom dataset and adjusted dict.txt.
- System Environment: Google Colab
This is my yml file:
Global:
  use_gpu: True
  epoch_num: 100
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec/ic15/
  save_epoch_step: 3
  # evaluation is run every 2000 iterations
  eval_batch_step: [0, 2000]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./
  use_visualdl: False
  infer_img: content/drive/MyDrive/PaddleOCr/gen2011/test/sample_301.png
  # for data or label process
  character_dict_path: ppocr/utils/ic15_dict.txt
  max_text_length: 100
  infer_mode: False
  use_space_char: False
  save_res_path: ./output/rec/predicts_ic15.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00001
Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 96
  Head:
    name: CTCHead
    fc_decay: 0.00001
Loss:
  name: CTCLoss
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/custom_dataset/train/
    label_file_list: ["./train_data/custom_dataset/rec_gt_train.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8
    use_shared_memory: False
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/custom_dataset/test
    label_file_list: ["./train_data/custom_dataset/rec_gt_test.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 4
    use_shared_memory: False
and I ran this
and I got this result
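One thing that may be worth checking here: the accuracy printed during training comes from the training batches (cal_metric_during_train), while the "best metric" is presumably only updated when an evaluation pass runs. With eval_batch_step: [0, 2000], a small dataset might never reach an evaluation step at all. The sketch below estimates the steps at which evaluation would fire. The trigger rule (step is past the start value and a multiple of the interval after it) and the dataset sizes are assumptions for illustration, not values taken from this thread.

```python
def eval_steps(num_samples, batch_size, epochs, eval_batch_step, drop_last=True):
    """Return the global steps at which evaluation would be triggered.

    Assumed rule: evaluate when step > start and (step - start) % interval == 0.
    """
    start, interval = eval_batch_step
    if drop_last:
        iters_per_epoch = num_samples // batch_size      # incomplete last batch is dropped
    else:
        iters_per_epoch = -(-num_samples // batch_size)  # ceiling division
    total_steps = iters_per_epoch * epochs
    return [
        step
        for step in range(1, total_steps + 1)
        if step > start and (step - start) % interval == 0
    ]


# Hypothetical training-set sizes, combined with the batch size, epoch count
# and eval_batch_step from the config above.
small = eval_steps(3000, 256, 100, [0, 2000])
large = eval_steps(64000, 256, 100, [0, 2000])
print("small dataset eval steps:", small)
print("large dataset eval count:", len(large))
```

If the list comes back empty for the real dataset size, lowering the second value of eval_batch_step (for example to a few hundred) would be one way to make evaluation run during training.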
Is the dict.txt you adjusted the same file as the one in your command, "Global.character_dict_path=ppocr/utils/ic15_dict.txt"?
@drenched9 I added more English characters to ic15_dict.txt.
@Yosiiiiiiiiiiiiiiii Hi man, I have a question: did you disable RecConAug on purpose? And how large was the training set you used? I can see your model's training accuracy is pretty good. I'm training my model on a large 9M-sample dataset, which usually reaches high training accuracy (above 95%) with models like vanilla CRNN or SAR, but with PPOCRv3 the accuracy drops to 83%.
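Since the dictionary was edited by hand, a quick consistency check between the label files and the dictionary could rule out one possible cause. The sketch below assumes the usual label layout of one `image_path<TAB>text` entry per line and a dictionary with one character per line. The function name and the sample data are hypothetical.

```python
def missing_chars(label_lines, dict_lines):
    """Return the sorted label characters that are absent from the dictionary."""
    vocab = {line.rstrip("\r\n") for line in dict_lines}
    missing = set()
    for line in label_lines:
        parts = line.rstrip("\r\n").split("\t", 1)
        if len(parts) != 2:
            continue  # skip lines that are not "path<TAB>text"
        missing.update(ch for ch in parts[1] if ch not in vocab)
    return sorted(missing)


# Hypothetical in-memory data; in practice, read the lines from
# rec_gt_test.txt and ic15_dict.txt with open(..., encoding="utf-8").
labels = ["img_1.png\tHello\n", "img_2.png\tWorld!\n"]
dictionary = ["H\n", "e\n", "l\n", "o\n", "W\n", "r\n", "d\n"]
print(missing_chars(labels, dictionary))
```

Running it on both the train and test label files shows whether any label contains characters the model cannot output. Note that with use_space_char: False, spaces in labels would also show up as missing.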
Hi @bely66
"RecConAug on purpose?" >> I didn't change anything. I was training CRNN, following rec_icdar15_train.yml.
"I can see the training accuracy of your model doing pretty well." >> No, it wasn't. It was overfitting, so I added 64k samples to the training set and the accuracy went above 98%.
I can't get training to work with PPOCRv3. How did you do that? I posted my issue here: https://github.com/PaddlePaddle/PaddleOCR/issues/8178 Please help if you can ^^
@Yosiiiiiiiiiiiiiiii The model is seriously overfitting. It is recommended to load the pre-trained model and reduce the learning rate (try reducing it to 0.0001).
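As a rough sketch of how that suggestion could be applied without editing the yml file, the snippet below builds a training command that overrides the two settings through the `-o` option of tools/train.py. The pre-trained model path is a placeholder, and the config path assumes the stock rec_icdar15_train.yml location in the repository. Both should be adjusted to the actual setup.

```python
def build_train_command(config_path, pretrained_model, learning_rate):
    """Build an argument list for tools/train.py with config overrides."""
    return [
        "python3", "tools/train.py",
        "-c", config_path,
        "-o",
        f"Global.pretrained_model={pretrained_model}",
        f"Optimizer.lr.learning_rate={learning_rate}",
    ]


cmd = build_train_command(
    "configs/rec/rec_icdar15_train.yml",          # assumed config location
    "./pretrain_models/your_model/best_accuracy",  # placeholder path
    0.0001,                                        # learning rate suggested above
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) from the repository root, or the printed line can be pasted into a Colab cell with a leading `!`.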