PaddleOCR
wrong result in RFL trained model
Please provide the following information to quickly locate the problem
- System Environment: Windows 11, Intel 12700, RTX 3070
- Version (Paddle, PaddleOCR, related components): paddleocr 2.7.0.2, paddlepaddle 2.5.2, paddlepaddle-gpu 2.5.2
- Command Code: python tools/infer_rec.py -c train/namesVnew2.yml -o Global.pretrained_model=output/namesV3new/latest Global.infer_img=dataset-kartmelly-eval/img14.png char_dict_path=train/namesV2.txt
- Complete Error Message: The problem is that I am getting the wrong result. I trained an RFL model from scratch using an example yml file from the repo. The model converges nicely, but at inference I get a completely wrong result (a number) that is not even in the character dictionary. I tried setting the character dictionary manually and tried other versions, with no luck. I think I am making an obvious mistake.
yml:
Global:
  debug: false
  use_gpu: true
  epoch_num: 10
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/namesV3new
  save_epoch_step: 25
  eval_batch_step: 50
  cal_metric_during_train: true
  pretrained_model: ./output/namesV3newcp/best_accuracy
  checkpoints: null
  save_inference_dir: ./output/namesV3_inference
  use_visualdl: false
  infer_img: ./doc/imgs_words/arabic/ar_2.jpg
  character_dict_path: ./train/namesV2.txt
  max_text_length: 50
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/names/predicts_ppocrv3.txt
Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  weight_decay: 0.0
  clip_norm_global: 5.0
  lr:
    name: Piecewise
    decay_epochs:
    - 3
    - 5
    - 7
    - 9
    values:
    - 9.0e-05
    - 2.7e-05
    - 8.1e-06
    - 2.43e-06
    - 7.29e-07
Architecture:
  model_type: rec
  algorithm: RFL
  in_channels: 1
  Transform:
    name: TPS
    num_fiducial: 20
    loc_lr: 1.0
    model_name: large
  Backbone:
    name: ResNetRFL
    use_cnt: true
    use_seq: false
  Neck:
    name: RFAdaptor
    use_v2s: false
    use_s2v: false
  Head:
    name: RFLHead
    in_channels: 512
    hidden_size: 256
    batch_max_legnth: 25
    out_channels: 38
    use_cnt: true
    use_seq: false
Loss:
  name: RFLLoss
PostProcess:
  name: RFLLabelDecode
Metric:
  name: CNTMetric
  main_indicator: acc
Train:
  dataset:
    name: SimpleDataSet
    data_dir: C:\Users\amm\PycharmProjects\DataGeneration\
    ext_op_transform_idx: 1
    label_file_list:
    - C:\Users\amm\PycharmProjects\DataGeneration\words-train\dataset.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RFLLabelEncode: null
    - RFLRecResizeImg:
        image_shape:
        - 1
        - 32
        - 100
        padding: false
        interpolation: 2
    - KeepKeys:
        keep_keys:
        - image
        - label
        - length
        - cnt_label
  loader:
    shuffle: true
    batch_size_per_card: 300
    drop_last: true
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: C:\Users\amm\PycharmProjects\DataGeneration\
    label_file_list:
    - C:\Users\amm\PycharmProjects\DataGeneration\words-eval\dataset.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RFLLabelEncode: null
    - RFLRecResizeImg:
        image_shape:
        - 1
        - 32
        - 100
        padding: false
        interpolation: 2
    - KeepKeys:
        keep_keys:
        - image
        - label
        - length
        - cnt_label
  loader:
    shuffle: true
    batch_size_per_card: 300
    drop_last: true
    num_workers: 8
profiler_options: null
cmd:
python tools/infer_rec.py -c train/namesVnew2.yml -o Global.pretrained_model=output/namesV3new/latest Global.infer_img=dataset-kartmelly-eval/img1.jpg char_dict_path=train/namesV2.txt
[2024/04/22 16:04:57] ppocr INFO: Architecture :
[2024/04/22 16:04:57] ppocr INFO: Backbone :
[2024/04/22 16:04:57] ppocr INFO: name : ResNetRFL
[2024/04/22 16:04:57] ppocr INFO: use_cnt : True
[2024/04/22 16:04:57] ppocr INFO: use_seq : False
[2024/04/22 16:04:57] ppocr INFO: Head :
[2024/04/22 16:04:57] ppocr INFO: batch_max_legnth : 25
[2024/04/22 16:04:57] ppocr INFO: hidden_size : 256
[2024/04/22 16:04:57] ppocr INFO: in_channels : 512
[2024/04/22 16:04:57] ppocr INFO: name : RFLHead
[2024/04/22 16:04:57] ppocr INFO: out_channels : 38
[2024/04/22 16:04:57] ppocr INFO: use_cnt : True
[2024/04/22 16:04:57] ppocr INFO: use_seq : False
[2024/04/22 16:04:57] ppocr INFO: Neck :
[2024/04/22 16:04:57] ppocr INFO: name : RFAdaptor
[2024/04/22 16:04:57] ppocr INFO: use_s2v : False
[2024/04/22 16:04:57] ppocr INFO: use_v2s : False
[2024/04/22 16:04:57] ppocr INFO: Transform :
[2024/04/22 16:04:57] ppocr INFO: loc_lr : 1.0
[2024/04/22 16:04:57] ppocr INFO: model_name : large
[2024/04/22 16:04:57] ppocr INFO: name : TPS
[2024/04/22 16:04:57] ppocr INFO: num_fiducial : 20
[2024/04/22 16:04:57] ppocr INFO: algorithm : RFL
[2024/04/22 16:04:57] ppocr INFO: in_channels : 1
[2024/04/22 16:04:57] ppocr INFO: model_type : rec
[2024/04/22 16:04:57] ppocr INFO: Eval :
[2024/04/22 16:04:57] ppocr INFO: dataset :
[2024/04/22 16:04:57] ppocr INFO: data_dir : C:\Users\amm\PycharmProjects\DataGeneration\
[2024/04/22 16:04:57] ppocr INFO: label_file_list : ['C:\\Users\\amm\\PycharmProjects\\DataGeneration\\words-eval\\dataset.txt']
[2024/04/22 16:04:57] ppocr INFO: name : SimpleDataSet
[2024/04/22 16:04:57] ppocr INFO: transforms :
[2024/04/22 16:04:57] ppocr INFO: DecodeImage :
[2024/04/22 16:04:57] ppocr INFO: channel_first : False
[2024/04/22 16:04:57] ppocr INFO: img_mode : BGR
[2024/04/22 16:04:57] ppocr INFO: RFLLabelEncode : None
[2024/04/22 16:04:57] ppocr INFO: RFLRecResizeImg :
[2024/04/22 16:04:57] ppocr INFO: image_shape : [1, 32, 100]
[2024/04/22 16:04:57] ppocr INFO: interpolation : 2
[2024/04/22 16:04:57] ppocr INFO: padding : False
[2024/04/22 16:04:57] ppocr INFO: KeepKeys :
[2024/04/22 16:04:57] ppocr INFO: keep_keys : ['image', 'label', 'length', 'cnt_label']
[2024/04/22 16:04:57] ppocr INFO: loader :
[2024/04/22 16:04:57] ppocr INFO: batch_size_per_card : 300
[2024/04/22 16:04:57] ppocr INFO: drop_last : True
[2024/04/22 16:04:57] ppocr INFO: num_workers : 8
[2024/04/22 16:04:57] ppocr INFO: shuffle : True
[2024/04/22 16:04:57] ppocr INFO: Global :
[2024/04/22 16:04:57] ppocr INFO: cal_metric_during_train : True
[2024/04/22 16:04:57] ppocr INFO: character_dict_path : ./train/namesV2.txt
[2024/04/22 16:04:57] ppocr INFO: checkpoints : None
[2024/04/22 16:04:57] ppocr INFO: debug : False
[2024/04/22 16:04:57] ppocr INFO: distributed : False
[2024/04/22 16:04:57] ppocr INFO: epoch_num : 10
[2024/04/22 16:04:57] ppocr INFO: eval_batch_step : 50
[2024/04/22 16:04:57] ppocr INFO: infer_img : dataset-kartmelly-eval/img1.jpg
[2024/04/22 16:04:57] ppocr INFO: infer_mode : True
[2024/04/22 16:04:57] ppocr INFO: log_smooth_window : 20
[2024/04/22 16:04:57] ppocr INFO: max_text_length : 50
[2024/04/22 16:04:57] ppocr INFO: pretrained_model : output/namesV3new/latest
[2024/04/22 16:04:57] ppocr INFO: print_batch_step : 10
[2024/04/22 16:04:57] ppocr INFO: save_epoch_step : 25
[2024/04/22 16:04:57] ppocr INFO: save_inference_dir : ./output/namesV3_inference
[2024/04/22 16:04:57] ppocr INFO: save_model_dir : ./output/namesV3new
[2024/04/22 16:04:57] ppocr INFO: save_res_path : ./output/names/predicts_ppocrv3.txt
[2024/04/22 16:04:57] ppocr INFO: use_gpu : True
[2024/04/22 16:04:57] ppocr INFO: use_space_char : True
[2024/04/22 16:04:57] ppocr INFO: use_visualdl : False
[2024/04/22 16:04:57] ppocr INFO: Loss :
[2024/04/22 16:04:57] ppocr INFO: name : RFLLoss
[2024/04/22 16:04:57] ppocr INFO: Metric :
[2024/04/22 16:04:57] ppocr INFO: main_indicator : acc
[2024/04/22 16:04:57] ppocr INFO: name : CNTMetric
[2024/04/22 16:04:57] ppocr INFO: Optimizer :
[2024/04/22 16:04:57] ppocr INFO: beta1 : 0.9
[2024/04/22 16:04:57] ppocr INFO: beta2 : 0.999
[2024/04/22 16:04:57] ppocr INFO: clip_norm_global : 5.0
[2024/04/22 16:04:57] ppocr INFO: lr :
[2024/04/22 16:04:57] ppocr INFO: decay_epochs : [3, 5, 7, 9]
[2024/04/22 16:04:57] ppocr INFO: name : Piecewise
[2024/04/22 16:04:57] ppocr INFO: values : [9e-05, 2.7e-05, 8.1e-06, 2.43e-06, 7.29e-07]
[2024/04/22 16:04:57] ppocr INFO: name : AdamW
[2024/04/22 16:04:57] ppocr INFO: weight_decay : 0.0
[2024/04/22 16:04:57] ppocr INFO: PostProcess :
[2024/04/22 16:04:57] ppocr INFO: name : RFLLabelDecode
[2024/04/22 16:04:57] ppocr INFO: Train :
[2024/04/22 16:04:57] ppocr INFO: dataset :
[2024/04/22 16:04:57] ppocr INFO: data_dir : C:\Users\amm\PycharmProjects\DataGeneration\
[2024/04/22 16:04:57] ppocr INFO: ext_op_transform_idx : 1
[2024/04/22 16:04:57] ppocr INFO: label_file_list : ['C:\\Users\\amm\\PycharmProjects\\DataGeneration\\words-train\\dataset.txt']
[2024/04/22 16:04:57] ppocr INFO: name : SimpleDataSet
[2024/04/22 16:04:57] ppocr INFO: transforms :
[2024/04/22 16:04:57] ppocr INFO: DecodeImage :
[2024/04/22 16:04:57] ppocr INFO: channel_first : False
[2024/04/22 16:04:57] ppocr INFO: img_mode : BGR
[2024/04/22 16:04:57] ppocr INFO: RFLLabelEncode : None
[2024/04/22 16:04:57] ppocr INFO: RFLRecResizeImg :
[2024/04/22 16:04:57] ppocr INFO: image_shape : [1, 32, 100]
[2024/04/22 16:04:57] ppocr INFO: interpolation : 2
[2024/04/22 16:04:57] ppocr INFO: padding : False
[2024/04/22 16:04:57] ppocr INFO: KeepKeys :
[2024/04/22 16:04:57] ppocr INFO: keep_keys : ['image', 'label', 'length', 'cnt_label']
[2024/04/22 16:04:57] ppocr INFO: loader :
[2024/04/22 16:04:57] ppocr INFO: batch_size_per_card : 300
[2024/04/22 16:04:57] ppocr INFO: drop_last : True
[2024/04/22 16:04:57] ppocr INFO: num_workers : 8
[2024/04/22 16:04:57] ppocr INFO: shuffle : True
[2024/04/22 16:04:57] ppocr INFO: char_dict_path : train/namesV2.txt
[2024/04/22 16:04:57] ppocr INFO: profiler_options : None
[2024/04/22 16:04:57] ppocr INFO: train with paddle 2.5.2 and device Place(gpu:0)
W0422 16:04:57.521317 10844 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.4, Runtime API Version: 11.8
W0422 16:04:57.536936 10844 gpu_resources.cc:149] device: 0, cuDNN Version: 8.6.
[2024/04/22 16:04:57] ppocr INFO: load pretrain successful from output/namesV3new/latest
[2024/04/22 16:04:57] ppocr INFO: infer_img: dataset-kartmelly-eval/img1.jpg
[2024/04/22 16:04:58] ppocr INFO: result: 16
[2024/04/22 16:04:58] ppocr INFO: success!
dict char: namesV2.txt
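One possible reading of `result: 16`, offered as an assumption and not confirmed anywhere in this thread: the config above sets `use_cnt: true` and `use_seq: false` and uses `CNTMetric`, which looks like a counting-only setup. If so, the model would be predicting how many characters are in the image, not the text itself, and the decoder would report that count. The sketch below only illustrates the idea; `decode_count` and the example scores are hypothetical and are not taken from PaddleOCR, so it is worth checking the `RFLLabelDecode` implementation in the repo before relying on this.

```python
def decode_count(cnt_scores):
    """Collapse per-class count scores into a single predicted text length.

    Assumption: a counting branch emits one non-negative score per character
    class, and the total number of characters is the rounded sum of the scores.
    """
    return round(sum(cnt_scores))


# Hypothetical scores for one image: 38 classes, matching out_channels: 38.
example_scores = [0.0] * 38
example_scores[3] = 7.9
example_scores[10] = 5.2
example_scores[21] = 3.1

predicted_length = decode_count(example_scores)
print(predicted_length)
```

If this reading is right, an integer such as 16 would be expected output for a counting-only model, and getting text would need a config that enables the sequence branch (`use_seq: true`).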
What are the contents of the file at ./output/names/predicts_ppocrv3.txt?
The same wrong result.
Try this command. Set Global.checkpoints to your trained model instead of Global.pretrained_model:
python tools/infer_rec.py -c train/namesVnew2.yml -o Global.checkpoints=output/namesV3new/latest Global.infer_img=dataset-kartmelly-eval/img14.png char_dict_path=train/namesV2.txt
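A side note on the `-o` overrides: in the log above, `char_dict_path : train/namesV2.txt` is printed at the top level, next to `profiler_options`, while `Global.character_dict_path` keeps its value from the yml. That is consistent with dotted keys being merged by path, so a key without the `Global.` prefix becomes a new top-level entry. The sketch below shows that behaviour under stated assumptions: `apply_overrides` is a hypothetical helper written for illustration, not PaddleOCR's own merge code, and it keeps every value as a string.

```python
def apply_overrides(config, overrides):
    """Merge 'a.b=value' style overrides into a nested dict, in place.

    Hypothetical helper: each dotted key is walked as a path, and missing
    intermediate levels are created as empty dicts.
    """
    for item in overrides:
        key, value = item.split("=", 1)
        parts = key.split(".")
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config


# A small subset of the Global section from the yml in this issue.
config = {
    "Global": {
        "pretrained_model": "./output/namesV3newcp/best_accuracy",
        "checkpoints": None,
        "character_dict_path": "./train/namesV2.txt",
    }
}

apply_overrides(config, [
    "Global.checkpoints=output/namesV3new/latest",
    "char_dict_path=train/namesV2.txt",  # no "Global." prefix
])

print(config["Global"]["checkpoints"])
print(config["char_dict_path"])
```

Under this reading, the dictionary override would need to be written as `Global.character_dict_path=train/namesV2.txt` to replace the value the model actually reads.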