PaddleOCR2Pytorch

en_PP-OCRv3_rec convert/inference error

Open: gliufetch opened this issue 2 years ago • 1 comment

All of this concerns the en_PP-OCRv3_rec model:

  1. Convert the model (https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar)
(.venv) ➜  PaddleOCR2Pytorch git:(test) ✗ python3 ./converter/multilingual_ppocr_v3_rec_converter.py --src_model_path en_PP-OCRv3_rec_train
out_channels:  97
<class 'dict'> {}
{'fc_decay': 2e-05, 'in_channels': 64}
model is loaded.
en_PP-OCRv3_rec_train/best_accuracy
out: 40.0 0.010309278 0.9644629 2.816156e-07
model is saved: en_ptocr_v3_rec_infer.pth
done.

Question 1: Which converter should be used for the en_PP-OCRv3_rec_train model: multilingual_ppocr_v3_rec_converter.py or ch_ppocr_v3_rec_converter.py?

  2. Inference
(.venv) ➜  PaddleOCR2Pytorch git:(test) ✗ python3 ./tools/infer/predict_rec.py --rec_model_path en_ptocr_v3_rec_infer.pth --rec_image_shape 3,48,320 --image_dir ./doc/imgs_words/en/word_1.png  --rec_yaml_path en_PP-OCRv3_rec_v2.5.yml  --rec_char_dict_path en_dict.txt  --image_dir ./doc/imgs_words/en/word_1.png 
Traceback (most recent call last):
  File "/**/PaddleOCR2Pytorch/./tools/infer/predict_rec.py", line 463, in <module>
    main(utility.parse_args())
  File "/**/PaddleOCR2Pytorch/./tools/infer/predict_rec.py", line 437, in main
    text_recognizer = TextRecognizer(args)
  File "/**/PaddleOCR2Pytorch/./tools/infer/predict_rec.py", line 90, in __init__
    super(TextRecognizer, self).__init__(network_config, **kwargs)
  File "/**/PaddleOCR2Pytorch/pytorchocr/base_ocr_v20.py", line 13, in __init__
    self.build_net(**kwargs)
  File "/**/PaddleOCR2Pytorch/pytorchocr/base_ocr_v20.py", line 18, in build_net
    self.net = BaseModel(self.config, **kwargs)
  File "/**/PaddleOCR2Pytorch/pytorchocr/modeling/architectures/base_model.py", line 63, in __init__
    self.head = build_head(config["Head"], **kwargs)
  File "/**/PaddleOCR2Pytorch/pytorchocr/modeling/heads/__init__.py", line 47, in build_head
    assert module_name in support_dict, Exception('head only support {}'.format(
AssertionError: head only support ['DBHead', 'PSEHead', 'EASTHead', 'SASTHead', 'CTCHead', 'ClsHead', 'AttentionHead', 'SRNHead', 'PGHead', 'Transformer', 'TableAttentionHead', 'SARHead', 'FCEHead', 'CANHead']

Question 2: Is this because the config at https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml sets

  Head:
    name: MultiHead

for en_PP-OCRv3_rec?

Question 3: Does PaddleOCR2Pytorch support the English PP-OCRv3 recognition model that uses the SVTR_LCNet algorithm (like the one in release 2.7), i.e. a config with

Architecture:
  model_type: rec
  algorithm: SVTR_LCNet

for en_PP-OCRv3_rec?

gliufetch, Dec 22 '23 04:12

Figured it out:

  1. Model conversion should use ./converter/multilingual_ppocr_v3_rec_converter.py for the English recognition model.

  2. The yml file I passed for prediction was not the right one. Use https://github.com/frotms/PaddleOCR2Pytorch/blob/main/configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml instead. (This file differs from the 'default' one in PaddleOCR.)
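For anyone hitting the same assertion, the mismatch in point 2 can be spotted before running predict_rec.py by comparing the head name in the config with the list printed in the error. Below is a minimal sketch, not part of PaddleOCR2Pytorch: `check_head` is a hypothetical helper, the supported names are copied from the AssertionError above, and it assumes the config has already been parsed into a dict (for example with a YAML loader) with `Head` either at the top level or under `Architecture`.

```python
# Head names copied verbatim from the AssertionError message in this issue.
SUPPORTED_HEADS = [
    "DBHead", "PSEHead", "EASTHead", "SASTHead", "CTCHead", "ClsHead",
    "AttentionHead", "SRNHead", "PGHead", "Transformer",
    "TableAttentionHead", "SARHead", "FCEHead", "CANHead",
]


def check_head(config):
    """Return (ok, head_name) for an already-parsed config dict.

    Hypothetical helper. Assumption: "Head" sits either directly in
    `config` or inside an "Architecture" mapping.
    """
    arch = config.get("Architecture", config)
    head = arch.get("Head") or {}
    head_name = head.get("name")
    return head_name in SUPPORTED_HEADS, head_name


# The head name quoted from the upstream PaddleOCR config in this issue.
upstream_config = {"Architecture": {"Head": {"name": "MultiHead"}}}
ok, name = check_head(upstream_config)
if not ok:
    print("unsupported head:", name)
```

With the upstream config the check reports `MultiHead` as unsupported, which matches the traceback; a config whose head is one of the listed names passes.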

Thank you for the work!

gliufetch, Dec 27 '23 15:12