en_PP-OCRv3_rec convert/inference error
All of this is about the en_PP-OCRv3_rec model:
- convert model (https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar)
(.venv) ➜ PaddleOCR2Pytorch git:(test) ✗ python3 ./converter/multilingual_ppocr_v3_rec_converter.py --src_model_path en_PP-OCRv3_rec_train
out_channels: 97
<class 'dict'> {}
{'fc_decay': 2e-05, 'in_channels': 64}
model is loaded.
en_PP-OCRv3_rec_train/best_accuracy
out: 40.0 0.010309278 0.9644629 2.816156e-07
model is saved: en_ptocr_v3_rec_infer.pth
done.
Question 1: which converter script should be used to convert the en_PP-OCRv3_rec_train model, multilingual_ppocr_v3_rec_converter.py or just ch_ppocr_v3_rec_converter.py?
- inference
(.venv) ➜ PaddleOCR2Pytorch git:(test) ✗ python3 ./tools/infer/predict_rec.py --rec_model_path en_ptocr_v3_rec_infer.pth --rec_image_shape 3,48,320 --image_dir ./doc/imgs_words/en/word_1.png --rec_yaml_path en_PP-OCRv3_rec_v2.5.yml --rec_char_dict_path en_dict.txt --image_dir ./doc/imgs_words/en/word_1.png
Traceback (most recent call last):
File "/**/PaddleOCR2Pytorch/./tools/infer/predict_rec.py", line 463, in <module>
main(utility.parse_args())
File "/**/PaddleOCR2Pytorch/./tools/infer/predict_rec.py", line 437, in main
text_recognizer = TextRecognizer(args)
File "/**/PaddleOCR2Pytorch/./tools/infer/predict_rec.py", line 90, in __init__
super(TextRecognizer, self).__init__(network_config, **kwargs)
File "/**/PaddleOCR2Pytorch/pytorchocr/base_ocr_v20.py", line 13, in __init__
self.build_net(**kwargs)
File "/**/PaddleOCR2Pytorch/pytorchocr/base_ocr_v20.py", line 18, in build_net
self.net = BaseModel(self.config, **kwargs)
File "/**/PaddleOCR2Pytorch/pytorchocr/modeling/architectures/base_model.py", line 63, in __init__
self.head = build_head(config["Head"], **kwargs)
File "/**/PaddleOCR2Pytorch/pytorchocr/modeling/heads/__init__.py", line 47, in build_head
assert module_name in support_dict, Exception('head only support {}'.format(
AssertionError: head only support ['DBHead', 'PSEHead', 'EASTHead', 'SASTHead', 'CTCHead', 'ClsHead', 'AttentionHead', 'SRNHead', 'PGHead', 'Transformer', 'TableAttentionHead', 'SARHead', 'FCEHead', 'CANHead']
Question 1: is it because the config (https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml) specifies
Head:
  name: MultiHead
for en_PP-OCRv3_rec?
Question 2: does PaddleOCR2Pytorch support the English PP-OCRv3 rec model that uses the SVTR_LCNet algorithm (like the one in release 2.7), i.e.
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
for en_PP-OCRv3_rec?
Figured it out:
- Model conversion should use ./converter/multilingual_ppocr_v3_rec_converter.py to convert the English recognition model.
- The yml file I used for prediction was not correct; https://github.com/frotms/PaddleOCR2Pytorch/blob/main/configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml should be used instead. (This file is different from the 'default' one in PaddleOCR.)
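For anyone hitting the same assertion, here is a rough sketch of a pre-flight check that reads the Head name from a yml file and compares it against the list printed in the AssertionError above. `read_head_name` is a hypothetical helper written for this post, not part of PaddleOCR2Pytorch, and it is a minimal line scanner rather than a real YAML parser. The sample config string only reproduces the `Head: name: MultiHead` fragment quoted in the question.

```python
# Head names copied from the AssertionError message in the traceback above.
SUPPORTED_HEADS = [
    'DBHead', 'PSEHead', 'EASTHead', 'SASTHead', 'CTCHead', 'ClsHead',
    'AttentionHead', 'SRNHead', 'PGHead', 'Transformer',
    'TableAttentionHead', 'SARHead', 'FCEHead', 'CANHead',
]


def read_head_name(yaml_text):
    """Return the first `name:` value under the top-level `Head:` key, or None.

    Hypothetical helper: a minimal line scanner, not a full YAML parser.
    """
    in_head = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # skip blank lines and comment lines
        if not line[0].isspace():
            # A top-level key starts in column 0; track whether it is Head.
            in_head = stripped.startswith('Head:')
            continue
        if in_head and stripped.startswith('name:'):
            value = stripped.split(':', 1)[1]
            return value.split('#', 1)[0].strip()  # drop any inline comment
    return None


# Fragment quoted in the question (PaddleOCR release/2.5 config).
paddle_cfg = "Head:\n  name: MultiHead\n"

name = read_head_name(paddle_cfg)
print(name, name in SUPPORTED_HEADS)  # prints: MultiHead False
```

To check a real file, pass `open(path).read()` to `read_head_name`. If the name is not in the list, build_head will raise the same assertion as in the traceback.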
Thank you for the work!