
Custom dataset trained Mobile Detection Model v3 0 accuracy

Open saylan21 opened this issue 2 years ago • 6 comments

Hello everyone, I am trying to train the ch_PP-OCRv3_det_slim model shared on the Model List page. As the config file I am using the default https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml, changing only `pretrained_model` to point to the weights I downloaded from the v3 model list page. I am using a custom dataset, which caused no problems when training the recognition model. The model I downloaded from the model list page currently reaches 70 percent accuracy on my dataset. I would like to increase the accuracy by continuing training from these weights on my own dataset, but the training ends up with 0 accuracy. Could you tell me how to train the ch_PP-OCRv3_det_slim model on my own dataset, starting from the pretrained model on the model list page? Thank you :)

saylan21 avatar Sep 09 '22 14:09 saylan21

If you want to fine-tune your own model, you should use the ch_PP-OCRv3_det model, not the slim model. Also, you should use a detection dataset, not a recognition dataset: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/dataset/ocr_datasets_en.md#1-text-detection

andyjiang1116 avatar Sep 13 '22 03:09 andyjiang1116

Hi, thank you for the answer. I am already using a detection dataset, which I created with the PPOCRLabel tool. The model will run on a mobile device, so I think it would be better to fine-tune the slim model. Is it possible to fine-tune the slim model?

saylan21 avatar Sep 13 '22 07:09 saylan21

It takes two steps: first, fine-tune the base model (not the slim one); second, slim the model according to this doc: https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/deploy/slim/quantization/README_en.md#3-quant-aware-training

andyjiang1116 avatar Sep 13 '22 09:09 andyjiang1116
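The two steps described above can be sketched as a pair of command lines, assembled here in Python without running anything. This is a minimal sketch under assumptions: the script paths `tools/train.py` and `deploy/slim/quantization/quant.py` and the `-c`/`-o` argument style follow the repository layout and the linked README as I understand them, while the config path, checkpoint paths, and the `build_command` helper are hypothetical placeholders to adapt.

```python
from typing import Dict, List


def build_command(script: str, config: str, overrides: Dict[str, str]) -> List[str]:
    """Return an argv list of the form: python <script> -c <config> -o key=value ..."""
    argv = ["python", script, "-c", config]
    if overrides:
        argv.append("-o")
        argv.extend(f"{key}={value}" for key, value in overrides.items())
    return argv


# Hypothetical path: replace with the detection config you actually train with.
CONFIG = "configs/det/my_det_config.yml"

# Step 1: fine-tune the base (non-slim) detection model on the custom dataset.
finetune_cmd = build_command(
    "tools/train.py",
    CONFIG,
    {
        # Hypothetical checkpoint locations.
        "Global.pretrained_model": "./pretrain_models/base_det/best_accuracy",
        "Global.save_model_dir": "./output/det_finetune",
    },
)

# Step 2: quantization-aware training, starting from the step-1 checkpoint.
quant_cmd = build_command(
    "deploy/slim/quantization/quant.py",
    CONFIG,
    {
        "Global.pretrained_model": "./output/det_finetune/best_accuracy",
        "Global.save_model_dir": "./output/det_quant",
    },
)

print(" ".join(finetune_cmd))
print(" ".join(quant_cmd))
```

The point of the ordering is that step 2 takes the checkpoint written by step 1 as its starting weights, so the slimmed model inherits what was learned from the custom data.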

Thank you for the answer @andyjpaddle. 1) I am trying to fine-tune the base model ch_PP-OCRv3_det using the ch_PP-OCRv3_det_cml.yml config. Is it enough to change the dataset folder and the pretrained weights for the Teacher model, or do I need to modify anything for the Student models? And in the second step, should I quantize the teacher model or the student model?

2) I already trained the student model and ran it on a mobile device, but the model gave no output for an image size of 150x340. When I upscaled the images to 300x680, it started to give output, for a reason I do not know. Is it possible to fine-tune the model so that it works without the images being upscaled?

saylan21 avatar Oct 07 '22 15:10 saylan21

  1. It depends on your application scenario. We provide 3 fine-tuning methods, but the English doc is not ready yet; we'll update it later. In the meantime you can refer to the Chinese doc: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/PPOCRv3_det_train.md#3-%E5%9F%BA%E4%BA%8Eppocrv3%E6%A3%80%E6%B5%8Bfinetune%E8%AE%AD%E7%BB%83 For the CML method, changing only the dataset folder and the pretrained weights is enough. The teacher model is usually large, while the student model is small; choose whichever you need.
  2. The input image size should be the same as your training shape.

andyjiang1116 avatar Oct 08 '22 08:10 andyjiang1116
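To make "only change the dataset folder and the pretrained weight" concrete, here is a small sketch that applies dotted-key overrides to a config held as a nested dictionary. Everything in it is illustrative: `apply_overrides` is a hypothetical helper, `base_config` is a trimmed stand-in rather than the real YAML, and the key names (`Global.pretrained_model`, `Train.dataset.data_dir`, `Train.dataset.label_file_list`, and the `Eval` equivalents) are assumptions that should be checked against the actual config file.

```python
from copy import deepcopy
from typing import Any, Dict


def apply_overrides(config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with each dotted key in `overrides` set."""
    result = deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Create intermediate sections if they are missing.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Trimmed, hypothetical stand-in for the loaded YAML config.
base_config = {
    "Global": {"pretrained_model": None},
    "Train": {"dataset": {"data_dir": "./train_data/default/",
                          "label_file_list": ["./train_data/default/train.txt"]}},
    "Eval": {"dataset": {"data_dir": "./train_data/default/",
                         "label_file_list": ["./train_data/default/val.txt"]}},
}

# Hypothetical paths for a custom detection dataset and downloaded weights.
finetune_config = apply_overrides(base_config, {
    "Global.pretrained_model": "./pretrain_models/my_det/best_accuracy",
    "Train.dataset.data_dir": "./my_data/",
    "Train.dataset.label_file_list": ["./my_data/train.txt"],
    "Eval.dataset.data_dir": "./my_data/",
    "Eval.dataset.label_file_list": ["./my_data/val.txt"],
})

print(finetune_config["Train"]["dataset"])
```

Nothing else in the config is touched, which matches the advice above that the dataset location and the pretrained weights are the only required changes.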

@andyjpaddle thank you for the answer 👍 I found out that my custom-trained inference model works better with sizes that are multiples of 32. I also see this at line 264 of https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/data/imaug/operators.py. Could you please tell me: 1) why is it required to resize my image to a multiple of 32? 2) why does it work better the more I upscale my image? For example, it works better when I upscale my image size to 320 instead of 240, both of which are multiples of 32.

Thank you

saylan21 avatar Oct 11 '22 09:10 saylan21
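On the multiple-of-32 question, a likely explanation (an assumption, not confirmed in this thread) is that the detection backbone downsamples the input by a total factor of 32, so each side must divide evenly for the feature maps at different scales to line up. The sketch below shows the kind of rounding I believe the referenced resize code performs; the helper name `round_to_stride` is hypothetical, and the real operator may differ in detail.

```python
from typing import Tuple


def round_to_stride(height: int, width: int, stride: int = 32) -> Tuple[int, int]:
    """Round each side to the nearest multiple of `stride`, never below one stride.

    Uses Python's built-in round(), which rounds exact halves to the even neighbour.
    """
    new_h = max(int(round(height / stride) * stride), stride)
    new_w = max(int(round(width / stride) * stride), stride)
    return new_h, new_w


def is_stride_aligned(height: int, width: int, stride: int = 32) -> bool:
    """True when both sides are exact multiples of `stride`."""
    return height % stride == 0 and width % stride == 0


for size in [(150, 340), (300, 680), (240, 240), (320, 320)]:
    print(size, is_stride_aligned(*size), round_to_stride(*size))
```

Note that 240 is not a multiple of 32 (240 / 32 = 7.5), whereas 320 is, so the two sizes compared in the question above differ in alignment as well as in scale.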

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 07 '23 08:07 github-actions[bot]