PP-OCRv3 recognition model training

Open bely66 opened this issue 2 years ago • 3 comments

I'm training the recognition model on a dataset of 9M images. The dataset covers many augmentations, words, and fonts. While reading the paper, I found that the recognition model was trained for 700 epochs after warmup. I'm getting 80% accuracy after 5 epochs of training, but the accuracy isn't changing much after that.

So here are my questions:

  1. Is it possible to train for fewer epochs?
  2. What hardware is required to train such a model in a reasonable time frame (a week at most), and how long did it take you to train it?
  3. If training really takes that long, what's the best way to iterate on the results? It would be hard to train for two weeks for every minor update.

bely66, Oct 22 '22 09:10

@an1018 Any idea how I can do that?

bely66, Oct 23 '22 19:10

Hi bely66, congratulations, 80% accuracy is already a good model! Regarding your questions, here are a few suggestions for reference:

  1. Yes. If you train on your own data starting from a pretrained model, you don't need to train for nearly that long. 100 to 200 epochs is probably enough.
  2. We recommend four V100 GPUs; about 3 days of training should be enough.
  3. When fine-tuning, we recommend using a small learning rate and a shorter evaluation interval, so that you find the best model as early as possible.

tink2123, Oct 24 '22 07:10
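
To make the suggestions above concrete, here is a rough Python sketch of the arithmetic behind them. It is only an illustration under stated assumptions: the global batch size of 512 (4 GPUs × 128 per card) is an assumed value, the helper names `training_budget` and `finetune_overrides` are hypothetical, and the pretrained model path is a placeholder. The override keys (`Global.pretrained_model`, `Global.epoch_num`, `Global.eval_batch_step`, `Optimizer.lr.learning_rate`) follow the layout of the PaddleOCR recognition configs as I understand it, so check them against the config file you actually use.

```python
import math


def training_budget(num_images, epochs, global_batch_size, wall_clock_days):
    """Return (steps_per_epoch, total_steps, required_images_per_sec).

    required_images_per_sec is the sustained throughput needed to finish
    `epochs` passes over `num_images` within `wall_clock_days`.
    """
    steps_per_epoch = math.ceil(num_images / global_batch_size)
    total_steps = steps_per_epoch * epochs
    seconds = wall_clock_days * 24 * 60 * 60
    required_ips = num_images * epochs / seconds
    return steps_per_epoch, total_steps, required_ips


def finetune_overrides(pretrained_model, epochs, learning_rate, eval_every_steps):
    """Build key=value override strings for a fine-tuning run.

    The key names are assumptions based on PaddleOCR's YAML config layout;
    verify them against your own config before use.
    """
    return [
        f"Global.pretrained_model={pretrained_model}",
        f"Global.epoch_num={epochs}",
        f"Optimizer.lr.learning_rate={learning_rate}",
        # [start_step, interval]: evaluate every `eval_every_steps` steps
        f"Global.eval_batch_step=[0,{eval_every_steps}]",
    ]


if __name__ == "__main__":
    # 9M images and 100 epochs come from the thread; batch size 512 is assumed.
    steps_per_epoch, total_steps, ips = training_budget(
        num_images=9_000_000, epochs=100, global_batch_size=512, wall_clock_days=3
    )
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"total steps:     {total_steps}")
    print(f"throughput needed for 3 days: {ips:.0f} images/s")

    # Evaluate roughly four times per epoch (an arbitrary illustrative choice).
    overrides = finetune_overrides(
        pretrained_model="path/to/pretrained/best_accuracy",  # placeholder
        epochs=100,
        learning_rate=0.0001,  # illustrative "small" value, not a recommendation
        eval_every_steps=steps_per_epoch // 4,
    )
    print(" ".join(overrides))
```

The printed override strings are meant to be passed after `-o` to PaddleOCR's `tools/train.py` together with `-c <your config>`. The throughput figure is a useful sanity check: if your measured images per second on the available GPUs is well below it, either the epoch count or the time budget has to change.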

Thanks @tink2123 for your reply

My case is text from documents with a very straightforward shape, so I'm getting above 94% accuracy with simple models like CRNN and above 96% with SAR.

I assume PP-OCRv3 is more accurate than those two models?

bely66, Oct 24 '22 10:10

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

github-actions[bot], Jul 08 '23 02:07