Train my own recognition model

Open stealth0414 opened this issue 1 year ago • 3 comments

I want to use english_g2.pth to train my own recognition model. Is there a tutorial? The deep-text-recognition-benchmark model is about 200 MB, which is a little big for me. Thanks.

stealth0414 avatar May 09 '23 06:05 stealth0414

You can use this doc provided by the author as a reference: https://github.com/JaidedAI/EasyOCR/blob/master/custom_model.md

MdotO avatar May 10 '23 03:05 MdotO

Thank you, I have successfully completed that part. My dataset consists mainly of printed characters. What dataset size do you suggest for a better model?

stealth0414 avatar May 12 '23 08:05 stealth0414

Larger is typically better, as long as the dataset is diverse. Once you have a lot of samples, adding more stops being as impactful, but you likely need 10k+ images before you reach that point. If you want to build a large dataset quickly, you can generate one synthetically, which I have written about in TowardsAI here: https://pub.towardsai.net/how-to-make-a-synthesized-dataset-to-fine-tune-your-ocr-3573f1a7e08b if that is of interest. I hope this helps!

EivindKjosbakken avatar Jan 25 '24 18:01 EivindKjosbakken
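
For anyone landing here later, a minimal sketch of what synthetic generation for printed text could look like. This is not the method from the linked article, only an illustration under stated assumptions: the function names (`random_text`, `render`, `generate_dataset`), the character set, the image size, and the default Pillow font are all made up for the example, and the `labels.csv` layout with `filename` and `words` columns is assumed to match what your training setup expects. Pillow is optional here; without it only the labels file is written.

```python
import csv
import random
import string
from pathlib import Path

try:
    # Third-party dependency (pip install Pillow); only needed for rendering.
    from PIL import Image, ImageDraw, ImageFont
except ImportError:
    Image = None  # labels are still written, images are skipped

# Hypothetical character set; replace with the characters your model must read.
CHARSET = string.ascii_letters + string.digits + " -."


def random_text(rng, min_len=3, max_len=12):
    """Return a random non-empty string drawn from CHARSET."""
    length = rng.randint(min_len, max_len)
    text = "".join(rng.choice(CHARSET) for _ in range(length)).strip()
    return text or "a"


def render(text, path, rng):
    """Draw text on a light grayscale canvas with small random variations."""
    img = Image.new("L", (200, 32), color=rng.randint(200, 255))
    draw = ImageDraw.Draw(img)
    position = (rng.randint(2, 10), rng.randint(4, 12))
    draw.text(position, text, fill=rng.randint(0, 60),
              font=ImageFont.load_default())
    img.save(path)


def generate_dataset(out_dir, n_samples=100, seed=0):
    """Write n_samples images (if Pillow is available) plus a labels.csv."""
    rng = random.Random(seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rows = []
    for i in range(n_samples):
        text = random_text(rng)
        name = f"{i:06d}.png"
        if Image is not None:
            render(text, out / name, rng)
        rows.append((name, text))
    with open(out / "labels.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "words"])  # assumed column names
        writer.writerows(rows)
    return rows


if __name__ == "__main__":
    generate_dataset("synthetic_train", n_samples=1000)
```

To get the diversity mentioned above, you would probably want to swap the default font for several real fonts, draw text from a corpus close to your domain instead of random characters, and add noise, blur, or slight rotation.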